Article

A Sliding Scale Signal Quality Metric of Photoplethysmography Applicable to Measuring Heart Rate across Clinical Contexts with Chest Mounting as a Case Study

1 Department of Exercise Science, University of South Carolina, Columbia, SC 29208, USA
2 College of Engineering and Computing, University of South Carolina, Columbia, SC 29208, USA
* Author to whom correspondence should be addressed.
Sensors 2023, 23(7), 3429; https://doi.org/10.3390/s23073429
Submission received: 30 January 2023 / Revised: 6 March 2023 / Accepted: 17 March 2023 / Published: 24 March 2023
(This article belongs to the Special Issue Sensors for Heart Rate Monitoring)

Abstract:
Photoplethysmography (PPG) signal quality as a proxy for accuracy in heart rate (HR) measurement is useful in various public health contexts, ranging from short-term clinical diagnostics to free-living health behavior surveillance studies that inform public health policy. Each context has a different tolerance for acceptable signal quality, and it is reductive to expect a single threshold to meet the needs across all contexts. In this study, we propose two different metrics as sliding scales of PPG signal quality and assess their association with accuracy of HR measures compared to a ground truth electrocardiogram (ECG) measurement. Methods: We used two publicly available PPG datasets (BUT PPG and Troika) to test if our signal quality metrics could identify poor signal quality compared to gold standard visual inspection. To aid interpretation of the sliding scale metrics, we used ROC curves and Kappa values to calculate guideline cut points and evaluate agreement, respectively. We then used the Troika dataset and an original dataset of PPG data collected from the chest to examine the association between continuous metrics of signal quality and HR accuracy. PPG-based HR estimates were compared with reference HR estimates using the mean absolute error (MAE) and the root-mean-square error (RMSE). Point biserial correlations were used to examine the association between binary signal quality and HR error metrics (MAE and RMSE). Results: ROC analysis from the BUT PPG data revealed that the AUC was 0.758 (95% CI 0.624 to 0.892) for signal quality metrics of STD-width and 0.741 (95% CI 0.589 to 0.883) for self-consistency. There was a significant correlation between criterion poor signal quality and signal quality metrics in both Troika and originally collected data. Signal quality was highly correlated with HR accuracy (MAE and RMSE, respectively) between PPG and ground truth ECG. Conclusion: This proof-of-concept work demonstrates an effective approach for assessing signal quality and demonstrates the effect of poor signal quality on HR measurement. Our continuous signal quality metrics allow estimations of uncertainties in other emergent metrics, such as energy expenditure that relies on multiple independent biometrics. This open-source approach increases the availability and applicability of our work in public health settings.

1. Introduction

Accurate measurement of heart rate (HR) is crucial to inform health status metrics such as energy expenditure (EE) and chronic stress (i.e., heart rate variability). Abnormal HR patterns, such as elevated HR or low HR variability, are measurable manifestations of multisystem dysfunction that can be used to identify physiological responses to acute stress. This acute stress is then linked with unfavorable longer-term cardiometabolic outcomes [1,2]. Accurate HR measurement among free-living individuals is needed to advance the science of public health surveillance of factors related to chronic disease.
Electrocardiography (ECG) is a “gold standard” method of determining heart rate but is cumbersome in daily life settings, as it requires multiple leads that need to be changed daily. Photoplethysmography (PPG) is a convenient, compact alternative technique for HR tracking. PPG works by recording blood volume changes using light reflection signals from the human body. PPG can be sampled at lower frequencies and therefore requires less energy than ECG, allowing longer battery life, an essential feature in wearable monitors [3]. An additional advantage of PPG over ECG is that it can be placed in any location where blood flow can be detected [4]. While many consumer wearable devices (e.g., Fitbit) have focused on collecting PPG signals from the wrist [5], alternative placements such as the chest may be preferable to reduce motion artifacts and potential distraction [6].
PPG should be equivalent to ECG under ideal conditions [7,8] but appears to be inaccurate at high heart rates (>155 BPM) [9]. This is likely attributable, in part, to motion artifacts. Corruption of the PPG signal by motion artifacts is a serious obstacle to the reliable use of PPG in ambulatory settings [10]. While removing motion artifacts from PPG is critical, an essential first step is reliably detecting their presence, yet few studies have focused on determining signal quality indices for PPG signals, especially in applied settings [11]. Currently, most methods focus on the removal of motion artifacts, and there are few algorithms for simply detecting and quantifying PPG signal quality [12,13]. Separating this step from artifact removal allows applied researchers more autonomy to decide which signals are deemed usable [14]. What applied researchers deem usable might vary by context. For example, PPG readings taken as part of a clinical stress test might require higher precision than a 14-day wear protocol for free-living individuals. Thus, a versatile signal quality indicator might offer gradations of signal quality or a continuous quality score, which would then allow researchers to tailor which signals are retained based on the precision needed. Reliable motion artifact detection techniques lay the foundation for a completely automated PPG data processing system that can identify PPG data frames contaminated with artifacts and further process them for motion artifact removal [10]. A first step in this process is identifying different degrees of poor signal quality and quantifying the impact of signal quality on accurate HR estimation.
Even for signals that would be considered poor by current research standards, there is significant information present within the signal that could enable extraction of reliable HR. However, most PPG enhancement research has focused on motion artifact removal techniques. Only some have examined the ability to recover quasi-periodic information from data corrupted with motion artifacts [10]. Methods to recover data include wavelet analysis and decomposition techniques [15] and adaptive filters [16]. However, many signal quality and denoising techniques are computationally intensive [17] and require data from accelerometers [18], which may not be present in PPG devices. Thus, there is a need for computationally efficient signal quality detection techniques that use only PPG signal data.
An additional limitation of the existing signal quality and denoising literature is that most signal quality detection methods have only been tested on short time frames (i.e., 2–3 s to 5 min) [19] and have examined artificially induced artifacts (i.e., mechanical vibration) [20] or minimal motion artifacts (i.e., finger tapping) [10]. Indeed, many algorithms are hard-coded for specific data types and frequencies, which will likely limit their broadscale use [14]. Even fewer algorithms have been developed and tested on original data [17]. Therefore, their application for measurement of free-living individuals is limited. As noted in several recent reviews of the literature [18,21], there is a need for the development of novel, computationally efficient signal quality indicators that aid in the fidelity of vital parameter estimation, such as heart rate. Furthermore, there is a need to examine signal quality indicators from PPG signals collected from multiple different locations, including the chest.
Although there is a wealth of scientific literature on validation studies of PPG and processing techniques [17,21], this work is rarely adopted by applied research fields, including public health. A potential reason for this limitation is the siloed nature of study teams [22]. Engineering teams are mainly concerned with signal synchronization, while sports medicine tends to focus on standardized protocols with targeted groups. Clinicians, on the other hand, tend to favor real-time remote telemonitoring applications. In contrast, applied public health researchers need devices capable of monitoring free-living HR over routine monitoring time frames (>7 days). A more diverse team science approach, involving engineers, exercise physiologists and applied public health researchers, may provide a more robust approach to this design and processing problem.
Therefore, the purpose of the current study is to describe a novel, computationally efficient method to identify and quantify poor PPG signal quality. Metrics of signal quality are a critical first step to inform methods that recover signal information and ultimately produce accurate HR estimates. We use three separate datasets to calculate two continuous metrics of signal quality and examine the predictive value of these signal quality indices for HR measurement accuracy. Identifying signal quality is the first step of a larger reduction of motion artifacts (ROMA) framework that aims to inform measurement of free-living HR outside the lab, where motion artifacts are a reality. The simplicity of the method has the potential to reduce computational time. The visual spectrogram analytics and open-source availability make this tool appealing to applied researchers, thereby overcoming a limitation of the field whereby advances in engineering are not readily adopted in applied public health research. This study is a unique contribution to the field for two main reasons: First, we aim to quantify signal quality and examine its impact on HR estimation, which is a foundational first step in motion artifact removal. Second, this study examines the signal quality of PPG signals collected from the chest, which may be a preferable location to collect PPG signals due to the reduction of motion artifacts and reduced distraction compared to wrist-based devices. The ability to detect poor PPG signals from chest-mounted PPG using open-source algorithms is a foundational first step toward designing novel open-source PPG devices that are ultimately adopted by health researchers to collect and process HR signal data from free-living individuals.

2. Materials and Methods

Three datasets were used in the current study: two existing datasets publicly available on PhysioNet [23] and one original dataset. The first dataset was the BUT PPG (Brno University of Technology Smartphone PPG) dataset [24,25], which contains a combination of clean PPG signals and PPG signals intentionally corrupted by motion artifacts. We used the BUT PPG dataset to determine whether the ROMA metrics of signal quality were predictive of ground truth measures of signal quality. The second dataset was the TROIKA dataset [26], which contains measures of HR from both PPG and ECG; it was used to determine whether the metrics of signal quality were related to the accuracy of PPG-estimated HR (i.e., agreement between PPG HR and ECG HR). The third dataset comprised original data collected by the study team at the University of South Carolina (UofSC) and was used to examine the association between signal quality and HR accuracy for PPG collected from the chest. We used the original UofSC data to examine the associations between the continuous ROMA metrics and HR accuracy and to examine the initial validity of cut points for the ROMA signal quality metrics.
BUT PPG Dataset. The BUT PPG dataset [24,25] was created by the cardiology team at the Department of Biomedical Engineering, Brno University of Technology. It comprises 48 ten-second PPG recordings and associated ECG signals used for determining reference HR. PPG data were collected with a Xiaomi Mi9 smartphone (Xiaomi Inc., Beijing, China) at a sampling frequency of 30 Hz. Reference ECG signals were recorded using a mobile ECG recorder (Bittium Faros 360, Bittium, Oulu, Finland) with a sampling frequency of 1000 Hz. Each PPG signal included an annotation of quality and a reference HR. Good and bad PPG signal quality was identified by expert visual inspection. PPG signal quality is rated using a binary criterion: 1 indicates good quality for HR estimation, and 0 indicates signals in which HR cannot be detected reliably and which are thus unsuitable for further analysis. BUT PPG data were collected from 12 subjects (6 female, 6 male) aged 21 to 61 years. Recordings were carried out between August 2020 and October 2020 [24,25].
TROIKA Dataset. The TROIKA dataset [26] consists of two-channel PPG signals collected on the wrist during individual trials from 12 male subjects aged 18–35 years. Two pulse oximeters with green LEDs (wavelength 515 nm) were embedded in a wristband, which was used to collect PPG signals sampled at 125 Hz. The ECG signal was recorded from the chest using wet ECG sensors. Each of the 12 trials lasted 5 min: participants walked or ran at 1–2 km/h for 0.5 min, then 6–8 km/h for 1 min, 12–15 km/h for 1 min, 6–8 km/h for 1 min, 12–15 km/h for 1 min and 1–2 km/h for the last 0.5 min.
UofSC Dataset. The UofSC dataset consisted of 19 stationary bike sessions completed by 11 individuals. Laboratory-generated data allowed us to control the sources of motion artifacts and the duration of activity to ensure that a wide variety of HRs were recorded. The study was conducted in accordance with the Declaration of Helsinki, and the study protocol was approved by the University of South Carolina IRB in August 2021 (Pro00107610). Informed consent was obtained from all subjects involved in the study prior to data collection. All data collection took place in the Clinical Exercise Research Center Lab at the University of South Carolina. Participants in the UofSC dataset were 11 healthy adults (ages 20–42 years) with no known history of cardiovascular disease or abnormalities. Participants completed between 1 and 4 trials on separate days, for a total of 19 trials. Participants’ skin tones ranged from 2 to 6 on the Fitzpatrick scale [27,28].
UofSC Biking Protocol. For the laboratory dataset, the PPG sensor was worn on the chest, attached using a polyester spunlace adhesive [29,30]. The PPG sensor, which uses green light to measure HR, was purchased from PulseSensor.com. This vendor provides all part numbers and circuit board schematics, enabling open-source reproduction and traceability of device performance. The sensor was powered by an open-source Arduino board, which was also used to collect the PPG sensor response, providing a time stamp for the measured data. This time stamp enabled synchronization with ECG telemetry measurements (Polar H10 monitor; Polar, Singapore; described below) to within 1 s, the minimum reported time segment. While this particular PPG sensor monitors continuously, the combination of Arduino sampling and transmission/reception results in an effective received sampling rate of 46.3 Hz (sample period Ts = 21.598 ms). This rate was determined using dummy time trials without a human subject to count the number of samples received in 22 min, the length of the bike exercise protocol (including start-up and stop times of one minute each). A Polar H10 chest strap ECG heart rate monitor was used as the comparison criterion (i.e., reference values) for HR. Polar monitors have been validated against the ECG gold standard [31].
All laboratory tests were performed indoors at 21 °C. For the protocol, subjects were asked to sit quietly on the bike for the first 10 min to establish a consistent resting PPG signal. Subjects were then asked to bike at a consistent cadence of 50 RPM, paced by an audible metronome. Participants biked at 50 RPM at moderate resistance for 2 min. For the next 3 min, the resistance was either increased or maintained, depending on the participant’s subjective exertion as measured by the Borg perceived exertion scale [32]. After these 3 min, resistance was decreased, and participants were asked to rest for a final 5 min. For a total recording time of T = 20 min, the 46.3 Hz sampling rate yielded a data record of approximately N = 55,560 samples.

2.1. Signal Processing

The following steps were conducted for PPG signals from all three datasets (i.e., BUT PPG, TROIKA and UofSC biking data). The original sampled data are denoted by the sequence {xn}, for n = 1, 2…N. We also refer to sequences as vectors, via bold font, i.e., x.
Preliminary motion artifact removal: The collected PPG data were processed in Matlab. The first step was to remove slow, non-periodic motion artifacts that are inevitably present in all measurements, arising from breathing, sweating, adhesive tension changes, etc. This slow baseline drift from non-periodic motion artifacts was isolated by performing a moving mean over 0.6 s to smooth out and suppress the systolic and diastolic peaks, typically <0.3 s in duration (Figure 1a), while preserving the other motion artifacts. Mathematically, we create the sequence y via convolution of x with a 0.6 s (length L = 30 sample) moving average (MA) filter with impulse response (IR) h (= ones(1,L)/L): y = x * h. The MA filter has the well-known Dirichlet frequency response H(f) = sin(πLf/fs)/[L sin(πf/fs)], which in our case has a first-null bandwidth of fs/L ≈ 1.543 Hz.
This baseline drift was then subtracted from the original signal to leave only the systolic and diastolic peaks, which were now flat with respect to time, although their relative amplitude was not consistent over the trial (Figure 1b). The subtraction can be described as the creation of the sequence v = x*q, where q = δ(n) − h. Hence v is a high-pass-filtered version of the original data x, with V(f) = X(f)[1 − H(f)]. This signal was then low-pass filtered at 3.5 Hz (i.e., 210 BPM) to remove high-frequency noise (e.g., from power lines) and smooth the traces, while preserving the HR signal. The low-pass filter is a custom infinite IR (IIR) filter designed using Matlab’s “lowpass” function, with roll-off specified by a steepness value of 0.8. At this point in the processing, we have the sequence z = v*hlp, with hlp the IIR filter’s impulse response.
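A minimal Matlab sketch of these two filtering steps follows, assuming x is a column vector of raw PPG samples and fs = 46.3 Hz; the variable names and the exact call options are illustrative assumptions rather than the authors’ code.

```matlab
% Sketch of the preliminary artifact removal described above (assumptions noted in text).
fs = 46.3;                       % effective PPG sampling rate (Hz)
L  = 30;                         % ~0.6 s moving-average window (samples)

y = movmean(x, L);               % slow baseline drift (systolic/diastolic peaks smoothed out)
v = x - y;                       % high-pass result: V(f) = X(f)[1 - H(f)]

% Low-pass at 3.5 Hz (210 BPM) to remove high-frequency noise while preserving the HR band.
z = lowpass(v, 3.5, fs, 'ImpulseResponse', 'iir', 'Steepness', 0.8);
```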
The relative systolic peaks were then tracked using a moving maximum function (Figure 1b), producing the envelope sequence w = movingmax(v, 0.5), where the functional operation employs a 0.5 s rectangular window. The HR amplitudes were then normalized by dividing the signal by this envelope, r = v/w (element-wise); this yields HR signals of the same amplitude (Figure 1c), simplifying beat counting in the time domain.
Time Domain HR Measurement: For time-domain measurements of HR through systolic–systolic spacings, the HR amplitudes need to be normalized. As noted, this was accomplished using a moving maximum operation to track the systolic peak amplitude over 0.5 s windows. Given that HR amplitude does not change appreciably over short times, this normalization was reliable over the different test subjects with varying HR, giving asymmetric PPG traces typically varying between about −0.5 and +1, although some negative excursions were larger. Once the HR was normalized, the systolic peak height and duration were identified using a 0.5 peak-height threshold. For the subjects in our study, this did not cause spurious diastolic detection, as the diastolic peaks were typically close to 0 or negative. From the spacing between subsequent beats, the instantaneous HR was determined: specifically, we computed a sequence of periods {T0,k}, where k is the beat index and T0,k = tk − tk−1, with tk the time of the kth systolic peak (the maximum-valued sample within its 0.5 s window). This yields a sequence of HRs {HRk}, with HRk = 1/T0,k (multiplied by 60 to express the HR in BPM).
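As an illustration of the time-domain pipeline just described, the sketch below tracks the 0.5 s moving-maximum envelope of the high-passed sequence v, normalizes by it, and converts systolic peak spacings into instantaneous HR; the use of findpeaks, the envelope guard and the minimum peak spacing are our assumptions.

```matlab
% Time-domain HR sketch; v and fs come from the previous sketch.
w = movmax(v, round(0.5 * fs));                      % 0.5 s moving-maximum envelope
w = max(w, eps);                                     % guard against degenerate windows (assumption)
r = v ./ w;                                          % normalized trace, systolic peaks near +1

[~, pkIdx] = findpeaks(r, 'MinPeakHeight', 0.5, ...  % 0.5 threshold excludes diastolic peaks
                          'MinPeakDistance', round(0.3 * fs));
t     = (0:numel(r) - 1).' / fs;                     % sample time stamps (s)
T0    = diff(t(pkIdx));                              % systolic-to-systolic periods (s)
HRbpm = 60 ./ T0;                                    % instantaneous heart rate (BPM)
```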
Outliers were removed from the HR sequence and replaced using Matlab’s “filloutliers” method with a 40-beat moving median window, which removed points more than 3 local scaled MADs away from the local median. Outliers occurred through sharp jolts to the sensor due to poor mounting and are described further in the signal metrics section. These HRs were then averaged over 40 beats, or ~30 s. The approximately 30 s window was chosen to be consistent with the 30 s time window used in the frequency domain determination of HR. While the average was over a 30 s window, a new HR in the time domain was computed for every beat, i.e., about once every second. The longer window smooths out variations due to signal noise that a shorter window would retain, and it also provides sufficient length to flag and average outliers in a robust fashion. Any outliers that were flagged were not used in the statistical calculations, with an estimated <0.1% of HR values being discarded. Discarded values were replaced with the value from the previous sample, i.e., 21.6 ms earlier, too short a time for HR to change appreciably. Furthermore, for discarded values that were replaced, correlation with the frequency-domain calculation using the full raw dataset (described subsequently) provided an additional check on accuracy. An example of a poor-quality PPG signal in the time domain is presented in Figure 2.
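A sketch of this outlier handling, reusing HRbpm from the previous sketch; the call mirrors the text’s description of Matlab’s filloutliers with a 40-beat moving median and previous-value replacement.

```matlab
% Outlier handling sketch: flag beats more than 3 scaled MADs from a 40-beat
% moving median, replace them with the previous value, then average over ~30 s.
HRclean  = filloutliers(HRbpm, 'previous', 'movmedian', 40);
HRsmooth = movmean(HRclean, 40);                     % 40-beat (~30 s) moving average
```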
Frequency Domain HR Measurement: A good example of a high-quality PPG signal, free of slow motion artifacts, is shown in Figure 3a (without the subsequent low-pass filter) after conversion to a spectrogram with a 30 s time window in Matlab, yielding an HR value approximately every 7.27 s using Matlab’s default windowing and overlap parameters. This time resolution is short enough to be clinically valid [33], while being long enough to capture multiple heartbeats for assigning a reliable frequency in the spectral/frequency domain. The spectrogram was computed for the sequence v defined previously, yielding V(f,t) over 30 s intervals. This magnitude-squared short-time Fourier transform allows estimation of peak power at the HR frequency over time. There is also power in the second harmonic, as in all non-sinusoidal periodic signals [34], although its intensity is much weaker than that of the fundamental frequency at which the HR lies. The peak fundamental frequency powers were normalized to 1 so that the slow changes in HR amplitude over the course of the trial (e.g., Figure 1b) did not distract from the key metric (HR, or frequency). The peak-power frequency and spectral line width (shown in yellow in Figure 3a–c) were determined at each time in the spectrogram and converted to an HR in BPM by multiplying by 60. The HR vs. time in the frequency domain was then interpolated back to the systolic time stamps determined in the time domain above, for ease of comparison between the two domains. Figure 3a shows an example of an HR spectrogram with good-quality signal, with the corresponding ECG telemetry HR and the extracted frequency-domain PPG HR overlaid. A poor signal from poor mounting, i.e., the sensor’s contact with the skin being broken and re-formed, leads to “streakiness” of the spectral line or broadband interference from the impulsive nature of the contact/re-contact events, as shown in Figure 3b.
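The frequency-domain estimate can be sketched as below, again reusing v and fs; the 75% overlap and the default FFT length are illustrative assumptions (the study used Matlab’s default windowing and overlap parameters).

```matlab
% Frequency-domain HR sketch: magnitude-squared spectrogram with a ~30 s window,
% then the peak-power frequency of each time slice converted to BPM.
win = round(30 * fs);                                 % ~30 s window (samples)
[S, F, T] = spectrogram(v, win, round(0.75 * win), [], fs);
P = abs(S).^2;                                        % power per (frequency, time) bin
[~, iMax] = max(P, [], 1);                            % strongest bin in each time slice
HRfreq = 60 * F(iMax);                                % frequency-domain HR over time (BPM)
```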
In some trials, a well-behaved periodic motion artifact arose at ~100 BPM, 2× the 50 RPM cadence, during the pedaling phase on the bicycle. This artifact was removed using a custom IIR notch filter at 100 BPM with a 50 BPM width to account for variations in pedaling during the trial; this width was also narrow enough not to interfere with the actual HR signal. Before filtering, it was clear from visual inspection that there were two peaks in the spectrogram, enabling complete recovery of the correct signal. The emergence of these weak motion artifacts could be an indicator of marginal mounting, although further investigation is needed to clarify this. For trials that gave a very poor match with the Polar H10 ECG telemetry (Figure 3c), the streaky signal did not give recoverable data. In other words, if the loss of contact with the skin was too severe, the separation between the sensor and the skin was so large that no signal corresponding to the HR was obtained. The “streakiness” in the frequency domain in this case was due to an abrupt change in the baseline PPG signal, i.e., a sharp voltage impulse, whose FFT is nearly white-noise-like [34]. The signal processing steps are summarized in Figure 4.
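For the bike trials with the ~100 BPM pedaling artifact, a notch of the kind described could be sketched as follows; the choice of iirnotch and zero-phase filtfilt is ours, not necessarily the authors’ exact design.

```matlab
% Pedaling-artifact notch sketch: 100 BPM (1.667 Hz) center, 50 BPM (0.833 Hz) width.
f0 = 100 / 60;   bw = 50 / 60;                        % center and bandwidth (Hz)
[bN, aN] = iirnotch(f0 / (fs/2), bw / (fs/2));        % frequencies normalized to Nyquist
vNotched = filtfilt(bN, aN, v);                       % zero-phase removal of the cadence artifact
```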
Reference Measure for Poor Signal Quality. In line with recent publications on motion artifact detection, we relied on expert human visual inspection to identify motion artifact corrupted data. Expert visual inspection is the current gold standard [19,35,36,37]. Visual inspection of UofSC data and TROIKA was conducted using spectrogram visualization plots (See Figure 3a–c, for example). The BUT PPG dataset included binary indicators of poor and good signal quality. Because the collected BUT PPG data records were only 10 s long, we used a sliding window of 2.5 s for frequency domain spectrogram calculation, which gives an HR value every 0.6 s using Matlab’s default windowing. While this short window length is not ideal for robust HR assignment, it was necessary due to the very short trials in the BUT PPG data.
ROMA Self-Consistency Signal Quality Index: Self-consistency (also known as HR frequency difference [38]) is based on the difference between the fundamental frequency computed via the spectrogram V(f,t) and the HR computed from the time-domain peak calculation (the averaged HRk). This feature measures the agreement between the fundamental frequencies detected from the frequency spectrum and from the time-domain signal. It is assumed that the frequencies will agree in a clean PPG segment; in a noise-corrupted segment, however, there can be large differences between the values. We computed the self-consistency metric as the percentage of points at which the time-domain and frequency-domain HR estimates agreed to within 10 BPM, i.e., approximately 1.96 times the chosen 5 BPM limit of agreement.
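A minimal sketch of the self-consistency computation, assuming tBeats and HRtime are the beat time stamps and smoothed time-domain HR from the earlier sketches and T/HRfreq come from the spectrogram sketch (all variable names are ours):

```matlab
% Self-consistency sketch: percentage of beats at which the time-domain and
% frequency-domain HR estimates agree to within 10 BPM.
tBeats = t(pkIdx(2:end));                             % time stamp of each detected beat
HRtime = HRsmooth;                                    % smoothed time-domain HR at those beats
HRfreqAtBeats   = interp1(T, HRfreq, tBeats, 'linear', 'extrap');  % align the two domains
selfConsistency = 100 * mean(abs(HRtime - HRfreqAtBeats) <= 10);   % agreement (%)
```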
ROMA Standard Deviation of Line Width (STD-Width) Signal Quality Index: Figure 3a,b show examples of spectrograms with good and bad agreement with the Polar telemetry reference values. In the “good” signal, the HR signal is sharp and well-defined in frequency, evidenced by the narrow yellow line in Figure 3a. As can be seen in Figure 3a, the width of this yellow spectral line does not change much, leading to a small standard deviation. In the “poor” signal (Figure 3b), the interference streaks that emerge from loss of contact with the skin, as described above, appear as very wide yellow streaks whenever contact is lost, so the yellow line is dispersed throughout the spectrogram. When the abrupt change stabilizes, the yellow spectral line width becomes unpredictable (perhaps due to noise) until skin contact is re-established. This cyclical process leads to wide variations in the spectral line width due to poor mounting, giving a large standard deviation. Thus, the standard deviation in frequency of the spectrogram line is larger when signal quality is poor, and the standard deviation of this line width is used as the ROMA STD-width signal quality index.
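One way to compute the STD-width index from the spectrogram power P and frequency vector F defined earlier is sketched below; the half-power definition of the spectral line width is our assumption about how the width is measured.

```matlab
% STD-width sketch: per time slice, measure the frequency span over which power
% exceeds half its peak, convert to BPM, and take the standard deviation over time.
nSlices   = size(P, 2);
width_bpm = zeros(1, nSlices);
for k = 1:nSlices
    p  = P(:, k) / max(P(:, k));              % normalize the slice so its peak = 1
    fb = F(p >= 0.5);                         % frequency bins within half power of the peak
    width_bpm(k) = 60 * (max(fb) - min(fb));  % spectral line width in BPM
end
stdWidth = std(width_bpm);                    % ROMA STD-width signal quality index
```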

2.2. Statistical Analysis

We conducted two broad sets of analyses: In Part 1, we examined signal quality agreement. In Part 2, we examined the impact of signal quality on HR accuracy.
Signal Quality Agreement: To assess signal quality agreement, for all three datasets we calculated point-biserial correlations with 95% Bayes credible intervals (95% CI) between the signal quality indicators (self-consistency and STD-width) and the binary signal quality criterion (good vs. bad per visual inspection). For the BUT PPG data, we also used ROC curves to identify the area under the curve (AUC) and sensitivity/specificity for different values of self-consistency and STD-width compared with visual inspection signal quality (i.e., good/poor). We identified cut points that balanced both sensitivity and specificity, then applied them to the UofSC biking data. We used Kappa coefficients to examine agreement between signals identified as poor quality using the self-consistency and STD-width cut points and those identified by gold standard visual spectrogram analysis. The Kappa statistic accounts for agreement expected by chance [39]. Kappa was interpreted based on the following scale described by Landis and Koch [40]: ≤0, poor agreement; 0.01–0.20, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; and 0.81–1.00, almost perfect agreement. We then conducted a binomial logistic regression to examine the unique and additive value of self-consistency and STD-width in predicting signal quality.
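To make the agreement analysis concrete, a sketch under the assumption that qualityGood is a 0/1 vector from visual inspection and metric holds one of the continuous ROMA indices (Statistics and Machine Learning Toolbox; the variable names and the example cut point are illustrative):

```matlab
% Point-biserial correlation between a continuous ROMA metric and binary quality
rpb = corr(metric(:), double(qualityGood(:)));

% ROC curve and AUC treating the metric as a classifier of good quality
[fpr, tpr, thresholds, AUC] = perfcurve(qualityGood, metric, 1);

% Cohen's kappa between a cut-point classification and visual inspection
pred  = double(metric(:) > 30);                      % e.g., self-consistency > 30 => good
obs   = double(qualityGood(:));
po    = mean(pred == obs);                           % observed agreement
pe    = mean(pred) * mean(obs) + (1 - mean(pred)) * (1 - mean(obs));  % chance agreement
kappa = (po - pe) / (1 - pe);
```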
Associations Between Signal Quality and Heart Rate Accuracy: To examine the impact of signal quality on HR accuracy, we calculated the root mean square error (RMSE), mean absolute error (MAE) and mean absolute percent error (MAE(%)) between the calculated HR and the ECG criterion heart rate. The formulas used for calculating these metrics are as follows:
\mathrm{RMSE} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - y_i)^2}
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |x_i - y_i|
\mathrm{MAE}(\%) = \frac{1}{n} \sum_{i=1}^{n} \frac{|x_i - y_i|}{x_i}
where $x_i$ and $y_i$ are the PPG and Polar estimated HR, respectively, at the $i$th aligned time point. We also report accuracy, defined as the percentage of points within 5 BPM of the criterion. We then conducted Pearson correlations with 95% Bayes credible intervals (95% CI) to examine the association between RMSE, MAE and the ROMA signal quality metrics of self-consistency and STD-width.
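These error metrics translate directly into a few lines, assuming x and y are aligned vectors of PPG-estimated and Polar/ECG reference HR (the names are ours):

```matlab
% HR error metrics following the formulas above
e      = x - y;
n      = numel(e);
RMSE   = sqrt(sum(e.^2) / (n - 1));      % note the n - 1 denominator used in the text
MAE    = mean(abs(e));
MAEpct = 100 * mean(abs(e) ./ x);        % MAE(%), relative to the PPG estimate per the text
acc5   = 100 * mean(abs(e) <= 5);        % "accuracy": % of points within 5 BPM of the criterion
```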

3. Results

3.1. Part 1—Signal Quality Agreement

BUT PPG: Of the 48 observations in the BUT PPG dataset, 35 were marked as good quality; the remaining 13 were identified as poor quality per the reference visual inspection criterion. Self-consistency was correlated with the binary signal quality indicator (r = 0.33, 95% CI 0.76 to 0.56), but STD-width was not (r = −0.15, 95% CI −0.41 to 0.12). ROC analysis of the BUT PPG data revealed an AUC of 0.758 (95% CI 0.624 to 0.892) for STD-width and 0.741 (95% CI 0.589 to 0.883) for self-consistency. Based on the optimal balance of sensitivity and specificity, we identified cut-off scores of >30 for self-consistency (sensitivity = 0.615, specificity = 0.800) and <10 for STD-width (sensitivity = 0.923, specificity = 0.571).
Using the identified cut-off score of STD-width < 10, 27 of the BUT PPG observations were identified as poor quality; using the self-consistency cut-off score of >30, 14 observations were identified as poor quality. A forward stepwise binary logistic regression model revealed that STD-width < 10 was a significant predictor of signal quality, explained 30% of the variance in signal quality (Nagelkerke R²) and correctly classified 73% of cases. Self-consistency > 30 did not add significant predictive value beyond STD-width and thus did not meet the criteria to be entered into the logistic regression model.
UofSC Biking Protocol: In the 19 sessions of the UofSC biking dataset, self-consistency for PPG signals ranged from 12.05 to 98.26, and STD-width ranged from 3.05 to 12.82 (see Table 1). There was a strong correlation between the visual inspection criterion of signal quality and both self-consistency (r = 0.69, 95% CI 0.44 to 0.89) and STD-width (r = −0.64, 95% CI −0.87 to −0.36). Using the cut points identified above, 6 of 19 observations were identified as poor quality. There was almost perfect agreement [40] (Kappa = 0.872) between signals identified as poor quality using the criterion of visual spectrogram analysis and the self-consistency > 30 and STD-width < 10 metrics. There was perfect collinearity (r = 1.00) between the binary STD-width and self-consistency classifications; thus, logistic regressions could not be conducted.
TROIKA: Across the 12 sessions, self-consistency for PPG signals ranged from 32 to 90, and STD-width ranged from 5.7 to 20.6. The criterion of visual spectrogram analysis was correlated with both STD-width (r = −0.50, 95% CI −0.85 to −0.08) and self-consistency (r = 0.62, 95% CI 0.24 to 0.91). Using the identified cut point of STD-width < 10, 5 of the TROIKA observations were identified as poor quality; using the self-consistency cut-off score of >30, 0 observations were identified as poor quality. There was substantial agreement [40] (Kappa = 0.633) between signals identified as poor quality using visual spectrogram analysis and STD-width < 10. Because 100% of the signals were deemed high quality per the self-consistency metric, Kappa could not be calculated.

3.2. Part 2—Associations between Signal Quality and Heart Rate Accuracy

TROIKA: The overall correlation between signal quality and HR error (RMSE between PPG and the ground truth ECG) was r = 0.56 (95% CI 0.17 to 0.90) for STD-width (see Figure 5a) and r = −0.38 (95% CI −0.81 to 0.08) for self-consistency (see Figure 5b). Similarly, MAE was positively correlated with STD-width, r = 0.53 (95% CI 0.12 to 0.87), and negatively correlated with self-consistency, r = −0.30 (95% CI −0.76 to 0.18).
UofSC Biking Data: Individual trial-level accuracy, self-consistency, STD-width and errors (RMSE, MAE) are presented in Table 1. Aggregated averages stratified by signal quality are presented in Table 2. T-tests revealed significant differences in accuracy between sessions identified as having poor (n = 5) and adequate (n = 14) signal quality using the binary cut points of self-consistency > 30 and STD-width < 10 (see Table 2).
Signal quality was highly correlated with HR accuracy (MAE and RMSE, respectively) between PPG and the ground truth Polar ECG HR. Across all participants, the overall correlation between signal quality and HR error (RMSE between PPG and the ground truth ECG) was r = 0.77 (95% CI 0.57 to 0.92) for STD-width (see Figure 5a) and r = −0.73 (95% CI −0.91 to −0.51) for self-consistency (see Figure 5b). Similarly, MAE was positively correlated with STD-width, r = 0.78 (95% CI 0.59 to 0.93), and negatively correlated with self-consistency, r = −0.69 (95% CI −0.90 to −0.46).
Performance Comparison with Other Works: We compared our metrics of agreement (kappa) and association (correlation) with previous signal quality identification works which presented either kappa or correlation statistics (See Table 3).
Neshitov et al. [41] also examined the TROIKA dataset for corrupted signals using wavelet transformation and found a similar rate of poor-quality signals (~40%). Their signal discarding ratio and the current study’s self-consistency metric were highly correlated (r = −0.742).
Table 3. Signal quality identification: performance comparison with other works.
Study | Trials | Time (s) | N | Kappa | Correlation
Sukor [42] | 104 | 60 | 13 | 0.64 | -
Orphanidou [14] | 1500 | 10 | 7 | - | 0.86
ROMA—UofSC Bike | 19 | 1320 | 11 | 0.87 | 0.69
ROMA—Troika | 12 | 300 | 12 | 0.63 | 0.62

4. Discussion

The purpose of this study was to describe a novel, computationally efficient method to identify and quantify poor PPG signal quality. This is a necessary first step to recover signal information and produce accurate HR estimates. We demonstrated an effective method to identify poor PPG signal quality in both existing and original data and showed that signal quality is associated with HR accuracy. Identifying poor PPG signal is a critical first step before signal recovery methods can be used to ultimately produce accurate HR estimates. Both self-consistency and STD-width were associated with reference measures of signal quality. The new signal quality metrics were then associated with the accuracy of HR measured by PPG compared to an ECG in both existing and original data. Signal quality validity was evidenced by the strong correlation between signal quality and HR agreement between reference measures of HR (i.e., Polar telemetry and ECG) and PPG-produced HR estimates. These findings indicate that poor signals are indeed producing inaccurate estimates of HR. While existing studies in this area suggest that, over short durations, PPG signals can produce accurate estimations of HR [19], this evidence is based on signals that were not collected from free-living individuals and included activities that have limited motion artifacts. These studies therefore have limited utility in applied research settings where motion artifacts are a reality. If advances in engineering and signal processing aim to have a public health impact, they need to overcome challenges including motion artifacts.
Continuous measures of signal quality are needed to accurately distinguish valid HR measures in wearable devices. Current consumer wearable devices do not allow for open-source processing, and thus their metrics are fundamentally unverifiable. This is especially worrisome because consumer wearables are some of the most commonly used measures of physical activity in published studies, clinical trials and NIH-funded research [43]. However, similar concerns also exist among research-grade devices that use PPG, such as the Empatica E4 and Biovotion Everion, given that the manufacturers prevent access to raw data. Thus, while these devices will produce an HR estimate, the trustworthiness of that estimate is unknown.
The open-source metrics of signal quality described in this study can be used in future PPG devices that aim to measure HR in free-living settings. Ideally, such devices should be capable of measuring multiple vital parameters, and this is an underdeveloped area according to Biswas et al. in their recent review [21]. Further refinement should lead to the measurement of other hemodynamic markers through PPG, such as pulse wave velocity and augmentation index [44], both of which have high potential utility as health indicators [45,46]. Use of these markers could provide a more feasible alternative to existing measures of blood pressure and pulse wave analysis, which require higher patient burden [44]. Additionally, further work needs to examine the effectiveness of the ROMA method in diverse populations across developmental stages and in settings that have ecologically relevant motion artifacts. It is worth noting that although we presented the STD-width and self-consistency metrics against a binary criterion of visual inspection (good vs. bad), it is usually statistically preferable to work with the original continuous variables [47]. Indeed, using the continuous measures would allow applied researchers more flexibility over the minimal degree of signal quality deemed acceptable. Thus, while we present general guidelines for binary determination of signal quality, these are only intended to function as guidelines. Future research should aim to examine longer and more diverse PPG signals to examine the association between continuous signal quality and the accuracy of HR measurement.
Our study provides a computationally non-intensive method of estimating continuous signal quality from PPG collected from the chest. This is a foundational first step in the future of open-source signal processing. This finding also has high clinical utility for applied health researchers. Devices that collect PPG from the chest may be especially relevant for cardiac monitoring of children, as existing wrist-based wearable monitors may be uncomfortable or distracting for small children, especially in free-living conditions where children are asked to wear devices over multiple days. A shortcoming of the field is that advances in engineering are not readily adopted in public health research. Therefore, the next steps in this process are to use the metrics to identify signal quality, remove signal noise and then recover usable data. From there, HR processing in the frequency domain can potentially salvage poor signal data. These metrics will inform the processing of data from a completely open-source wearable device designed to measure HR using a chest-mounted PPG signal.
Study results should be interpreted in the context of their limitations. While our sample size is consistent with the existing literature [21], we only included 11 individuals in our study. Although this sample provided thousands of data points, it is challenging to generalize and compare these results to the larger population. Our study sample comprised a relatively homogenous group, consisting of mostly healthy, active, White individuals. While the evidence regarding the impact of skin tone on PPG signal quality appears limited [48], the magnitude of this effect on the population level across health metrics is still unknown [49]. To overcome these limitations, we used two additional publicly available datasets to supplement our results. It is necessary for such research to be open-source and accessible to researchers across domains. We can improve the synergy between basic and applied scientific fields by developing and using open-source research-grade devices to gather raw signal data and then sharing that data publicly using services such as PhysioNet [23]. With more data available, the ROMA method to identify poor signals can be further validated in more diverse populations and age groups.

5. Conclusions

Poor PPG signal appears to produce inaccurate estimates of HR. The approach developed in the current study allows for two continuous measures of signal quality, which can then be used to decide if functional information still exists in the signal, if measurements should be discarded or if the results can be interpreted with caution. The level of acceptable PPG signal quality may be dependent on the ultimate use of the device. Therefore, there is a need for a collaboration between engineering and public health researchers to continually develop and refine methods to measure and assess markers of individual and population level health. By creating a fully verifiable and easy to implement method of open-source processing, the scientific community can leverage team science and joint innovation across disciplines to ultimately improve measurements of HR which have applied utility in multiple settings, including medical contexts and public health.

Author Contributions

Conceptualization: B.A., M.V.S.C. and R.G.W.; methodology: B.A., M.V.S.C. and R.G.W.; software: M.V.S.C., J.M. and B.S.; validation: B.A., B.S., M.V.S.C. and R.G.W.; formal analysis: B.A., M.V.S.C. and R.G.W.; data curation, M.K.M., H.P. and M.T.S.; writing—original draft preparation, B.A., M.K.M. and M.V.S.C.; writing—review and editing, S.B., D.W.M., R.G.W., A.L. and M.K.M.; funding acquisition, B.A., M.V.S.C., R.G.W. and A.L. All authors have read and agreed to the published version of the manuscript.

Funding

Work on this article was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health for the UofSC Research Center for Child Well-Being under Award Number P20GM130420. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Work on this article was supported in part by the National Institute of Diabetes and Digestive and Kidney Diseases under Award Number R21DK131387.

Institutional Review Board Statement

The study protocol was conducted according to the guidelines of the Declaration of Helsinki, approved by the University of South Carolina IRB (Pro00107610). Informed consent was obtained from all subjects involved in the study.

Informed Consent Statement

The study was conducted in accordance with the Declaration of Helsinki. The study protocol was approved by the University of South Carolina IRB in August 2021 (Pro00107610). Informed consent was obtained from all subjects involved in the study prior to data collection.

Data Availability Statement

Data and the corresponding processing code are publicly available on PhysioNet and GitHub, respectively: https://github.com/ACOI-UofSC/Bike_Protocol.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Joyner, M.J. Preclinical and clinical evaluation of autonomic function in humans. J. Physiol. 2016, 594, 4009–4013. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Godoy, L.C.; Frankfurter, C.; Cooper, M.; Lay, C.; Maunder, R.; Farkouh, M.E. Association of Adverse Childhood Experiences with Cardiovascular Disease Later in Life: A Review. JAMA Cardiol. 2021, 6, 228–235. [Google Scholar] [CrossRef] [PubMed]
  3. Wang, G.; Atef, M.; Lian, Y. Towards a continuous non-invasive cuffless blood pressure monitoring system using PPG: Systems and circuits review. IEEE Circuits Syst. Mag. 2018, 18, 6–26. [Google Scholar] [CrossRef]
  4. Castaneda, D.; Esparza, A.; Ghamari, M.; Soltanpur, C.; Nazeran, H. A review on wearable photoplethysmography sensors and their potential future applications in health care. Int. J. Biosens. Bioelectron. 2018, 4, 195. [Google Scholar]
  5. Sana, F.; Isselbacher, E.M.; Singh, J.P.; Heist, E.K.; Pathik, B.; Armoundas, A.A. Wearable devices for ambulatory cardiac monitoring: JACC state-of-the-art review. J. Am. Coll. Cardiol. 2020, 75, 1582–1592. [Google Scholar] [CrossRef]
  6. Creaser, A.V.; Clemes, S.A.; Costa, S.; Hall, J.; Ridgers, N.D.; Barber, S.E.; Bingham, D.D. The Acceptability, Feasibility, and Effectiveness of Wearable Activity Trackers for Increasing Physical Activity in Children and Adolescents: A Systematic Review. Int. J. Environ. Res. Public Health 2021, 18, 6211. [Google Scholar] [CrossRef]
  7. Nakajima, K.; Tamura, T.; Miike, H. Monitoring of heart and respiratory rates by photoplethysmography using a digital filtering technique. Med. Eng. Phys. 1996, 18, 365–372. [Google Scholar] [CrossRef]
  8. Lu, G.; Yang, F.; Taylor, J.; Stein, J. A comparison of photoplethysmography and ECG recording to analyse heart rate variability in healthy subjects. J. Med. Eng. Technol. 2009, 33, 634–641. [Google Scholar] [CrossRef]
  9. Weiler, D.T.; Villajuan, S.O.; Edkins, L.; Cleary, S.; Saleem, J.J. Wearable heart rate monitor technology accuracy in research: A comparative study between PPG and ECG technology. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Austin, TX, USA, 9–13 October 2017; SAGE Publications: Los Angeles, CA, USA, 2017; Volume 61, pp. 1292–1296. [Google Scholar]
  10. Krishnan, R.; Natarajan, B.; Warren, S. Two-stage approach for detection and reduction of motion artifacts in photoplethysmographic data. IEEE Trans. Biomed. Eng. 2010, 57, 1867–1876. [Google Scholar] [CrossRef] [Green Version]
  11. Elgendi, M. Optimal signal quality index for photoplethysmogram signals. Bioengineering 2016, 3, 21. [Google Scholar] [CrossRef] [Green Version]
  12. Pradhan, N.; Rajan, S.; Adler, A.; Redpath, C. Classification of the quality of wristband-based photoplethysmography signals. In Proceedings of the 2017 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Rochester, MN, USA, 7–10 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 269–274. [Google Scholar]
  13. Li, H.; Huang, S. A High-Efficiency and Real-Time Method for Quality Evaluation of PPG Signals. In Proceedings of the 2019 International Conference on Optoelectronic Science and Materials, Hefei, China, 20–22 September 2019; IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2020; Volume 711, p. 012100. [Google Scholar]
  14. Orphanidou, C. Quality Assessment for the photoplethysmogram (PPG). In Signal Quality Assessment in Physiological Monitoring: State of the Art and Practical Considerations; Springer: Berlin/Heidelberg, Germany, 2018; pp. 41–63. [Google Scholar]
  15. Lee, S.; Ibey, B.L.; Xu, W.; Wilson, M.A.; Ericson, M.N.; Cote, G.L. Processing of pulse oximeter data using discrete wavelet analysis. IEEE Trans. Biomed. Eng. 2005, 52, 1350–1352. [Google Scholar] [CrossRef] [PubMed]
  16. Graybeal, J.; Petterson, M. Adaptive filtering and alternative calculations revolutionizes pulse oximetry sensitivity and specificity during motion and low perfusion. In Proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Francisco, CA, USA, 1–5 September 2004; IEEE: Piscataway, NJ, USA, 2004; Volume 2, pp. 5363–5366. [Google Scholar]
  17. Kumar, A.; Komaragiri, R.; Kumar, M. A Review on Computation Methods Used in Photoplethysmography Signal Analysis for Heart Rate Estimation. Arch. Comput. Methods Eng. 2022, 29, 921–940. [Google Scholar]
  18. Kumar, A.; Komaragiri, R.; Kumar, M. Reference signal less Fourier analysis based motion artifact removal algorithm for wearable photoplethysmography devices to estimate heart rate during physical exercises. Comput. Biol. Med. 2022, 141, 105081. [Google Scholar]
  19. Dao, D.; Salehizadeh, S.M.; Noh, Y.; Chong, J.W.; Cho, C.H.; McManus, D.; Darling, C.E.; Mendelson, Y.; Chon, K.H. A robust motion artifact detection algorithm for accurate detection of heart rates from photoplethysmographic signals using time—Frequency spectral features. IEEE J. Biomed. Health Inform. 2016, 21, 1242–1253. [Google Scholar] [CrossRef] [PubMed]
  20. Reddy, K.A.; George, B.; Kumar, V.J. Use of fourier series analysis for motion artifact reduction and data compression of photoplethysmographic signals. IEEE Trans. Instrum. Meas. 2008, 58, 1706–1711. [Google Scholar] [CrossRef]
  21. Biswas, D.; Simões-Capela, N.; Van Hoof, C.; Van Helleputte, N. Heart rate estimation from wrist-worn photoplethysmography: A review. IEEE Sens. J. 2019, 19, 6560–6570. [Google Scholar] [CrossRef]
  22. Sartor, F.; Papini, G.; Cox, L.G.E.; Cleland, J. Methodological shortcomings of wrist-worn heart rate monitors validations. J. Med. Internet Res. 2018, 20, e10108. [Google Scholar] [CrossRef]
  23. Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.-K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 2000, 101, e215–e220. [Google Scholar] [CrossRef] [Green Version]
  24. Nemcova, A.; Smisek, R.; Vargova, E.; Maršánová, L.; Vitek, M.; Smital, L. Brno University of Technology Smartphone PPG Database (BUT PPG). PhysioNet 2021, 101, e215–e220. [Google Scholar] [CrossRef]
  25. Nemcova, A.; Vargova, E.; Smisek, R.; Marsanova, L.; Smital, L.; Vitek, M. Brno University of Technology Smartphone PPG Database (BUT PPG): Annotated Dataset for PPG Quality Assessment and Heart Rate Estimation. BioMed Res. Int. 2021, 2021, 3453007. [Google Scholar] [CrossRef]
  26. Zhang, Z.; Pi, Z.; Liu, B. TROIKA: A general framework for heart rate monitoring using wrist-type photoplethysmographic signals during intensive physical exercise. IEEE Trans. Biomed. Eng. 2014, 62, 522–531. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  27. Sachdeva, S. Fitzpatrick skin typing: Applications in dermatology. Indian J. Dermatol. Venereol. Leprol. 2009, 75, 93. [Google Scholar] [CrossRef] [PubMed]
  28. Puranen, A.; Halkola, T.; Kirkeby, O.; Vehkaoja, A. Effect of skin tone and activity on the performance of wrist-worn optical beat-to-beat heart rate monitoring. In Proceedings of the 2020 IEEE SENSORS, Rotterdam, The Netherlands, 25–28 October 2020; pp. 1–4. [Google Scholar]
  29. Oppel, E.; Kamann, S.; Reichl, F.X.; Högg, C. The Dexcom glucose monitoring system—An isobornyl acrylate-free alternative for diabetic patients. Contact Dermatitis 2019, 81, 32–36. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  30. Massa, G.G.; Gys, I.; Op‘t Eyndt, A.; Bevilacqua, E.; Wijnands, A.; Declercq, P.; Zeevaert, R. Evaluation of the FreeStyle® Libre flash glucose monitoring system in children and adolescents with type 1 diabetes. Hormone Res. Paediatr. 2018, 89, 189–199. [Google Scholar] [CrossRef] [PubMed]
  31. Schaffarczyk, M.; Rogers, B.; Reer, R.; Gronwald, T. Validity of the polar H10 sensor for heart rate variability analysis during resting state and incremental exercise in recreational men and women. Sensors 2022, 22, 6536. [Google Scholar] [CrossRef]
  32. Chen, M.J.; Fan, X.; Moe, S.T. Criterion-related validity of the Borg ratings of perceived exertion scale in healthy individuals: A meta-analysis. J. Sports Sci. 2002, 20, 873–899. [Google Scholar] [CrossRef]
  33. Yu, C.; Liu, Z.; McKenna, T.; Reisner, A.T.; Reifman, J. A method for automatic identification of reliable heart rates calculated from ECG and PPG waveforms. J. Am. Med. Inform. Assoc. 2006, 13, 309–320. [Google Scholar] [CrossRef]
  34. Ulaby, F.T.; Maharbiz, M.M. Circuits; NTS Press: Glasgow, UK, 2010. [Google Scholar]
  35. Chong, J.W.; Dao, D.K.; Salehizadeh, S.; McManus, D.D.; Darling, C.E.; Chon, K.H.; Mendelson, Y. Photoplethysmograph signal reconstruction based on a novel hybrid motion artifact detection–reduction approach. Part I: Motion and noise artifact detection. Ann. Biomed. Eng. 2014, 42, 2238–2250. [Google Scholar] [CrossRef]
  36. Selvaraj, N.; Mendelson, Y.; Shelley, K.H.; Silverman, D.G.; Chon, K.H. Statistical approach for the detection of motion/noise artifacts in Photoplethysmogram. In Proceedings of the 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, USA, 30 August–3 September 2011; pp. 4972–4975. [Google Scholar]
  37. Krishnan, R.; Natarajan, B.; Warren, S. Analysis and detection of motion artifact in photoplethysmographic data using higher order statistics. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 613–616. [Google Scholar]
  38. Yan, Y.-S.; Poon, C.C.; Zhang, Y.-T. Reduction of motion artifact in pulse oximetry by smoothed pseudo Wigner-Ville distribution. J. Neuroeng. Rehabil. 2005, 2, 3. [Google Scholar] [CrossRef] [Green Version]
  39. Cohen, J. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 1960, 20, 37–46. [Google Scholar] [CrossRef]
  40. Landis, J.R.; Koch, G.G. The measurement of observer agreement for categorical data. Biometrics 1977, 33, 159–174. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  41. Neshitov, A.; Tyapochkin, K.; Smorodnikova, E.; Pravdin, P. Wavelet analysis and self-similarity of photoplethysmography signals for HRV estimation and quality assessment. Sensors 2021, 21, 6798. [Google Scholar] [CrossRef] [PubMed]
  42. Sukor, J.A.; Redmond, S.; Lovell, N. Signal quality measures for pulse oximetry through waveform morphology analysis. Physiol. Meas. 2011, 32, 369. [Google Scholar] [CrossRef]
  43. Wright, S.P.; Collier, S.R.; Brown, T.S.; Sandberg, K. An analysis of how consumer physical activity monitors are used in biomedical research. FASEB J. 2017, 31, 1020.24. [Google Scholar]
  44. Elgendi, M. On the analysis of fingertip photoplethysmogram signals. Curr. Cardiol. Rev. 2012, 8, 14–25. [Google Scholar] [CrossRef] [PubMed]
  45. Chirinos, J.A.; Kips, J.G.; Jacobs, D.R.; Brumback, L.; Duprez, D.A.; Kronmal, R.; Bluemke, D.A.; Townsend, R.R.; Vermeersch, S.; Segers, P. Arterial wave reflections and incident cardiovascular events and heart failure: MESA (Multiethnic Study of Atherosclerosis). J. Am. Coll. Cardiol. 2012, 60, 2170–2177. [Google Scholar] [CrossRef] [PubMed]
  46. Kaess, B.M.; Rong, J.; Larson, M.G.; Hamburg, N.M.; Vita, J.A.; Levy, D.; Benjamin, E.J.; Vasan, R.S.; Mitchell, G.F. Aortic stiffness, blood pressure progression, and incident hypertension. JAMA 2012, 308, 875–881. [Google Scholar] [CrossRef] [Green Version]
  47. DeCoster, J.; Gallucci, M.; Iselin, A.-M.R. Best practices for using median splits, artificial categorization, and their continuous alternatives. J. Exp. Psychopathol. 2011, 2, 197–209. [Google Scholar] [CrossRef] [Green Version]
  48. Bent, B.; Goldstein, B.A.; Kibbe, W.A.; Dunn, J.P. Investigating sources of inaccuracy in wearable optical heart rate sensors. NPJ Digit. Med. 2020, 3, 18. [Google Scholar] [CrossRef] [Green Version]
  49. Sjoding, M.W.; Dickson, R.P.; Iwashyna, T.J.; Gay, S.E.; Valley, T.S. Racial bias in pulse oximetry measurement. N. Engl. J. Med. 2020, 383, 2477–2478. [Google Scholar] [CrossRef]
Figure 1. (a–c) Visualization of preliminary motion artifact removal. (a) Slow baseline drift from non-periodic motion artifacts isolated using a moving mean to smooth out the sharper systolic peaks that are to be isolated. This smooth baseline is then subtracted from the original signal so that only the desired HR signal remains. This is a time-domain high-pass filter. (b) Relative systolic peaks are tracked using a moving maximum function over 0.5 s. (c) Heart rate amplitudes are normalized using the amplitudes tracked in (b) so that all systolic features are of the same amplitude. Typical positive swings of +1 are typically mirrored by asymmetric negative swings of ~−0.5, enabling identification of each heartbeat. Given that the diastolic peaks are typically < 0, any sharp peak > 0.5 is identified as a systolic feature.
Figure 2. Poor quality PPG signal in the time domain. The very large swings are from contact with the skin being lost and re-formed. These large swings are also responsible for the streaking seen in the spectrogram in Figure 3b. In other words, sharp impulses in time become broad streaks in frequency through the short-time Fourier transform embedded in the spectrogram in Figure 3.
Figure 3. (a–c) Signal quality spectrograms. (a) Good signal quality spectrogram. (b) Poor signal quality spectrogram. (c) Good signal quality spectrogram with motion artifacts.
Figure 4. Heart rate processing system for PPG signal.
Figure 5. Association between PPG and ECG agreement (RMSE) and signal quality indices of STD-width (a) and self-consistency (b) in TROIKA and UofSC data. a Self-consistency is plotted as 1/self-consistency for ease of visual interpretation.
Table 1. Individual session signal quality, accuracy, RMSE and MAE metrics for UofSC Bike Data.
Session | ID | Accuracy | Self-Consistency | STD-Width | RMSE | MAE (BPM) | MAE (%)
1 | 2 | 14.38% | 23.2 | 11.63 | 26.38 | 26.34 | 24.90%
2 | 3 | 12.38% | 26.2 | 11.86 | 41.78 | 35.33 | 36.70%
3 | 3 | 15.43% | 25.91 | 11.19 | 39.76 | 34.32 | 30.91%
4 | 1 | 93.13% | 94.91 | 4.07 | 2.32 | 1.74 | 1.89%
5 | 1 | 96.57% | 98.26 | 3.05 | 1.66 | 1.4 | 1.45%
6 | 1 | 93.74% | 73.95 | 5.02 | 2.23 | 1.62 | 1.99%
7 | 1 | 93.22% | 93.37 | 3.86 | 2.28 | 1.78 | 1.97%
8 | 3 | 75.72% | 33.43 | 9.06 | 5.58 | 3.71 | 4.06%
9 | 9 | 90.54% | 67.35 | 4.61 | 2.7 | 2.03 | 1.79%
10 | 5 | 14.48% | 12.05 | 12.82 | 41.26 | 32.6 | 30.94%
11 | 2 | 62.83% | 58.14 | 8.56 | 10.43 | 6.3 | 5.27%
12 | 6 | 91.43% | 98.03 | 4.18 | 2.62 | 2.09 | 1.91%
13 | 7 | 87.86% | 96.57 | 5.49 | 2.56 | 2.07 | 2.19%
14 | 10 | 34.59% | 28.04 | 12.66 | 21.12 | 15.71 | 15.13%
15 | 13 | 69.51% | 44.38 | 6.58 | 8.82 | 5.76 | 5.80%
16 | 7 | 79.02% | 81.69 | 5.13 | 3.89 | 2.93 | 3.27%
17 | 10 | 78.20% | 41.46 | 3.90 | 15.46 | 6.58 | 5.83%
18 | 11 | 77.73% | 38.73 | 8.96 | 5.1 | 3.52 | 3.58%
19 | 14 | 89.67% | 83.96 | 3.92 | 2.95 | 2.11 | 1.81%
RMSE—root mean square error; MAE—mean absolute error; STD—standard deviation; accuracy defined as the % of points within 5 bpm of the criterion.
Table 2. Accuracy of HR estimation by signal quality in UofSC bike data.
Metric | Adequate Signal (N = 14): Mean | Min | Max | Std. Deviation | Poor Signal (N = 5): Mean | Min | Max | Std. Deviation | Sig.
Accuracy (%) | 84.23 | 62.83 | 96.57 | 10.34 | 18.25 | 12.38 | 34.59 | 9.20 | *
RMSE (BPM) | 4.90 | 1.66 | 15.46 | 4.02 | 34.06 | 21.12 | 41.78 | 9.62 | *
MAE (BPM) | 3.12 | 1.40 | 6.58 | 1.82 | 28.86 | 15.71 | 35.33 | 8.14 | *
MAE (%) | 3.06 | 1.45 | 5.83 | 1.59 | 27.72 | 15.13 | 36.70 | 8.18 | *
* p < 0.01 between poor and adequate signal quality.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

