Article

Signal Quality Assessment and Reconstruction of PPG-Derived Signals for Heart Rate and Variability Estimation in In-Vehicle Applications: A Comparative Review and Empirical Validation

1
School of Psychology, Georgia Institute of Technology, Atlanta, GA 30332, USA
2
Transportation Research Institute, College of Engineering, University of Michigan, Ann Arbor, MI 48109, USA
3
Driving Safety Research Institute, College of Engineering, University of Iowa, Iowa City, IA 52242, USA
*
Author to whom correspondence should be addressed.
Sensors 2025, 25(24), 7556; https://doi.org/10.3390/s25247556
Submission received: 18 September 2025 / Revised: 19 November 2025 / Accepted: 2 December 2025 / Published: 12 December 2025

Abstract

Electrocardiography (ECG) is widely recognized as the gold standard for measuring heart rate (HR) and heart rate variability (HRV). However, photoplethysmography (PPG) presents notable advantages in terms of wearability, affordability, and ease of integration into consumer devices, despite its susceptibility to motion artifacts and the absence of standardized processing protocols. In this study, we review current ECG and PPG signal processing methods and propose a signal quality assessment and reconstruction pipeline tailored for dynamic, in-vehicle environments. This pipeline was evaluated using data gathered from participants riding in an automated vehicle. Our findings demonstrate that while blood volume pulse (BVP) derived from PPG can provide reliable heart rate estimates and support extraction of certain HRV features, its utility in accurately capturing high-frequency HRV components remains constrained due to motion-induced noise and signal distortion. These results underscore the need for caution in interpreting PPG-derived HRV, particularly in mobile or ecologically valid contexts, and highlight the importance of establishing best practices and robust preprocessing methods to enhance the reliability of PPG sensing for field-based physiological monitoring.

1. Introduction

Cardiovascular health is a vital indicator of human well-being, with heart rate and its variability widely used to assess autonomic nervous system function. In clinical contexts, electrocardiogram (ECG or EKG) monitoring enables the detection of arrhythmias, myocardial infarctions, and other cardiac abnormalities [1]. Beyond clinical settings, heart rate (HR) and heart rate variability (HRV) have been used to assess stress, fatigue, and cognitive workload in in-field applications, particularly in driving, human factors, and mobile work environments [2].
While ECG remains the gold standard for cardiac monitoring, the growing availability of light-based photoplethysmography (PPG) technology has enabled wearable solutions, such as wristbands and finger-clip sensors, that estimate cardiovascular metrics by detecting blood volume pulse (BVP) changes in the skin. These PPG systems offer practical advantages including comfort, affordability, and ease of use, and have been increasingly adopted in applied research contexts, such as detection of driver drowsiness [3], driver mental fatigue [4], and passenger ride comfort [5]. Despite these benefits, PPG presents several limitations. Compared to ECG, it is more susceptible to motion artifacts [6], skin tone variability [7], and inconsistent sensor contact pressure [6,8]. In addition, the lack of standardized hardware configurations, acquisition protocols, and signal processing pipelines introduces variability that undermines reproducibility and generalizability [9]. Although pulse rate estimates derived from PPG are generally accurate in healthy individuals at rest, questions remain regarding their validity as a surrogate for ECG-derived HRV, especially in dynamic or ecologically relevant conditions typical of in-field applications. Recent advances in machine learning-assisted point-of-care diagnostics further illustrate the movement toward real-time, automated cardiovascular monitoring [10].
To address this gap, the present study pursues three primary objectives. First, we review the existing literature on ECG and PPG signal characteristics and outline emerging trends in signal processing methods relevant to wearable and mobile sensing. Second, we evaluate the effectiveness of current BVP processing techniques for cardiovascular tracking using a dataset gathered from participants during an in-vehicle field study. Finally, we provide practical recommendations and highlight key considerations for implementing and refining PPG-based monitoring pipelines for in-vehicle and other dynamic applications.

1.1. ECG and PPG Processing: Previous Research

1.1.1. ECG

ECG measures the electrical activity generated by the depolarization and repolarization of the heart muscle during each cardiac cycle [11]. In clinical and laboratory settings, ECG signals are typically acquired using electrodes placed on the chest and limbs to ensure high-fidelity recordings [12]. For ambulatory and activities-of-daily-living monitoring, portable ECG devices such as Holter monitors allow continuous tracking of cardiac rhythms, particularly for the detection of arrhythmias. These devices are commonly worn with chest straps; however, their bulk and prolonged skin contact can lead to discomfort or irritation [13].
Each heartbeat in the ECG waveform is characterized by a sequence of electrical events: the P-wave (atrial depolarization), QRS complex (ventricular depolarization), and T-wave (ventricular repolarization) (see Figure 1) [1]. Processing of ECG signals generally involves high-pass and low-pass filtering, fiducial point detection using established algorithms, and extraction of features for subsequent analysis [1,14,15,16,17,18,19,20].
Key ECG-derived parameters include HR and HRV. HRV provides insight into autonomic nervous system function and is typically quantified through time-domain, frequency-domain, and nonlinear (geometric and entropy-based) measures [20,21]. Time-domain metrics such as SDNN (standard deviation of all NN intervals), RMSSD (root mean square of successive differences), and pNN50 (the percentage of NN interval differences greater than 50 msec) quantify beat-to-beat interval variability. These metrics focus specifically on normal-to-normal (NN) intervals and are primarily indicative of parasympathetic modulation. Frequency-domain metrics decompose heart rate variability into spectral components: ultra-low frequency (ULF) and very-low frequency (VLF) components reflect long-term regulatory mechanisms; low-frequency (LF) power captures mixed sympathetic-parasympathetic modulation; high-frequency (HF) power is associated predominantly with parasympathetic (vagal) activity; and the LF/HF ratio is commonly interpreted as an index of sympathovagal balance, although this interpretation remains debated in the literature. Nonlinear metrics include Poincaré plot-based SD1/SD2 ratio, approximate entropy, sample entropy, multiscale entropy (MSE), and detrended fluctuation analysis (DFA) of heartbeat time series. These measures provide insights beyond linear analyses, particularly in understanding autonomic regulation under stress, fatigue, or cognitive load.
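The time-domain metrics defined above can be illustrated with a minimal sketch. It assumes the NN intervals are already available as a list in milliseconds; the function name `hrv_time_domain` and the example values are hypothetical and are not taken from the study.

```python
import math

def hrv_time_domain(nn_ms):
    """Return SDNN and RMSSD (ms) and pNN50 (%) for NN intervals given in ms."""
    if len(nn_ms) < 2:
        raise ValueError("need at least two NN intervals")
    n = len(nn_ms)
    mean_nn = sum(nn_ms) / n
    # SDNN: sample standard deviation of all NN intervals
    sdnn = math.sqrt(sum((x - mean_nn) ** 2 for x in nn_ms) / (n - 1))
    # Successive differences between adjacent NN intervals
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    # RMSSD: root mean square of the successive differences
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: percentage of successive differences larger than 50 ms
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {"sdnn": sdnn, "rmssd": rmssd, "pnn50": pnn50}

# Illustrative NN intervals (ms), not study data
nn = [800, 810, 790, 860, 800]
metrics = hrv_time_domain(nn)
print(metrics)
```

SDNN is computed here with the sample (n − 1) denominator; some toolkits use the population form, so results from short segments may differ slightly between implementations.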
Although 5 min ECG recordings are typically recommended for robust HRV estimation, particularly for frequency-domain features like LF and HF, shorter segments may yield valid estimates under controlled conditions [21,22]. ECG signal processing techniques are well-standardized and governed by clinical guidelines issued by the American Heart Association (AHA) and the American College of Cardiology (ACC) [15]. Table 1 summarizes key ECG processing procedures and corresponding standards.

1.1.2. BVP

PPG is a light-based, noninvasive technology used to measure blood volume changes in the microvascular bed of the skin. A typical PPG system includes light-emitting diodes (LEDs) and a photodetector that captures either the transmitted or the reflected light, depending on the sensor configuration. PPG sensors are commonly configured in one of two ways. Transmissive PPG allows light to pass through the tissue (e.g., fingertip or earlobe), with the photodetector positioned opposite the emitter. This configuration generally offers higher signal clarity due to its well-defined optical path. Reflective PPG, by contrast, places the emitter and detector on the same side of the tissue. Light scattered back from the underlying vasculature is captured, following a nonlinear path. While this setup is more prone to noise and motion artifacts, it is more versatile and can be applied to body sites like the wrist, where transmissive sensing is impractical [22,23]. PPG sensors are most commonly integrated into smartwatches due to their convenience and wearability [13]. However, BVP waveform quality varies with the sensor’s anatomical location. Finger-based measurements consistently yield the highest signal fidelity, outperforming those from the wrist, arm, earlobe, and forehead in terms of analyzable waveforms [24].
Unlike ECG, which features well-defined QRS complexes, BVP waveforms primarily comprise systolic and diastolic phases (see Figure 1). These are highly sensitive to external influences, such as motion, ambient light, and sensor pressure, factors that distort fiducial points (e.g., pulse onset and peak), complicating feature extraction. Fine et al. [25] reviewed sources of inaccuracy in PPG, attributing them to physiological variability, vascular characteristics, and external technical conditions. While some sources of artifact, such as ambient light and pressure inconsistencies, can be mitigated through hardware design, motion artifacts remain particularly problematic in mobile or in-vehicle contexts. Their frequency overlaps with physiological pulse rates, making them difficult to filter cleanly using traditional methods.
To date, no universal gold standard exists for BVP signal processing [9,22]. Table 2 summarizes commonly employed processing options. Preprocessing often begins with bandpass filtering and pulse detection algorithms [22]. Artifact removal methods include adaptive filters [26,27,28], signal decomposition techniques (e.g., wavelets and empirical mode decomposition) [28,29,30], and hybrid approaches [31,32,33]. More recently, deep learning models, such as CycleGANs [34] and wavelet-based neural networks [35], have been explored to reconstruct clean signals from noisy inputs.
Due to the persistent challenges in isolating motion artifacts, signal quality assessment (SQA) has emerged as a critical preprocessing step. These methods aim to distinguish clean, analyzable PPG signals from corrupted ones. Orphanidou [36] offered an early review of automated filtering and machine learning-based classification techniques. More recent studies have achieved high classification performance: Shin [35] trained a deep convolutional neural network (CNN) model, tuned with Bayesian optimization, that achieved an area under the curve (AUC) of 0.98. Mohagheghian et al. [37] proposed an ensemble-based feature selection method, yielding over 93% classification accuracy across diverse signal types. Moscato et al. [38] introduced a model based on two support vector machines (SVMs), one optimized for heart rate estimation and the other for waveform morphology, achieving classification accuracies of 0.96 and 0.97, respectively. Despite these advancements, most models rely on expert-annotated labels, which introduces subjectivity, and require large, well-curated datasets for effective training, which remain scarce in real-world in-vehicle contexts.
Once cleaned, BVP signals are often processed using techniques adapted from ECG-based HRV analysis. Accurate identification of fiducial points, including the pulse onset, systolic peak, dicrotic notch, and diastolic peak, is essential and can be achieved using methods based on derivatives, slope thresholds, wavelets, or other mathematical operators [39]. In a benchmark study, Charlton et al. [40] evaluated 15 open-source algorithms. Among them, Multi-Scale Peak and Trough Detection (MSPTD) and qppg demonstrated superior performance across diverse scenarios, including exercise, neonatal recordings, and atrial fibrillation. When time-domain fiducials are unreliable due to noise, frequency-domain alternatives (e.g., spectral peak tracking) may be used to estimate pulse rate, as showcased in the IEEE Signal Processing Cup (SPC) 2015 challenge [41,42,43].
Despite progress, real-world deployment of PPG remains constrained by the limited availability of ecologically valid datasets. While several public datasets provide multimodal recordings (e.g., PPG, ECG, accelerometry) [44,45,46,47,48,49,50], most were collected under controlled or sedentary conditions. For example, the MIT-BIH Polysomnographic Database [44] contains multiple physiologic signals recorded during sleep studies, while the Complex System Lab’s dataset [46] offers clinically annotated recordings for conditions such as sepsis, traumatic brain injury, and cardiac events. Only a few datasets capture recordings during physical activity or motion-rich contexts [47]. Jarchi and Casson [48] extended this effort by collecting synchronized PPG, ECG, and motion sensor data (accelerometers and gyroscopes) during activities such as walking, biking, and running. However, their recordings remain limited to structured indoor environments. In contrast, naturalistic settings, such as in-vehicle conditions, introduce a range of complex disturbances, including vibration, abrupt directional changes, fine motor activity, ambient light variation, and distributional signal drift. These real-world complexities are poorly represented in existing datasets, limiting model generalization and constraining reliable field deployment.
Table 2. Recommended current practices for PPG-derived signal processing for applications involving HR and HRV estimation.
| Type | Norms | References |
|---|---|---|
| Sampling Rate | ≥25 Hz is suggested. | [51] |
| Low Frequency Filtering | 0.5 Hz to remove the direct current (DC) component below 0.1 Hz and respiratory component in the 0.1–0.5 Hz band. | [22] |
| High Frequency Filtering | 10 Hz, corresponding to the position of fourth harmonics at 150 bpm, or third harmonics at 200 bpm. | [22] |
| Artifact Removal | Decomposition-based: ICA, EMD, wavelet decomposition; adaptive filtering: RLS, LMS; deep neural network. | [26,27,28,29,30,31,32,33,34,35,52] |
| Fiducial Point Identification | Zero crossing (change of slope sign); local maxima/minima with adaptive thresholding; deep neural network. | [39,40] |
| Signal Quality Index | Machine learning with statistical features; deep neural network. | [36,37,38] |
| Feature Extraction | PR: spectral peak tracking; PRV: extract the pulse-pulse interval and calculate features similar to HRV. | [41,42,43,53] |

1.1.3. Studies Using Pulse Rate Features as Natural Proxies for Heart Rate Features

Several studies have examined the validity of pulse rate (PR) and pulse rate variability (PRV) as proxies for ECG-derived HR and HRV. Under resting conditions, PPG-derived PR and PRV often show high agreement with ECG-based metrics, requiring minimal artifact correction [54,55]. For instance, Menghini et al. [56] evaluated PRV across resting, public speaking, and recovery tasks, finding strong agreement with ECG-HR at rest but diminished PRV accuracy with even minor movement. Similarly, Milstein and Gordon [57] reported divergence between PRV and ECG-HRV during dyadic conversations, highlighting sensitivity to behavioral and social engagement. In a six-week neurofeedback study, Schuurmans et al. [58] observed consistent correlations between PR and ECG-HR across sessions, whereas PRV exhibited variability during cognitively or emotionally demanding tasks. Van Voorhees et al. [59] compared wearable PPG and Holter ECG recordings during ambulatory activity and found low PRV reliability over short windows (1–5 min), reinforcing concerns about PRV robustness in field-based, non-resting contexts. In addition, peripheral PRV may capture physiological variance absent in ECG-derived HRV. Mejía-Mejía [60] reported that PRV responses to cold exposure differed from ECG-HRV, particularly at peripheral measurement sites, likely due to vascular or thermoregulatory effects. Because PPG captures the pulse wave after a temporal delay relative to cardiac depolarization, PRV may also reflect influences from the peripheral nervous system and local vascular dynamics [61]. Consequently, the use of PRV as a surrogate for HRV remains an area of active investigation, particularly in dynamic, in-vehicle or field-based environments where motion artifacts, signal degradation, and physiological noise complicate reliable estimation.
In summary, ECG directly measures cardiac electrical activity with clearly defined fiducial points and minimal motion sensitivity, whereas PPG reflects peripheral hemodynamic fluctuations that are inherently more susceptible to motion and sensor-contact variability. Recent developments in cardiovascular monitoring increasingly integrate PPG with auxiliary sensors (e.g., accelerometers) or employ deep-learning-based denoising and signal-quality modeling to mitigate these limitations. Together, these efforts mark a broader shift toward hybrid and context-aware processing frameworks that aim to balance interpretability, robustness, and ecological validity.

1.2. Present Study Objectives

Given the diversity of SQA approaches and BVP processing methods, there remains a critical need for a processing pipeline tailored specifically to the challenges of in-field PPG acquisition, particularly in dynamic, motion-rich environments such as vehicles. In these contexts, PPG signals are often degraded by motion artifacts, variable lighting, and sensor displacement, requiring more advanced processing than that used in traditional resting-state applications. Unlike most psychophysiological studies, which focus on sleep or sedentary conditions, emerging use cases involve light to moderate activity, such as passive vehicle riding or task engagement. To address these challenges, we propose a lightweight and interpretable PPG processing pipeline designed specifically for pulse rate monitoring under real-world dynamic conditions. Our approach prioritizes transparency and physiological plausibility, leveraging domain-informed signal features rather than relying on data-intensive deep learning models. This design achieves robust performance with reduced computational overhead and increased interpretability. The proposed pipeline is validated using a custom dataset collected from participants riding as passengers in an autonomous vehicle, a representative in-field setting that captures realistic sources of signal degradation. Distinct from prior multi-sensor fusion or black-box data-driven deep learning approaches, this study introduces an interpretable, single-channel SQA and spectral reconstruction pipeline calibrated against ECG-derived ground truth, specifically designed for motion-rich conditions in vehicle environments.
This study has three primary aims: (1) to assess the accuracy of PPG-derived HR and HRV estimates under dynamic, in-vehicle conditions, and to determine which parameters can reliably serve as surrogates for ECG-derived features; (2) to evaluate the effectiveness of various signal processing techniques, including signal quality thresholding, spectral reconstruction, and fiducial point detection, in recovering and improving PPG-derived metrics with respect to ECG references; and (3) to explore trade-offs between maximizing usable data and preserving signal fidelity, and to offer practical recommendations for the design and deployment of PPG signal processing pipelines in ecologically valid, real-world applications.

2. Methods

2.1. Data Source

A total of 114 trials (57 women and 57 men) were included for the current analysis. Data were collected as a part of a closed test-track study, with participants seated in the front passenger seat of an instrumented automated vehicle (AV). During the session, participants experienced 30 consecutive AV-generated acceleration-braking events while engaging in a visual task on a handheld device. Each event consisted of an acceleration from rest, a constant-speed cruising phase of variable duration, and a deceleration to a complete stop. These data were collected as part of a larger study on passenger motion sickness. The structured event sequence emulated stop-and-go traffic conditions and passenger engagement with handheld devices, enabling assessment of PPG signal resilience to motion artifacts during visually demanding, dynamic conditions.
Each participant wore two physiological monitoring devices to enable simultaneous ECG and PPG data collection. ECG signals were acquired using the Zephyr™ BioHarness™ 3 (Medtronic, Minneapolis, MN, USA), a wearable chest strap positioned immediately below the pectoral muscle, with the Biomodule aligned under the left arm. The BioHarness device employs a single-lead ECG configuration using conductive chest-strap electrodes and records R-R intervals at 250 Hz. Strap tension was standardized across participants using the integrated tension indicator loop to ensure consistent electrode contact. PPG signals, recorded as BVP, were collected using the Empatica E4 wristband (Empatica, Inc., Cambridge, MA, USA) worn on the non-dominant wrist just proximal to the wrist joint. The PPG sensor, embedded in the underside of the wristband, captured vascular signals at 64 Hz by measuring variations in reflected light as blood volume changed with each heartbeat. In addition to PPG, the Empatica E4 recorded three-axis accelerometer data at 32 Hz, providing contextual information about movement during the in-vehicle protocol (see: https://www.empatica.com/en-int/store/e4-wristband/, accessed on 4 September 2025). It should be noted that, as of February 2025, the Empatica E4 and its associated software suite were formally discontinued (see: https://www.empatica.com/research/e4-sunset/, accessed on 4 September 2025). This discontinuation does not affect the integrity of data collected prior to the sunset period.
Prior to the in-vehicle session, participants completed two baseline recordings. First, a five-minute resting baseline was recorded while participants sat comfortably in a quiet indoor environment, following procedures outlined by Winslow et al. [62]. A second three-minute resting baseline was conducted inside the stationary vehicle. For both recordings, participants were instructed to remain still, avoid speaking, and minimize movement to ensure signal integrity. Immediately after the in-vehicle baseline, the experimental condition began. Both the Zephyr BioHarness and Empatica E4 devices recorded continuously throughout the ~20 min in-vehicle riding protocol.
Four trials were excluded from the final dataset due to quality issues: one due to absence of PPG data during the BioHarness recording period, one due to an early ECG termination, and two due to abnormal ECG waveforms lacking discernible beat structure. As a result, a total of 110 trials were included in the final analysis.

2.2. Data Processing

An overview of the data processing pipeline and the evaluation framework for assessing PR and PRV as surrogates for ECG-derived HR and HRV is shown in Figure 2. Detailed descriptions of the SQA modeling, signal reconstruction, fiducial point recognition, and feature extraction are provided in the following sections.
The processing of BioHarness ECG signals followed established gold-standard recommendations from the literature (see Table 1). Raw ECG signals were processed using the Pan–Tompkins algorithm, a widely used QRS-complex detection method based on derivative filtering, adaptive thresholding, and integration [19]. The processed ECG data were then segmented into analysis epochs of varying lengths, ranging from 5 to 30 s. Within each epoch, interbeat intervals (IBI) were computed as the time difference between consecutive QRS detections. To improve data quality, a two-step IBI outlier removal procedure was applied: (1) physiological range filtering: IBIs outside 300–1500 msec (equivalent to 40–200 bpm, the typical heart rate range in healthy adults) were excluded, consistent with established thresholds [15]; and (2) relative deviation filtering: IBIs deviating more than 30% from the mean of the four most recently accepted IBIs were also removed. The initial four IBIs were manually reviewed and selected to establish a reliable starting reference. Because ECG provides unambiguous fiducial points that remain stable under mild motion, it served as the gold-standard reference for all subsequent comparisons. PPG-derived PR and PRV metrics were then evaluated relative to these ECG-derived HR and HRV measures to quantify reliability.
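The two-step IBI outlier removal described above can be sketched as follows. This is a minimal illustration rather than the study's implementation: the function name `clean_ibis`, the `seed_ms` argument (standing in for the four manually reviewed starting IBIs), and the example values are assumptions.

```python
def clean_ibis(ibis_ms, seed_ms, lo=300.0, hi=1500.0, max_rel_dev=0.30):
    """Filter interbeat intervals (ms) with a range check and a relative-deviation check.

    seed_ms: four trusted IBIs used as the initial reference (assumed to be
    reviewed by hand, as in the text); they are not included in the output.
    """
    if len(seed_ms) != 4:
        raise ValueError("seed_ms must contain exactly four IBIs")
    recent = list(seed_ms)   # the four most recently accepted IBIs
    accepted = []
    for ibi in ibis_ms:
        # Step 1: physiological range, 300-1500 ms (40-200 bpm)
        if not (lo <= ibi <= hi):
            continue
        # Step 2: reject IBIs more than 30% away from the running reference mean
        ref = sum(recent) / len(recent)
        if abs(ibi - ref) > max_rel_dev * ref:
            continue
        accepted.append(ibi)
        recent = recent[1:] + [ibi]   # slide the four-beat reference window
    return accepted

# Illustrative values (ms), not study data
seed = [800, 810, 790, 800]
raw = [805, 250, 1200, 820, 1600, 780]
cleaned = clean_ibis(raw, seed)
print(cleaned)
```

Because the reference window only advances on accepted beats, a burst of artifacts cannot drag the reference away from the last trusted rhythm; the trade-off is that a genuine abrupt change in heart rate may be rejected for several beats.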
Signal processing of the PPG data followed common practices outlined in the literature (see Table 2). The PPG signal, collected from the Empatica E4 wristband, was filtered using a fourth-order Butterworth bandpass filter with cutoff frequencies at 0.5 Hz and 10 Hz, to suppress baseline drift and high-frequency noise while preserving frequency components related to heart rate, including up to the third harmonic of 200 bpm. Concurrent accelerometer data were filtered using a bandpass filter with cutoff frequencies of 0.025 Hz and 10 Hz to remove motion artifacts and DC bias. These preprocessed accelerometer signals were then incorporated into the signal-quality assessment model through a set of time- and frequency-domain motion features (e.g., amplitude variability, spectral power, and cross-spectral coupling with BVP) to quantify motion-induced contamination. Both PPG and accelerometer signals were segmented into analysis windows of 5 to 30 s, synchronized with the ECG-based epochs. Within each epoch, a fast Fourier transform (FFT) was performed to extract frequency-domain features with a 1 bpm resolution. An SQA classifier was applied to each epoch, trained using binary labels (e.g., usable vs. unusable) determined by the discrepancy between PR, estimated from the dominant spectral peak in the PPG signal, and the reference HR derived from ECG. Epochs labeled as unusable were excluded from subsequent analyses. The remaining PPG epochs underwent frequency-domain signal reconstruction using a notch-filter-based denoising procedure to further enhance waveform quality. From these cleaned signals, PRV features were extracted using the same computational framework applied to ECG-derived HRV, enabling direct comparison.
Finally, ablation studies were conducted to examine the relative contribution of each component, SQA and signal reconstruction, by systematically removing these steps and evaluating the trade-offs between data retention and feature accuracy when using PPG-derived metrics as surrogates for ECG-based HR and HRV.
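The spectral pulse-rate estimate used in the pipeline above (dominant peak of the PPG spectrum at 1 bpm resolution) can be illustrated with a small sketch. To stay dependency-free, it evaluates the discrete Fourier sum directly on a 1 bpm grid between 40 and 200 bpm instead of calling an FFT; this substitution, the function name `dominant_pulse_rate`, and the synthetic test signal are assumptions made for illustration only.

```python
import math

def dominant_pulse_rate(signal, fs, bpm_min=40, bpm_max=200):
    """Return the frequency (in bpm) with maximal spectral power on a 1 bpm grid."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]   # remove the DC offset
    best_bpm, best_power = None, -1.0
    for bpm in range(bpm_min, bpm_max + 1):
        # Angular frequency in radians per sample for this candidate rate
        w = 2.0 * math.pi * (bpm / 60.0) / fs
        re = sum(x * math.cos(w * k) for k, x in enumerate(centered))
        im = sum(x * math.sin(w * k) for k, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

# Synthetic 10 s epoch at 64 Hz: 72 bpm fundamental plus a weaker second harmonic
fs = 64
ppg = [math.sin(2 * math.pi * 1.2 * k / fs) + 0.3 * math.sin(2 * math.pi * 2.4 * k / fs)
       for k in range(10 * fs)]
pr = dominant_pulse_rate(ppg, fs)
print(pr)
```

On real epochs, a zero-padded FFT would give the same 1 bpm grid far more efficiently; the direct sum is shown only because it makes the frequency grid explicit.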

2.2.1. Signal Quality Assessment

We developed an SQA classification model using a comprehensive set of time- and frequency-domain features derived from both PPG-derived BVP and accelerometer signals. This model builds upon and extends methods proposed in prior literature [30,36,37,38]. Figure 3 illustrates the full signal-quality assessment (SQA) pipeline, including preprocessing, frequency-domain analysis, and feature extraction for both BVP and accelerometer signals. A detailed list of features incorporated in the SQA model is provided in Table 3. Among the morphological features, we included kurtosis, skewness, and Shannon entropy of the BVP signal, as these metrics are widely recognized as reliable indicators of signal quality [63,64]. In the frequency domain, we extracted the relative power of the estimated heart rate frequency (HRF), specifically the dominant spectral peak within the HR band, and its second and third harmonics, to characterize the spectral sharpness and prominence of the HR-related peaks [65]. To identify abrupt disruptions caused by motion or noise, we computed the absolute difference between the current HRF and the 1 min median HRF. Following prior work [63], we also included frequency-domain kurtosis of the BVP signal and bispectral self-coupling, which reflect spectral sparsity and cross-frequency coupling, respectively. To capture motion-related interference, we derived the mean and standard deviation of the accelerometer signal magnitude, along with the relative power of the accelerometer signal within the heart rate band and the maximum cross-bicoherence between the accelerometer and PPG signals, quantifying the extent of motion contamination in the frequency domain [66].
Because it relies on a compact set of twelve explicitly defined signal-quality and motion-related features, the proposed model remained computationally lightweight, requiring only standard feature extraction and a single classification step, while maintaining direct physiological interpretability for each feature.
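As an illustration of the three time-domain morphological features named above, the following sketch computes kurtosis, skewness, and a histogram-based Shannon entropy for one epoch. It is not the study's code: moments are taken about the epoch mean, and the 16-bin histogram used for the entropy estimate is an assumed choice.

```python
import math

def morphology_features(x, n_bins=16):
    """Kurtosis, skewness, and Shannon entropy (bits) of one signal epoch."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    if m2 == 0:
        raise ValueError("constant signal: moments are undefined")
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2                    # non-excess kurtosis (Gaussian = 3)
    # Shannon entropy of the amplitude histogram (n_bins is an assumption)
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in x:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    return {"kurtosis": kurtosis, "skewness": skewness, "entropy": entropy}

# Synthetic epoch: ten full cycles of a sinusoid, 64 samples per cycle
epoch = [math.sin(2 * math.pi * k / 64) for k in range(640)]
features = morphology_features(epoch)
print(features)
```

For a pure sinusoid the kurtosis is 1.5 and the skewness is zero, which gives a convenient sanity check before applying the function to real BVP epochs.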
Epoch labeling for training the SQA classifier was performed automatically. Epochs were labeled as “acceptable” if the BVP signal contained the dominant spectral peak that matched the ECG-derived HR within ±5 bpm tolerance. This frequency-domain approach enabled a preliminary signal-quality check, reducing reliance on morphological pulse detection and allowing a more flexible thresholding for identifying usable epochs. These quality labels also served as a basis for downstream signal reconstruction, discussed in the subsequent section.
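The automatic labeling rule above reduces to a single comparison per epoch. The sketch below assumes an inclusive ±5 bpm boundary and uses "unacceptable" as the name of the negative class; both are illustrative choices, as are the example values.

```python
def label_epoch(ppg_peak_bpm, ecg_hr_bpm, tol_bpm=5.0):
    """Label an epoch by comparing the PPG spectral peak with the ECG-derived HR."""
    if abs(ppg_peak_bpm - ecg_hr_bpm) <= tol_bpm:
        return "acceptable"
    return "unacceptable"

# Illustrative (PPG peak, ECG HR) pairs in bpm, not study data
pairs = [(72, 70), (96, 71), (65, 70), (130, 68)]
labels = [label_epoch(p, e) for p, e in pairs]
share_acceptable = labels.count("acceptable") / len(labels)
print(labels, share_acceptable)
```

Because the label depends only on the dominant spectral peak, an epoch can be marked acceptable even when individual pulse shapes are distorted, which is why the text treats this as a preliminary check rather than a full morphological assessment.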
Table 3. Feature extraction of PPG-derived and accelerometer sensor measures for the SQA classification model.
| Source | Feature | Description | References |
|---|---|---|---|
| BVP time domain | Kurtosis | Scaled version of the fourth moment of the PPG distribution, representing the tailedness of the PPG signal distribution. Measures peakedness of the waveform; high kurtosis may indicate clean, sharp pulse waves. | [30] |
| BVP time domain | Skewness | Measure of the asymmetry of the PPG signal around zero. Describes waveform asymmetry; deviations from symmetry may signal noise or distortion. | [30] |
| BVP time domain | Shannon entropy | Measure of the disorder in the PPG signal probability distribution. Quantifies signal complexity; higher entropy may suggest irregularity due to noise. | [30] |
| BVP frequency domain | Spectral kurtosis | Scaled version of the fourth moment of the PPG spectral distribution, representing the tailedness of the PPG frequency-domain signal. Detects spectral sparsity; flatter spectra may indicate noise or artifact. | [30] |
| BVP frequency domain | Relative power of dominant peak | Power ratio of the dominant peak in the PPG spectrum compared to the total power. Power of the peak frequency in the heart rate band; used to confirm signal periodicity. | [67] |
| BVP frequency domain | Relative power of harmonics | Power ratio of the 2nd and 3rd harmonics of the PPG spectral dominant peak compared to the total power. Power in harmonic components; supports waveform integrity checks. | [67] |
| BVP frequency domain | HRF deviation from moving median | Absolute difference between the spectral peak of the current PPG epoch and the median spectral peak of the nearest 1 min segment. Measures abrupt change in pulse frequency; used to detect transient noise. | |
| BVP frequency domain | Bispectral self-coupling | Number of self-coupling events among the three most prominent peaks (f0, f1, f2) in the diagonal slice of the bispectrum. Assesses cross-frequency coupling around HRF; reduced coupling may signal distortion. | [68] |
| Accelerometer time domain | Amplitude mean | Average magnitude of the accelerometer data. Indicates overall movement intensity; elevated values may suggest a potential motion artifact. | [38] |
| Accelerometer time domain | Amplitude SD | Standard deviation of the accelerometer magnitude. Captures motion variability; high standard deviation often correlates with motion-induced noise. | [38] |
| Accelerometer frequency domain | Maximal cross-bicoherence to PPG | Maximum bicoherence between the PPG signal and the accelerometer data from the x-, y-, or z-axis. Estimates nonlinear coupling between motion and the pulse signal; high values imply motion contamination. | [66] |
| Accelerometer frequency domain | Relative power of heart rate frequency band | Relative power of the [2/3, 10/3] Hz band, corresponding to the heart-rate frequency band ranging from 40 BPM to 200 BPM. Measures motion energy overlapping the HR band; used to detect confounding artifact sources. | Inspired by [69] |
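As an illustration of how a few of the Table 3 features might be computed for one epoch, the sketch below covers the three BVP time-domain features and the relative power of the dominant peak. It is not the authors' code: the histogram bin count used for the entropy estimate, the use of non-excess kurtosis, the three-bin neighbourhood summed around the spectral peak, and all function names are assumptions.

```python
import numpy as np


def time_domain_features(epoch, n_bins=16):
    """Kurtosis, skewness, and histogram-based Shannon entropy of one epoch."""
    x = np.asarray(epoch, dtype=float)
    z = (x - x.mean()) / x.std()
    counts, _ = np.histogram(x, bins=n_bins)  # n_bins is an assumed setting
    p = counts[counts > 0] / counts.sum()
    return {
        "kurtosis": float(np.mean(z ** 4)),  # non-excess: 3.0 for a Gaussian
        "skewness": float(np.mean(z ** 3)),
        "shannon_entropy": float(-np.sum(p * np.log2(p))),  # in bits
    }


def relative_peak_power(epoch, fs, band=(2.0 / 3.0, 10.0 / 3.0)):
    """Peak frequency in the HR band and its share of total spectral power."""
    x = np.asarray(epoch, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak = np.flatnonzero(in_band)[np.argmax(power[in_band])]
    # Sum the peak bin and its two neighbours to allow for spectral leakage.
    lo, hi = max(peak - 1, 1), min(peak + 2, len(power))
    return float(freqs[peak]), float(power[lo:hi].sum() / power[1:].sum())
```

A clean, strongly periodic epoch should give a relative peak power close to 1, whereas broadband noise spreads power across the spectrum and lowers the ratio.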
A total of 12 features were computed for each epoch. To optimize feature selection, we applied the minimum redundancy maximum relevance (mRMR) algorithm [70], which uses mutual information to identify features that maximize predictive relevance while minimizing inter-feature redundancy. A series of models was then trained iteratively by progressively including the top-ranked features, from the single highest-ranked feature up to the full feature set. To assess classification performance, we trained models using several classifiers: Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Naive Bayes (NB), and Logistic Regression. To evaluate the influence of epoch duration, separate models were trained for 5, 10, 20, and 30 s epochs. For each epoch length, participant data were segmented into minute-long intervals, with 80% used for training and 20% randomly sampled for testing.
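A greedy mRMR ranking can be sketched as follows. This is an assumption-laden illustration: it uses the difference form of the criterion (relevance minus mean redundancy), estimates mutual information from quantile-binned features, and all function names and the bin count are hypothetical. The text does not state which mRMR variant or estimator was used.

```python
import numpy as np


def discretize(x, n_bins=8):
    """Quantile-bin a continuous feature into integer codes 0..n_bins-1."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(x, edges)


def mutual_information(a, b):
    """Mutual information (nats) between two integer-coded variables."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))


def mrmr_rank(X, y, n_bins=8):
    """Return feature indices ordered by greedy mRMR (difference criterion)."""
    Xd = np.column_stack([discretize(X[:, j], n_bins) for j in range(X.shape[1])])
    y = np.asarray(y).astype(int)
    relevance = [mutual_information(Xd[:, j], y) for j in range(Xd.shape[1])]
    selected = [int(np.argmax(relevance))]
    remaining = set(range(Xd.shape[1])) - set(selected)
    while remaining:
        def score(j):
            redundancy = np.mean(
                [mutual_information(Xd[:, j], Xd[:, s]) for s in selected]
            )
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Given the ranking `order`, the nested feature subsets described above are simply `[order[:k] for k in range(1, len(order) + 1)]`, with one model trained per subset.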
Due to the labeling method, most epochs were classified as “acceptable”, resulting in a class imbalance. To mitigate this, we performed a grid search to optimize the misclassification cost ratio between the majority and minority classes (range: 1–10). For each model type and epoch length, the optimal number of features and class weight ratio were determined by maximizing the minimum of sensitivity and specificity using five-fold cross-validation.
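The selection criterion (maximizing the smaller of sensitivity and specificity) can be expressed compactly. The sketch below assumes that predictions pooled over the cross-validation folds are already available for every candidate configuration. The helper names, the dictionary layout, and the choice of which class is coded as positive are illustrative assumptions.

```python
import numpy as np


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity and specificity for a binary labeling."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    sensitivity = float(np.mean(y_pred[pos] == positive))
    specificity = float(np.mean(y_pred[~pos] != positive))
    return sensitivity, specificity


def select_configuration(candidates):
    """Pick the configuration with the largest min(sensitivity, specificity).

    `candidates` maps (n_features, cost_ratio) -> (y_true, y_pred), where the
    predictions are pooled over the cross-validation folds (assumed layout).
    """
    scores = {
        cfg: min(sensitivity_specificity(y_true, y_pred))
        for cfg, (y_true, y_pred) in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Unlike plain accuracy, this criterion cannot be satisfied by a classifier that labels every epoch with the majority class, because such a classifier scores zero on one of the two terms.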

2.2.2. Signal Reconstruction

Signal reconstruction was applied to all epochs classified as acceptable by the SQA model, using a targeted notch-filtering approach adapted from [71]. Unlike traditional Independent Component Analysis (ICA)-based denoising, which requires multiple PPG sources, this method is optimized for a single-channel wrist-worn PPG signal, such as that recorded by the Empatica E4. While adaptive filtering and wavelet decomposition are widely used for noise suppression, they typically depend on the availability of a reference signal correlated with motion artifacts. When this assumption is violated, such methods risk attenuating physiologically meaningful components of the PPG waveform. Motion-based denoising strategies can be effective during high-intensity exercise, where motion dominates the noise spectrum, but are less suitable for passive dynamic environments like vehicle rides, where artifact sources are more heterogeneous and often uncorrelated with gross movement.
Accordingly, we adopted a targeted frequency-domain reconstruction method that leverages the reliability of the estimated heart rate frequency (HRF) and its second harmonic. The raw PPG signal was passed through a cascade of notch filters centered at the dominant HRF and its second harmonic. Each notch filter was implemented as a 4th-order Butterworth band-stop IIR filter with half-power frequencies set at ±20% around its center frequency (the estimated HRF or its second harmonic). At the 64 Hz sampling rate of the E4 device, this configuration yielded rejection bands extending approximately 0.2–0.4 Hz on either side of the HRF for typical adult HR ranges (1–2 Hz), effectively attenuating motion-related peaks near the HRF while preserving the surrounding physiological content. The filtered signal, representing noise localized to the HRF and harmonic bands, was subtracted from the raw PPG waveform to reconstruct the cleaned signal. 
This approach preserves physiologically relevant components while suppressing confounding, in-band noise not associated with the dominant cardiac rhythm. Figure 4 illustrates representative PPG waveforms before and after this reconstruction procedure.
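One possible reading of the reconstruction step is sketched below with SciPy. Several choices are assumptions rather than details taken from the text: zero-phase filtering with `sosfiltfilt` (which applies each filter twice and therefore deepens the notch), the interpretation of "4th-order" as `butter(2, ..., btype="bandstop")` (SciPy doubles the prototype order for band-stop designs), and the function name `reconstruct_ppg`.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def reconstruct_ppg(raw, fs, hrf_hz, rel_halfwidth=0.20):
    """Subtract the notch-filtered signal from the raw PPG.

    The raw signal is passed through band-stop filters centred on the HRF and
    its second harmonic; what survives the cascade is subtracted from the raw
    signal, leaving the content inside the two rejection bands.
    """
    raw = np.asarray(raw, dtype=float)
    filtered = raw.copy()
    for centre in (hrf_hz, 2.0 * hrf_hz):
        lo = centre * (1.0 - rel_halfwidth)
        hi = centre * (1.0 + rel_halfwidth)
        # Order-2 prototype -> 4th-order band-stop section cascade.
        sos = butter(2, [lo, hi], btype="bandstop", fs=fs, output="sos")
        filtered = sosfiltfilt(sos, filtered)  # zero-phase (assumed)
    return raw - filtered
```

For an estimated HRF of 1.2 Hz (72 bpm), the two rejection bands under these settings span 0.96–1.44 Hz and 1.92–2.88 Hz, so the output retains the fundamental pulse component and its second harmonic while slow drift and out-of-band disturbances are removed.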

2.2.3. Fiducial Point Detection and Evaluation

Following signal thresholding and reconstruction, we applied two fiducial point detection methods to identify pulse peaks and onsets in the PPG time series. The first method, Multi-Scale Peak and Trough Detection (MSPTD), is widely used in pulse detection research. MSPTD segments the signal, generates a local maxima scalogram across multiple temporal scales, identifies the most relevant scale for pulse detection, and refines the peak and onset locations within a narrow time window [72]. This method has been shown to outperform several competing algorithms across diverse datasets in previous studies (see Section 1).
In addition to MSPTD, we employed a method developed at the University of Iowa’s Driving Safety Research Institute, named Multivariate Normal Density Estimation and Imputation (MNDEI). Beginning with the E4 IBI data, a multivariate normal probability density function was built from three variables: the BVP value and its second derivative at the identified IBI points, and the standard deviation of the BVP in a one-second window around each IBI point. To better approximate normal distributions, the logarithm of all three variables was used. This density function was then used to detect other BVP points that fell within its distribution. Further refinement was achieved by (a) removing abnormally small IBIs and abnormally isolated pulses, (b) imputing missing IBI samples using available local BVP minima, and (c) repeating the removal of abnormally small IBIs and isolated pulses.
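The density-estimation idea behind MNDEI can be illustrated with a deliberately simplified sketch. It covers only the fitting and screening step, not the removal and imputation passes, and it is not the authors' implementation. It assumes that the three features are strictly positive so that logarithms are defined, and it uses a Mahalanobis-distance cutoff (the 99th percentile of a chi-square distribution with three degrees of freedom, about 11.34) as a stand-in for "falling within the distribution". All names and the cutoff are hypothetical.

```python
import numpy as np


def fit_log_mvn(reference_features):
    """Fit a multivariate normal to the log of (n_points, 3) positive features."""
    logf = np.log(np.asarray(reference_features, dtype=float))
    return logf.mean(axis=0), np.cov(logf, rowvar=False)


def mahalanobis_sq(features, mean, cov):
    """Squared Mahalanobis distance of log-features from the fitted density."""
    diff = np.log(np.asarray(features, dtype=float)) - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)


def within_distribution(features, mean, cov, threshold=11.34):
    """Flag candidate points that are plausible under the fitted density.

    threshold ~ chi-square(3 dof) 99th percentile; an assumed cutoff.
    """
    return mahalanobis_sq(features, mean, cov) <= threshold
```

In this sketch the reference features would come from BVP samples at the device-reported IBI points, and the candidates from other BVP samples that might be undetected pulses.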
To evaluate how different combinations of processing methods affect the accuracy of BVP-derived HR and HRV features, we conducted a comparative analysis across six configurations:
(1) Raw signal + MSPTD.
(2) Raw signal + MNDEI.
(3) SQ-threshold signal + MSPTD.
(4) SQ-threshold signal + MNDEI.
(5) SQ-threshold signal + reconstruction + MSPTD.
(6) SQ-threshold signal + reconstruction + MNDEI.
For each condition, we computed heart rate and a standard set of time-domain and frequency-domain HRV features from both ECG and PPG signals. ECG-derived HRV was treated as the ground truth reference. HRV metrics were calculated using the PhysioNet Cardiovascular Signal Toolbox [73], with a 5 min window and 30 s step size applied across the trial data. As per prior recommendations, ECG-derived epochs with more than 15% missing data were excluded from analysis, particularly because high-frequency features are sensitive to data gaps [74]. In total, seven trials were excluded due to insufficient ECG coverage, resulting in 2612 valid ECG epochs. For PPG-derived epochs, a 50% missing-data threshold was adopted, because under stricter thresholds (e.g., 15% or 25%) no processing pipeline achieved more than 40% epoch coverage, which would have severely limited the analysis. To evaluate the agreement between ECG- and PPG-derived metrics, we used Pearson correlation to assess linear relationships, Cliff’s δ to assess distributional effect sizes and agreement, and Bland–Altman analysis to evaluate the bias and limits of agreement (LOA). These analyses provide insight into the relative performance of different fiducial point detection and signal processing pipelines under dynamic, in-vehicle conditions.
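The windowing and missing-data rule can be illustrated with a short sketch that computes two time-domain features (SDNN and RMSSD) per window. This is not the PhysioNet Cardiovascular Signal Toolbox, which is a MATLAB package. The definition of missing data as the fraction of the window not covered by valid inter-beat intervals, the function names, and the minimum of three beats per window are assumptions.

```python
import numpy as np


def sdnn_ms(ibi_s):
    """Standard deviation of inter-beat intervals, in milliseconds."""
    return 1000.0 * float(np.std(ibi_s, ddof=1))


def rmssd_ms(ibi_s):
    """Root mean square of successive IBI differences, in milliseconds."""
    return 1000.0 * float(np.sqrt(np.mean(np.diff(ibi_s) ** 2)))


def windowed_hrv(beat_times_s, ibi_s, duration_s,
                 win_s=300.0, step_s=30.0, max_missing=0.50):
    """Slide a 5 min window in 30 s steps; keep windows with enough coverage."""
    beat_times_s = np.asarray(beat_times_s, dtype=float)
    ibi_s = np.asarray(ibi_s, dtype=float)
    rows = []
    start = 0.0
    while start + win_s <= duration_s:
        mask = (beat_times_s >= start) & (beat_times_s < start + win_s)
        # Missing fraction: share of the window not covered by valid IBIs.
        missing = 1.0 - ibi_s[mask].sum() / win_s
        if mask.sum() >= 3 and missing <= max_missing:
            rows.append((start, sdnn_ms(ibi_s[mask]), rmssd_ms(ibi_s[mask])))
        start += step_s
    return rows
```

Setting `max_missing=0.15` reproduces the stricter ECG criterion in this sketch. Note that this simplified RMSSD differences across gaps between non-consecutive beats, which a production implementation would need to handle explicitly.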

3. Results

3.1. Signal Quality Assessment Classification Model: Performance

A detailed summary of the cross-validation results for the SQA classification models is presented in Table 4. Among the evaluated models, the classifier trained on 10 s epochs outperformed those trained on shorter (5 s) or longer (20 s and 30 s) time windows. This model achieved the highest balance between accuracy, specificity, and sensitivity. The best performance was observed using an SVM classifier with eight features selected via mRMR and a misclassification cost ratio of 3, resulting in a minimum specificity and sensitivity of 0.884 during five-fold cross-validation. This configuration was selected as the final model and subsequently evaluated on the 20% participant-level holdout test set. On this unseen data, the model achieved an overall accuracy of 0.888, a specificity of 0.892, a sensitivity of 0.887, and an AUC of 0.960. These results demonstrate the model’s robustness and generalizability in distinguishing usable versus unusable PPG epochs for subsequent analysis.

3.2. Comparison of Signal Processing Methods for PRV Features

3.2.1. Percentage of Valid Epochs

Figure 5 presents a violin plot illustrating the distribution of the percentage of valid epochs retained for HRV analysis across trials and processing methods. This analysis quantifies the proportion of epochs that yielded usable HRV data, offering insight into the data retention trade-offs for each method. The Empatica E4’s native HR output, provided at 1 Hz resolution, is smoothed and continuous, regardless of the confidence indicator in the output. For this signal, the average heart rate was computed as the cumulative mean of the E4 HR output within each epoch. No HRV features were derived from the native E4 HR because of its smoothed, preprocessed nature.
For all other methods, including those employing fiducial point detection algorithms (e.g., MSPTD and MNDEI), both HR and HRV metrics were computed, and only epochs with ≤50% missing data were retained for analysis. The E4 IBI output, used without manual cleaning, showed poor data retention: in 50% of trials, fewer than 12.5% of epochs were considered valid.
The use of the MNDEI method significantly improved epoch inclusion, while MSPTD yielded the highest number of valid epochs across trials. As expected, the application of SQ thresholding reduced the number of retained epochs, as its purpose is to exclude segments that do not provide reliable pulse information. However, the inclusion of signal reconstruction following SQ thresholding led to modest recovery in valid epoch count, as the cleaned signals enabled pulse detection in previously borderline or noisy segments. Overall, the combined application of SQ thresholding and signal reconstruction resulted in stable performance, with a median of approximately 75% valid epochs retained across trials.

3.2.2. Pearson Correlation

Using the accepted epochs, as defined in the prior section, Figure 6 presents the Pearson correlation coefficients between ECG-derived HR and HRV metrics and their corresponding PPG-derived PR and PRV metrics, computed across the different processing pipelines. A consistently strong correlation in HR is observed for all processing methods, exceeding the correlation observed for the native E4 HR output. Despite the low percentage of valid epochs noted previously, the E4 IBI output shows strong correlations (r > 0.8) with most HRV features, except for RMSSD, which appears to be more sensitive to signal noise and preprocessing variability. The MSPTD method, although associated with a high valid epoch percentage, shows weak correlations (r ≤ 0.3) for frequency-domain HRV metrics (VLF, LF, HF) and moderate correlations (r ≤ 0.5) for time-domain metrics (SDNN, RMSSD) and ULF. The application of SQ thresholding prior to MSPTD slightly improves correlations, while the combined application of SQ thresholding and reconstruction further enhances performance. Similar trends are observed with MNDEI. When used without preprocessing, MNDEI achieves modest correlations; however, when paired with SQ thresholding and reconstruction, it demonstrates the strongest agreement with ECG-derived metrics among the processing methods. Specifically, this full method (SQ thresholding + reconstruction + MNDEI) yields the strongest correlations for all HRV variables except RMSSD and HF, where results remain moderate but improved compared to other methods.

3.2.3. Cliff’s δ Effect Size

Figure 7 presents Cliff’s δ effect sizes, quantifying the degree of distributional non-overlap between PPG-derived metrics and ECG-derived ground truth across different signal processing methods. This nonparametric measure complements Pearson correlation by evaluating the magnitude and direction of bias in estimated HRV metrics. Among all the variables examined, RMSSD and HF power are particularly prone to systematic overestimation by PPG-derived measures, with all methods, including E4 IBI output, yielding large effect sizes (δ ≥ 0.474). SDNN and LF also exhibit overestimation, though to a more moderate degree. In contrast, ULF and VLF show a tendency toward underestimation, depending on the processing method. For all HRV variables except RMSSD and HF, the magnitude of distributional divergence is reduced when applying SQ thresholding and reconstruction, suggesting these steps improve agreement with ECG. However, for RMSSD and HF, even the most refined processing methods fail to reduce the effect size below the large non-overlap threshold, highlighting the persistent difficulty in recovering these parasympathetic-sensitive metrics from wrist-worn PPG.
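For reference, Cliff’s δ is the difference between the proportion of pairs in which one sample exceeds the other and the proportion in which it falls below. The sketch below is a direct pairwise implementation; the function name and argument order are assumptions made for illustration.

```python
import numpy as np


def cliffs_delta(ppg_values, ecg_values):
    """Cliff's delta: P(ppg > ecg) - P(ppg < ecg) over all cross pairs.

    Positive values indicate that PPG-derived values tend to exceed
    ECG-derived values (overestimation); negative values, the reverse.
    """
    x = np.asarray(ppg_values, dtype=float)[:, None]
    y = np.asarray(ecg_values, dtype=float)[None, :]
    greater = np.sum(x > y)
    smaller = np.sum(x < y)
    return float((greater - smaller) / (x.size * y.size))
```

A value of +1 means every PPG-derived value exceeds every ECG-derived value, 0 means the two distributions overlap symmetrically, and |δ| ≥ 0.474 corresponds to the large-effect threshold quoted above.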

3.2.4. Bland–Altman Analysis

A Bland–Altman analysis was conducted to assess the agreement between PPG- and ECG-derived HRV features. Prior to analysis, the Kolmogorov–Smirnov test was applied to all feature distributions, confirming that none of the HRV variables followed a normal distribution. As a result, bias was calculated using the median difference between PPG-derived PRV and ECG-derived HRV values. The limits of agreement (LOAs) were defined as the 2.5th and 97.5th percentiles of the paired differences. The resulting bias and LOAs for each processing method are presented in Figure 8. Among the manual processing methods, both MSPTD and MNDEI demonstrated less median bias when SQ thresholding and reconstruction were applied. MSPTD combined with SQ control also led to a narrower spread in bias, indicating improved precision. Notably, when MNDEI was combined with both SQ thresholding and signal reconstruction, the resulting bias values were comparable to those observed for the E4 IBI output, while still retaining substantially more valid epochs per trial. This result suggests that preprocessing steps not only improve agreement with ground truth but also preserve a higher volume of analyzable data, which is critical for in-field applications.
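The nonparametric variant of the Bland–Altman statistics described above reduces to a median and two percentiles of the paired differences. The sketch below works in the raw units of the feature; the function name is an assumption, and any normalization (for example, expressing bias as a percentage) would be an additional step not shown here.

```python
import numpy as np


def nonparametric_bland_altman(ppg_values, ecg_values):
    """Median bias and 2.5th/97.5th percentile limits of agreement."""
    diff = np.asarray(ppg_values, dtype=float) - np.asarray(ecg_values, dtype=float)
    bias = float(np.median(diff))
    lower, upper = np.percentile(diff, [2.5, 97.5])
    return bias, float(lower), float(upper)
```

Because the limits are empirical percentiles, they do not assume normally distributed differences, which matches the outcome of the Kolmogorov–Smirnov test reported above.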

3.2.5. Summary of Signal Processing Methods for PRV Features

To facilitate direct comparison across processing approaches, we summarize the average performance of each pipeline in Table 5. The table reports the percentage of valid epochs retained, Pearson correlation coefficients, Cliff’s δ effect sizes, and median bias values for both HR and HRV metrics relative to ECG-derived ground truth. The results show that while manufacturer-processed E4 IBI data yield the highest HR correlation, they capture fewer valid epochs (22.7%). Manual pipelines such as MSPTD and MNDEI substantially improve data retention but introduce greater HRV bias. Incorporating the signal-quality assessment (SQA) threshold and spectral reconstruction increases HRV correlation and reduces bias, most notably for the SQA Recon MNDEI method, which achieved a Pearson r = 0.79 and reduced bias to 41.66% while maintaining approximately 70% valid data. These outcomes highlight the advantage of coupling lightweight SQA filtering and reconstruction steps to enhance PPG reliability under dynamic, in-vehicle conditions.

4. Discussion

This study evaluated the accuracy and limitations of PPG for estimating HR and HRV under dynamic conditions, using ECG-derived metrics as the gold standard. While PPG-based sensors are increasingly favored in wearable technology due to their comfort, affordability, and ease of integration, our findings reveal that their reliability as a surrogate for ECG varies substantially across cardiovascular features and environmental context.
Our results demonstrate that PPG can reliably estimate PR in dynamic in-vehicle conditions, irrespective of the signal processing pipeline employed. In contrast, PRV, a substitute for ECG-derived HRV, exhibits considerably lower consistency and accuracy. Although some time-domain and frequency-domain PRV metrics showed moderate correlation with their ECG-derived counterparts, high-frequency metrics, such as RMSSD and HF power, were notably unreliable and prone to overestimation.
The discrepancies between PRV and HRV stem from both technical and physiological factors. Technically, motion artifacts and ambient noise distort the PPG waveform, particularly in mobile environments, impeding accurate detection of fiducial points. Physiologically, pulse transit time delays between the heart and peripheral measurement sites (e.g., wrist) introduce additional variability absent in ECG-derived signals. Moreover, vascular dynamics and local autonomic responses at the peripheral site can further distort high-frequency HRV components. These factors collectively contribute to significant deviations in PRV, particularly in metrics sensitive to short-term autonomic fluctuations.
A key challenge identified in this study is the trade-off between data quantity and signal quality. Retaining more epochs increases data availability but also raises the risk of including noisy or artifact-laden segments. To address this, we applied SQ thresholding and frequency-domain signal reconstruction. These methods improved PRV extraction by filtering out unreliable epochs and recovering usable signal components. Applying the SQ classifier alone, before any reconstruction, excluded motion-contaminated epochs and thereby reduced the proportion of valid epochs from roughly 95% (MSPTD on the raw signal) to 67%, yet substantially improved HRV accuracy: the mean Pearson r increased from 0.31 to 0.53 and Cliff’s δ decreased from 0.58 to 0.42. When frequency-domain reconstruction was subsequently applied, the valid-epoch percentage recovered to 70% and the HRV correlation further improved to r = 0.57 with δ = 0.29, roughly halving the average bias (from 110.83% to 55.90%). A similar pattern held for the MNDEI pipeline, where the full SQA + reconstruction configuration yielded the strongest HRV agreement (r = 0.79, δ = 0.21) while maintaining ~70% of analyzable epochs. These results demonstrate that lightweight quality control and spectral reconstruction substantially mitigate motion artifacts and enhance reliability without substantially increasing computational cost. However, even the most effective processing methods could not fully eliminate motion-induced noise. Moreover, aggressive thresholding, while improving precision, significantly reduced the number of usable epochs, particularly for HRV analysis. This highlights the delicate balance between accuracy and data coverage in PPG signal processing.
Our findings advance the literature by demonstrating that a lightweight, single-channel pipeline that combines feature-based signal-quality filtering, narrow-band spectral reconstruction, and robust peak detection (MNDEI) can substantially narrow the longstanding performance gap between wrist-worn PPG and gold-standard ECG in mobile, real-world contexts. Previous studies have consistently shown that while pulse rate estimates from PPG align well with ECG under resting conditions, PRV performance deteriorates markedly even with mild movement or speech. In such dynamic conditions, PRV correlations with ECG-derived HRV often fall below 0.4, with data-loss rates exceeding 60% during ambulatory tasks or treadmill exercise [41,56,57,58,59]. Recent advances in motion compensation, such as multi-sensor adaptive filtering [26,27,28,29,31] and deep-learning generative models [34,35], have improved PRV estimation but typically incur significant trade-offs: they require additional data channels, depend on large labeled datasets for training, and often lack interpretability. Moreover, many methods fail to define objective thresholds for unrecoverable signals, allowing poor-quality segments to persist after post-processing. In contrast, our combined SQ-threshold + reconstruction approach retained a median of ~75% of epochs as usable, while elevating PRV correlations for time-domain and low-frequency HRV metrics into the moderate-to-strong range (r = 0.6–0.8). Importantly, this was accomplished without increasing bias, which remained comparable to that of Empatica’s processed IBI stream. By replacing subjective manual annotation [36,37,38] with objective, ECG-derived ground-truth labeling, our approach eliminates annotator inconsistency in SQA labeling and enables reliable, reproducible filtering of low-quality data.
These findings underscore that well-calibrated, interpretable preprocessing pipelines can offer meaningful performance gains without the computational overhead of black-box or multi-sensor solutions. As such, this work enhances the practical utility of wrist-based PPG for estimating cardiovascular metrics in in-vehicle and other ecologically valid, dynamic environments.
These findings have important implications for the application of PPG in wearable health monitoring, particularly in field contexts. It is important to note, however, that the present study was conducted under predominantly cyclical longitudinal accelerations, with only a single controlled lateral maneuver per trial. As such, the motion profiles differ from those encountered in more naturalistic driving, which typically includes frequent turns, variable terrain, and passenger interactions. This may limit the generalizability of the current results. Future studies should incorporate a broader range of real-world driving dynamics to better characterize motion-induced artifacts and assess the robustness of PPG processing pipelines under more diverse conditions. Continued work is needed to develop more robust noise-reduction strategies, including context-aware filtering, sensor fusion with accelerometry, deep learning-based denoising, and hardware innovations aimed at improving signal fidelity during motion and in-vehicle operation.
Finally, our study underscores the need for standardized protocols for PPG acquisition and processing for dynamic conditions, such as in-vehicle environments. Variability in device placement, sensor specifications, and preprocessing methods remains a significant barrier to cross-study comparability and broad clinical adoption. Establishing community-accepted benchmarks and open datasets will be essential for advancing the reproducibility and reliability of PPG-based physiological monitoring.

5. Limitations

This study used the Empatica E4 wristband as the source of all PPG recordings. Although widely adopted in prior research, the E4 and its software suite were discontinued in February 2025, and several evaluations have noted limitations in its PPG module, including reduced accuracy during movement, sensitivity to environmental noise, and relatively low sampling rates (64 Hz for PPG; 32 Hz for accelerometry). These constraints may have affected the fidelity of pulse-derived features in this study, particularly high-frequency HRV metrics that depend on precise waveform morphology. While the discontinuation does not affect the integrity of the present dataset or the performance gains achieved when applying the proposed pipeline to raw E4 PPG output, it may limit direct replication. Differences across newer wrist-worn PPG devices, such as optical design, sampling rate, and firmware, should be considered when applying or extending the present signal-quality assessment and reconstruction pipeline.

6. Conclusions

In conclusion, PPG remains a valuable tool for comfortable and low-cost heart rate monitoring, but its use as a reliable surrogate for HRV, particularly in dynamic environments, is still limited. Our results show that while signal processing improvements can enhance HR estimates and modestly improve PRV accuracy, substantial challenges remain for high-frequency HRV metrics. Future advancements in signal quality assessment, reconstruction, and denoising, coupled with standardization efforts, are critical to unlocking the full potential of PPG for real-world, wearable health applications.

Author Contributions

Conceptualization, R.G. and M.L.H.J.; methodology, R.G., C.S.M., and C.W.S.; software, R.G. and C.W.S.; validation, R.G.; formal analysis, R.G.; investigation, R.G.; resources, M.L.H.J.; data curation, C.S.M.; writing—original draft preparation, R.G.; writing—review and editing, R.G., C.S.M., B.T.W.L., C.W.S., and M.L.H.J.; visualization, R.G.; supervision, B.T.W.L. and M.L.H.J.; funding acquisition, M.L.H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This project was funded by the Ford–University of Michigan Alliance.

Institutional Review Board Statement

The study protocol was reviewed and approved by the University of Michigan Institutional Review Board for Health Behavior and Health Sciences (HUM00206184).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

We gratefully acknowledge the talented and dedicated staff and students at the University of Michigan Transportation Research Institute (UMTRI) who played a key role in the success of this project, including those who led the design and installation of the instrumentation and data acquisition system, as well as the team responsible for data collection.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Kusumoto, F. ECG Interpretation: From Pathophysiology to Clinical Application; Springer Nature: New York, NY, USA, 2020.
  2. Immanuel, S.; Teferra, M.N.; Baumert, M.; Bidargaddi, N. Heart rate variability for evaluating psychological stress changes in healthy adults: A scoping review. Neuropsychobiology 2023, 82, 187–202.
  3. Lu, K.; Dahlman, A.S.; Karlsson, J.; Candefjord, S. Detecting driver fatigue using heart rate variability: A systematic review. Accid. Anal. Prev. 2022, 178, 106830.
  4. Su, H.; Qing, X.; He, X.; Wang, J.; Cai, H.; Wang, X.; Xu, L.; Jiang, J.; Xie, R.; Fan, S.; et al. Field Experiment on Wearable EEG/PPG-based Monitoring of Mental Fatigue in Drillers: Impact of Prolonged Working Hours and Night Shifts. SSRN J. 2024.
  5. Peng, Y.; Zhou, J.; Fan, C.; Wu, Z.; Zhou, W.; Sun, D.; Lin, Y.; Xu, D.; Xu, Q. A review of passenger ride comfort in railway: Assessment and improvement method. Transp. Saf. Environ. 2022, 4, tdac016.
  6. Böttcher, S.; Vieluf, S.; Bruno, E.; Joseph, B.; Epitashvili, N.; Biondi, A.; Zabler, N.; Glasstetter, M.; Dümpelmann, M.; Van Laerhoven, K.; et al. Data quality evaluation in wearable monitoring. Sci. Rep. 2022, 12, 21412.
  7. Koerber, D.; Khan, S.; Shamsheri, T.; Kirubarajan, A.; Mehta, S. Accuracy of Heart Rate Measurement with Wrist-Worn Wearable Devices in Various Skin Tones: A Systematic Review. J. Racial Ethn. Health Disparities 2023, 10, 2676–2684.
  8. D’Acquisto, L.; Scardulla, F.; Montinaro, N.; Pasta, S.; Zangla, D.; Bellavia, D. A preliminary investigation of the effect of contact pressure on the accuracy of heart rate monitoring by wearable PPG wrist band. In Proceedings of the 2019 II Workshop on Metrology for Industry 4.0 and IoT (MetroInd4.0&IoT), Naples, Italy, 4–6 June 2019; pp. 334–338.
  9. Charlton, P.H.; Pilt, K.; Kyriacou, P.A. Establishing best practices in photoplethysmography signal acquisition and processing. Physiol. Meas. 2022, 43, 050301.
  10. Wang, K.; Tan, B.; Wang, X.; Qiu, S.; Zhang, Q.; Wang, S.; Yen, Y.-T.; Jing, N.; Liu, C.; Chen, X.; et al. Machine Learning-Assisted Point-of-Care Diagnostics for Cardiovascular Healthcare. Bioeng. Transl. Med. 2025, 10, e70002.
  11. Grant, A.O. Cardiac Ion Channels. Circ. Arrhythm. Electrophysiol. 2009, 2, 185–194.
  12. Deserno, T.; Marx, N. Computational Electrocardiography: Revisiting Holter ECG Monitoring. Methods Inf. Med. 2016, 55, 305–311.
  13. Vavrinsky, E.; Subjak, J.; Donoval, M.; Wagner, A.; Zavodnik, T.; Svobodova, H. Application of modern multi-sensor holter in diagnosis and treatment. Sensors 2020, 20, 2663.
  14. Hirokawa, J.; Hitosugi, T.; Miki, Y.; Miki, Y.; Tsukamoto, M.; Yamasaki, F.; Kawakubo, Y.; Yokoyama, T. The influence of electrocardiogram (ECG) filters on the heights of R and T waves in children. Sci. Rep. 2022, 12, 13279.
  15. Kligfield, P.; Gettes, L.S.; Bailey, J.J.; Childers, R.; Deal, B.J.; Hancock, E.W.; Van Herpen, G.; Kors, J.A.; Macfarlane, P.; Mirvis, D.M.; et al. Recommendations for the Standardization and Interpretation of the Electrocardiogram: Part I: The Electrocardiogram and Its Technology: A Scientific Statement From the American Heart Association Electrocardiography and Arrhythmias Committee, Council on Clinical Cardiology; the American College of Cardiology Foundation; and the Heart Rhythm Society: Endorsed by the International Society for Computerized Electrocardiology. Circulation 2007, 115, 1306–1324.
  16. Pinna, G.D.; Maestri, R.; Di Cesare, A.; Colombo, R.; Minuco, G. The accuracy of power-spectrum analysis of heart-rate variability from annotated RR lists generated by Holter systems. Physiol. Meas. 1994, 15, 163.
  17. Luo, S.; Johnston, P. A review of electrocardiogram filtering. J. Electrocardiol. 2010, 43, 486–496.
  18. Nakagawa, M.; Tsunemitsu, C.; Katoh, S.; Kamiyama, Y.; Sano, N.; Ezaki, K.; Miyazaki, H.; Teshima, Y.; Yufu, K.; Takahashi, N.; et al. Effect of ECG filter settings on J-waves. J. Electrocardiol. 2014, 47, 7–11.
  19. Fariha, M.A.Z.; Ikeura, R.; Hayakawa, S.; Tsutsumi, S. Analysis of Pan-Tompkins algorithm performance with noisy ECG signals. J. Phys. Conf. Ser. 2020, 1532, 012022.
  20. Jovic, A.; Bogunovic, N. Electrocardiogram analysis using a combination of statistical, geometric, and nonlinear heart rate variability features. Artif. Intell. Med. 2011, 51, 175–186.
  21. Laborde, S.; Mosley, E.; Thayer, J.F. Heart Rate Variability and Cardiac Vagal Tone in Psychophysiological Research—Recommendations for Experiment Planning, Data Analysis, and Data Reporting. Front. Psychol. 2017, 8, 213.
  22. Park, J.; Seok, H.S.; Kim, S.-S.; Shin, H. Photoplethysmogram Analysis and Applications: An Integrative Review. Front. Physiol. 2022, 12, 808451.
  23. Castaneda, D.; Esparza, A.; Ghamari, M.; Soltanpur, C.; Nazeran, H. A review on wearable photoplethysmography sensors and their potential future applications in health care. Int. J. Biosens. Bioelectron. 2018, 4, 195.
  24. Hartmann, V.; Liu, H.; Chen, F.; Qiu, Q.; Hughes, S.; Zheng, D. Quantitative comparison of photoplethysmographic waveform characteristics: Effect of measurement site. Front. Physiol. 2019, 10, 198.
  25. Fine, J.; Branan, K.L.; Rodriguez, A.J.; Boonya-Ananta, T.; Ajmal; Ramella-Roman, J.C.; McShane, M.J.; Cote, G.L. Sources of inaccuracy in photoplethysmography for continuous cardiovascular monitoring. Biosensors 2021, 11, 126.
  26. Pan, H.; Temel, D.; AlRegib, G. HeartBEAT: Heart beat estimation through adaptive tracking. In Proceedings of the 2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), Las Vegas, NV, USA, 24–27 February 2016; pp. 587–590.
  27. Islam, M.T.; Tanvir Ahmed, S.; Zabir, I.; Shahnaz, C.; Fattah, S.A. Cascade and parallel combination (CPC) of adaptive filters for estimating heart rate during intensive physical exercise from photoplethysmographic signal. Healthc. Technol. Lett. 2018, 5, 18–24.
  28. Ram, M.R.; Madhav, K.V.; Krishna, E.H.; Komalla, N.R.; Reddy, K.A. A novel approach for motion artifact reduction in PPG signals based on AS-LMS adaptive filter. IEEE Trans. Instrum. Meas. 2011, 61, 1445–1457.
  29. Lee, J.; Kim, M.; Park, H.-K.; Kim, I.Y. Motion artifact reduction in wearable photoplethysmography based on multi-channel sensors with multiple wavelengths. Sensors 2020, 20, 1493.
  30. Islam, M.S.; Shifat-E-Rabbi, M.; Dobaie, A.M.A.; Hasan, M.K. PREHEAT: Precision heart rate monitoring from intense motion artifact corrupted PPG signals using constrained RLS and wavelets. Biomed. Signal Process. Control 2017, 38, 212–223.
  31. Ye, Y.; Cheng, Y.; He, W.; Hou, M.; Zhang, Z. Combining nonlinear adaptive filtering and signal decomposition for motion artifact removal in wearable photoplethysmography. IEEE Sens. J. 2016, 16, 7133–7141.
  32. Rao, B.V.; Krishna, E.H.; Reddy, K.A. Wavelet transform generated inherent noise reference for adaptive filtering to de-noise pulse oximeter signals. Serbian J. Electr. Eng. 2024, 21, 251–273.
  33. Wan, C.; Chen, D.; Yang, J.; Huang, M. Combining parallel adaptive filtering and wavelet threshold denoising for photoplethysmography-based pulse rate monitoring during intensive physical exercise. IEICE Trans. Inf. Syst. 2020, 103, 612–620.
  34. Afandizadeh Zargari, A.H.; Aqajari, S.A.H.; Khodabandeh, H.; Rahmani, A.; Kurdahi, F. An Accurate Non-accelerometer-based PPG Motion Artifact Removal Technique using CycleGAN. ACM Trans. Comput. Healthc. 2023, 4, 1–14. [Google Scholar] [CrossRef]
  35. Shin, H. Deep convolutional neural network-based signal quality assessment for photoplethysmogram. Comput. Biol. Med. 2022, 145, 105430. [Google Scholar] [CrossRef] [PubMed]
  36. Orphanidou, C. Quality Assessment for the Photoplethysmogram (PPG). In Signal Quality Assessment in Physiological Monitoring; Springer International Publishing: Cham, Switzerland, 2018; pp. 41–63. [Google Scholar] [CrossRef]
  37. Mohagheghian, F.; Han, D.; Peitzsch, A.; Nishita, N.; Ding, E.; Dickson, E.L.; DiMezza, D.; Otabil, E.M.; Noorishirazi, K.; Scott, J.; et al. Optimized signal quality assessment for photoplethysmogram signals using feature selection. IEEE Trans. Biomed. Eng. 2022, 69, 2982–2993. [Google Scholar] [CrossRef] [PubMed]
  38. Moscato, S.; Lo Giudice, S.; Massaro, G.; Chiari, L. Wrist photoplethysmography signal quality assessment for reliable heart rate estimate and morphological analysis. Sensors 2022, 22, 5831. [Google Scholar] [CrossRef]
  39. Mejia-Mejia, E.; Allen, J.; Budidha, K.; El-Hajj, C.; Kyriacou, P.A.; Charlton, P.H. Photoplethysmography signal processing and synthesis. In Photoplethysmography; Elsevier: Amsterdam, The Netherlands, 2022; pp. 69–146. [Google Scholar]
  40. Charlton, P.H.; Kotzen, K.; Mejía-Mejía, E.; Aston, P.J.; Budidha, K.; Mant, J.; Pettit, C.; Behar, J.A.; Kyriacou, P.A. Detecting beats in the photoplethysmogram: Benchmarking open-source algorithms. Physiol. Meas. 2022, 43, 085007. [Google Scholar] [CrossRef]
  41. Zhang, Z.; Pi, Z.; Liu, B. TROIKA: A General Framework for Heart Rate Monitoring Using Wrist-Type Photoplethysmographic Signals During Intensive Physical Exercise. IEEE Trans. Biomed. Eng. 2015, 62, 522–531. [Google Scholar] [CrossRef] [PubMed]
  42. Islam, M.T.; Zabir, I.; Ahamed, S.T.; Yasar, M.T.; Shahnaz, C.; Fattah, S.A. A time-frequency domain approach of heart rate estimation from photoplethysmographic (PPG) signal. Biomed. Signal Process. Control 2017, 36, 146–154. [Google Scholar] [CrossRef]
  43. Temko, A. Accurate heart rate monitoring during physical exercises using PPG. IEEE Trans. Biomed. Eng. 2017, 64, 2016–2024. [Google Scholar] [CrossRef]
  44. Ichimaru, Y.; Moody, G.B. MIT-BIH Polysomnographic Database, Version 1.0.0. Physionet. 1992. Available online: https://physionet.org/content/slpdb/1.0.0/ (accessed on 19 January 2025).
  45. Pimentel, M.; Johnson, A.E.W.; Charlton, P.; Clifton, D. BIDMC PPG and Respiration Dataset, Version 1.0.0. Physionet. 2018. Available online: https://physionet.org/content/bidmc/1.0.0/ (accessed on 19 January 2025).
  46. Karlen, W. CSL Pulse Oximetry Artifact Labels, Version 1. Borealis. 2021. Available online: https://borealisdata.ca/dataset.xhtml?persistentId=doi:10.5683/SP2/SJAKCB (accessed on 19 January 2025).
  47. Tan, C.W.; Bergmeir, C.; Petitjean, F.; Webb, G.I. IEEEPPG Dataset. 2020. Available online: https://zenodo.org/records/3902710 (accessed on 19 January 2025).
  48. Jarchi, D.; Casson, A.J. Description of a database containing wrist PPG signals recorded during physical exercise with both accelerometer and gyroscope measures of motion. Data 2016, 2, 1. [Google Scholar] [CrossRef]
  49. Karlen, W. CapnoBase IEEE TBME Respiratory Rate Benchmark. 2021. Available online: https://borealisdata.ca/dataset.xhtml?persistentId=doi:10.5683/SP2/NLB8IT (accessed on 19 January 2025).
  50. Moody, B.; Moody, G.; Villarroel, M.; Clifford, G.D.; Silva, I. MIMIC-III Waveform Database. 2017. Available online: https://physionet.org/content/mimic3wdb/1.0/ (accessed on 19 January 2025).
  51. Choi, A.; Shin, H. Photoplethysmography sampling frequency: Pilot assessment of how low can we go to analyze pulse rate variability with reliability? Physiol. Meas. 2017, 38, 586. [Google Scholar] [CrossRef]
  52. Ban, D.; Kwon, S. Movement noise cancellation in PPG signals. In Proceedings of the 2016 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 7–11 January 2016; pp. 47–48. [Google Scholar]
  53. Mejía-Mejía, E.; May, J.M.; Torres, R.; Kyriacou, P.A. Pulse rate variability in cardiovascular health: A review on its applications and relationship with heart rate variability. Physiol Meas 2020, 41, 07TR01. [Google Scholar] [CrossRef]
  54. Taoum, A.; Bisiaux, A.; Tilquin, F.; Le Guillou, Y.; Carrault, G. Validity of ultra-short-term hrv analysis using ppg—A preliminary study. Sensors 2022, 22, 7995. [Google Scholar] [CrossRef]
  55. Stuyck, H.; Dalla Costa, L.; Cleeremans, A.; Van den Bussche, E. Validity of the Empatica E4 wristband to estimate resting-state heart rate variability in a lab-based context. Int. J. Psychophysiol. 2022, 182, 105–118. [Google Scholar] [CrossRef]
  56. Menghini, L.; Gianfranchi, E.; Cellini, N.; Patron, E.; Tagliabue, M.; Sarlo, M. Stressing the accuracy: Wrist-worn wearable sensor validation over different conditions. Psychophysiology 2019, 56, e13441. [Google Scholar] [CrossRef]
  57. Milstein, N.; Gordon, I. Validating measures of electrodermal activity and heart rate variability derived from the empatica E4 utilized in research settings that involve interactive dyadic states. Front. Behav. Neurosci. 2020, 14, 148. [Google Scholar] [CrossRef]
  58. Schuurmans, A.A.; de Looff, P.; Nijhof, K.S.; Rosada, C.; Scholte, R.H.; Popma, A.; Otten, R. Validity of the Empatica E4 wristband to measure heart rate variability (HRV) parameters: A comparison to electrocardiography (ECG). J. Med. Syst. 2020, 44, 190. [Google Scholar] [CrossRef]
  59. Van Voorhees, E.E.; Dennis, P.A.; Watkins, L.L.; Patel, T.A.; Calhoun, P.S.; Dennis, M.F.; Beckham, J.C. Ambulatory heart rate variability monitoring: Comparisons between the Empatica E4 wristband and Holter electrocardiogram. Psychosom. Med. 2022, 84, 210–214. [Google Scholar] [CrossRef]
  60. Mejía-Mejía, E.; Budidha, K.; Abay, T.Y.; May, J.M.; Kyriacou, P.A. Heart rate variability (HRV) and pulse rate variability (PRV) for the assessment of autonomic responses. Front. Physiol. 2020, 11, 534985. [Google Scholar] [CrossRef] [PubMed]
  61. Yuda, E.; Shibata, M.; Ogata, Y.; Ueda, N.; Yambe, T.; Yoshizawa, M.; Hayano, J. Pulse rate variability: A new biomarker, not a surrogate for heart rate variability. J. Physiol. Anthropol. 2020, 39, 21. [Google Scholar] [CrossRef] [PubMed]
  62. Winslow, B.D.; Chadderdon, G.L.; Dechmerowski, S.J.; Jones, D.L.; Kalkstein, S.; Greene, J.L.; Gehrman, P. Development and clinical evaluation of an mHealth application for stress management. Front. Psychiatry 2016, 7, 130. [Google Scholar] [CrossRef] [PubMed]
  63. Krishnan, R.; Natarajan, B.; Warren, S. Two-stage approach for detection and reduction of motion artifacts in photoplethysmographic data. IEEE Trans. Biomed. Eng. 2010, 57, 1867–1876. [Google Scholar] [CrossRef]
  64. Selvaraj, N.; Mendelson, Y.; Shelley, K.H.; Silverman, D.G.; Chon, K.H. Statistical approach for the detection of motion/noise artifacts in Photoplethysmogram. In Proceedings of the 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, USA, 30 August–3 September 2011; pp. 4972–4975. [Google Scholar] [CrossRef]
  65. Dubey, H.; Kumaresan, R.; Mankodiya, K. Harmonic sum-based method for heart rate estimation using PPG signals affected with motion artifacts. J. Ambient. Intell. Humaniz. Comput. 2018, 9, 137–150. [Google Scholar] [CrossRef]
  66. Kim, S.; Im, S.; Park, T. Characterization of quadratic nonlinearity between motion artifact and acceleration data and its application to heartbeat rate estimation. Sensors 2017, 17, 1872. [Google Scholar] [CrossRef] [PubMed]
  67. Dao, D.; Salehizadeh, S.M.; Noh, Y.; Chong, J.W.; Cho, C.H.; McManus, D.; Darling, C.E.; Mendelson, Y.; Chon, K.H. A robust motion artifact detection algorithm for accurate detection of heart rates from photoplethysmographic signals using time–frequency spectral features. IEEE J. Biomed. Health Inform. 2016, 21, 1242–1253. [Google Scholar] [CrossRef] [PubMed]
  68. Krishnan, R.; Natarajan, B.; Warren, S. Analysis and detection of motion artifact in photoplethysmographic data using higher order statistics. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 613–616. [Google Scholar]
  69. Ahamed, S.T.; Islam, M.T. An efficient method for heart rate monitoring using wrist-type photoplethysmographic signals during intensive physical exercise. In Proceedings of the 2016 5th International Conference on Informatics, Electronics and Vision (ICIEV), Dhaka, Bangladesh, 13–14 May 2016; pp. 863–868. [Google Scholar]
  70. Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238. [Google Scholar] [CrossRef]
  71. Wang, M.; Li, Z.; Zhang, Q.; Wang, G. Removal of motion artifacts in photoplethysmograph sensors during intensive exercise for accurate heart rate calculation based on frequency estimation and notch filtering. Sensors 2019, 19, 3312. [Google Scholar] [CrossRef]
  72. Bishop, S.M.; Ercole, A. Multi-scale peak and trough detection optimised for periodic and quasi-periodic neuroscience data. In Intracranial Pressure & Neuromonitoring XVI; Springer: Cham, Switzerland, 2018; pp. 189–195. [Google Scholar]
  73. Vest, A.N.; Da Poian, G.; Li, Q.; Liu, C.; Nemati, S.; Shah, A.J.; Clifford, G.D. An open-source benchmarked toolbox for cardiovascular waveform and interval analysis. Physiol. Meas. 2018, 39, 105004. [Google Scholar] [CrossRef]
  74. Cajal, D.; Hernando, D.; Lázaro, J.; Laguna, P.; Gil, E.; Bailón, R. Effects of missing data on heart rate variability metrics. Sensors 2022, 22, 5774. [Google Scholar] [CrossRef]
Figure 1. Representative ECG and PPG-derived signals from a single participant over a 5-s window. Panels show (top to bottom): BioHarness ECG waveform with annotated fiducial points, E4 BVP signal with systolic and diastolic peaks, the first derivative of the BVP (dBVP/dt), and the second derivative of the BVP (d²BVP/dt²).
Figure 2. Data processing and evaluation flowchart. Empatica BVP-derived pulse rate (PR) estimates are compared with BioHarness ECG-derived heart rate (HR) estimates, with epochs showing <5 BPM error used to train the signal-quality classifier (green). PRV and HRV features (orange) are then extracted for method evaluation.
Figure 3. Procedures of the signal quality assessment (SQA) classification model. Left: Blood volume pulse (BVP) signal processing. Right: Accelerometer signal processing. Bottom: Extracted features from both modalities at different stages used as input to the SQA classifier. Blue elements denote features derived from the BVP signal, and pink elements denote features derived from the accelerometer signal. Lighter hues represent the original time-domain signals, and darker hues represent derived or frequency-domain representations.
Figure 4. An illustration of PPG signal reconstruction using spectral peak information. The first row shows the original PPG waveform in the time-domain (left) and frequency-domain (right). The second row presents the reconstructed signal, with only information corresponding to the heart rate frequency and its harmonics retained. Red plus signs indicate the detected systolic peaks (beat locations) in the reconstructed waveform.
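The reconstruction idea in Figure 4 can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it finds the dominant spectral peak inside an assumed heart-rate band of 0.5–3 Hz, keeps that bin and its first three harmonics (plus one neighbouring bin on each side), zeroes everything else, and inverts the transform. The function names, the band limits, the 64 Hz sampling rate, the number of harmonics, and the bin tolerance are all assumptions made for illustration. A naive DFT is used only to keep the sketch dependency-free; an FFT routine would normally replace it.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform; adequate for short demo windows."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def reconstruct_harmonics(x, fs, band=(0.5, 3.0), n_harmonics=3, tol_bins=1):
    """Keep the dominant in-band frequency and its harmonics, then invert."""
    n = len(x)
    spec = dft(x)
    df = fs / n                                    # frequency resolution (Hz)
    lo, hi = math.ceil(band[0] / df), math.floor(band[1] / df)
    k0 = max(range(lo, hi + 1), key=lambda k: abs(spec[k]))  # heart-rate bin
    keep = set()
    for h in range(1, n_harmonics + 1):
        for k in range(h * k0 - tol_bins, h * k0 + tol_bins + 1):
            if 0 < k < n // 2:
                keep.update((k, n - k))            # bin and its conjugate mirror
    # Inverse DFT over the retained bins only (all other bins are zeroed).
    recon = [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in keep).real / n
             for t in range(n)]
    return k0 * df, recon

# Synthetic 4-s window: 75 BPM pulse with one harmonic plus a 2 Hz motion component.
fs, n = 64.0, 256
t = [i / fs for i in range(n)]
cardiac = [math.sin(2 * math.pi * 1.25 * ti) + 0.5 * math.sin(2 * math.pi * 2.5 * ti)
           for ti in t]
noisy = [c + 0.4 * math.sin(2 * math.pi * 2.0 * ti) for c, ti in zip(cardiac, t)]

f0, recon = reconstruct_harmonics(noisy, fs)
max_err = max(abs(r - c) for r, c in zip(recon, cardiac))
```

In this synthetic case the motion component falls between the retained harmonic bins and is removed. When an artifact overlaps the heart-rate frequency or one of its harmonics, this kind of masking cannot separate the two.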
Figure 5. Violin plot illustrating the distribution of the percentage of epochs retained for HRV analysis across trials, grouped by processing method. For all methods except the E4 HR, only epochs where at least 50% of the IBIs are available are used for calculating heart rate and heart rate variability metrics. For the E4 HR method, only the mean heart rate is analyzed without IBI data, so all epochs are included regardless of IBI coverage.
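The 50% retention rule in Figure 5 can be illustrated as follows. This sketch assumes that "IBI coverage" means the summed duration of valid inter-beat intervals divided by the epoch length; the caption does not define the term, so that definition, the 10-s epoch length, and the function names are hypothetical.

```python
def ibi_coverage(ibis_s, epoch_s):
    """Fraction of the epoch spanned by valid inter-beat intervals (assumed definition)."""
    return min(1.0, sum(ibis_s) / epoch_s)

def retained_fraction(epochs, epoch_s=10.0, min_coverage=0.5):
    """Share of epochs whose IBI coverage meets the retention threshold."""
    kept = [e for e in epochs if ibi_coverage(e, epoch_s) >= min_coverage]
    return len(kept) / len(epochs)

# Four hypothetical 10-s epochs, each a list of valid IBIs in seconds.
epochs = [
    [0.8] * 12,   # about 9.6 s covered: retained
    [0.8] * 5,    # 4.0 s covered: dropped
    [],           # no valid beats: dropped
    [0.75] * 8,   # 6.0 s covered: retained
]
pct_retained = 100 * retained_fraction(epochs)
```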
Figure 6. Pearson correlation between ECG-derived HRV and PPG-derived PRV metrics. Heatmap matrix of Pearson correlation coefficients for each metric across the PPG processing methods. Cells labeled “NA” indicate HRV metrics that are not available as outputs from Empatica E4. All correlations are significant at p < 0.001.
Figure 7. Cliff’s δ effect size between ECG and PPG-derived HRV metrics. Heatmap matrix displaying Cliff’s δ values for each HRV metric (HR, SDNN, RMSSD, ULF, VLF, LF, HF) across multiple PPG processing methods. Cells labeled “NA” indicate HRV metrics that are not available as outputs from Empatica E4. Values indicate the degree of distributional non-overlap, where |δ| ≥ 0.474 denotes a large effect size.
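For readers unfamiliar with Cliff’s δ, the following sketch shows how the statistic is computed from two samples: the proportion of cross-sample pairs in which the first value is larger, minus the proportion in which it is smaller. Only the 0.474 cut-off for a large effect comes from the caption. The lower cut-offs (0.147 and 0.33) are commonly used conventions and are included here as an assumption, and the sample values are invented for illustration.

```python
def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-sample pairs."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))

def magnitude(delta):
    """Qualitative label; only the 0.474 threshold is stated in the caption."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"

# Hypothetical RMSSD values (ms) from PPG-derived PRV and ECG-derived HRV.
prv_rmssd = [30, 42, 55, 61]
hrv_rmssd = [28, 35, 40, 50]
delta = cliffs_delta(prv_rmssd, hrv_rmssd)
label = magnitude(delta)
```

The sign of δ carries the direction of the shift, which is why the size label is taken from the absolute value.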
Figure 8. Bland–Altman analysis of PPG- and ECG-derived HRV metrics. Median bias and nonparametric 95% limits of agreement (LOA) between HRV and PRV across the processing methods. The figure shows the median difference (PRV − HRV) as a central dot, with error bars representing the 95% LOA.
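A nonparametric Bland–Altman summary of the kind shown in Figure 8 can be sketched as below: the bias is the median of the paired differences (PRV − HRV) and the 95% limits of agreement are the 2.5th and 97.5th percentiles of those differences. The linear-interpolation percentile rule, the function names, and the synthetic data are assumptions. The figure caption does not say which percentile estimator was used, nor whether differences were expressed in absolute or relative units.

```python
import statistics

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted values."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def nonparametric_bland_altman(prv, hrv):
    """Median bias and percentile-based 95% limits of agreement for PRV - HRV."""
    diffs = sorted(p - h for p, h in zip(prv, hrv))
    return {
        "bias": statistics.median(diffs),
        "loa_lower": percentile(diffs, 2.5),
        "loa_upper": percentile(diffs, 97.5),
    }

# Synthetic paired epochs: differences run from -20 to +20 in steps of 1.
hrv = [50.0] * 41
prv = [50.0 + (i - 20) for i in range(41)]
ba = nonparametric_bland_altman(prv, hrv)
```

Percentile-based limits make no normality assumption, which suits HRV differences that are skewed or heavy-tailed.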
Table 1. Established ECG signal processing methods.

| Type | Standards | References |
|---|---|---|
| Sampling Rates | Optimal: 250–500 Hz. Minimum: 100 Hz with parabolic interpolation. | [16] |
| Low Frequency Filtering | Optimal: 0.05 Hz. | [17] |
| High Frequency Filtering | Adults: 150 Hz. Children: 250 Hz. | [14,18] |
| Fiducial Point Identification | Use a well-tested algorithm (derivative + threshold, template, or correlation method) to locate a stable, noise-independent reference point. | [19] |
| Feature Extraction | Heart rate. Time-domain HRV: SDNN, RMSSD, pNN50. Frequency-domain HRV: ULF, VLF, LF, HF, LF/HF. Nonlinear HRV: SD1 and SD2, approximate entropy, sample entropy, MSE, DFA. | [20] |
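The time-domain features listed in Table 1 follow directly from the series of inter-beat intervals (IBIs). The sketch below uses the standard definitions: SDNN is the standard deviation of the intervals, RMSSD is the root mean square of successive differences, and pNN50 is the percentage of successive differences larger than 50 ms. The function name and the example intervals are hypothetical, and the sample (n − 1) standard deviation is an assumed convention.

```python
import math
import statistics

def time_domain_hrv(ibis_ms):
    """Mean HR (BPM), SDNN (ms), RMSSD (ms) and pNN50 (%) from IBIs in milliseconds."""
    if len(ibis_ms) < 3:
        raise ValueError("need at least three inter-beat intervals")
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]   # successive differences
    return {
        "hr_bpm": 60000.0 / statistics.mean(ibis_ms),
        "sdnn_ms": statistics.stdev(ibis_ms),               # sample SD of intervals
        "rmssd_ms": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "pnn50_pct": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }

# Five hypothetical intervals; one mis-timed beat inflates RMSSD and pNN50.
features = time_domain_hrv([800, 810, 790, 860, 800])
```

Because RMSSD and pNN50 depend on beat-to-beat differences, a single mis-detected beat affects them more strongly than it affects mean heart rate.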
Table 4. Cross-validation performance of signal quality assessment (SQA) classification models. Bolded entries indicate the best-performing classifier, as determined by the highest accuracy and balanced sensitivity/specificity.

| Epoch (s) | Classifier | Optimal Feature N | Optimal Cost | Accuracy | Specificity | Sensitivity | Min of Specificity and Sensitivity |
|---|---|---|---|---|---|---|---|
| 5 | Logistic | 13 | 5 | 0.874 | 0.885 | 0.871 | 0.871 |
| 5 | SVM | 10 | 5 | 0.868 | 0.888 | 0.862 | 0.862 |
| 5 | NB | 9 | 9 | 0.872 | 0.874 | 0.872 | 0.872 |
| 5 | LDA | 9 | 10 | 0.877 | 0.876 | 0.878 | 0.876 |
| 10 | Logistic | 4 | 3 | 0.890 | 0.878 | 0.892 | 0.878 |
| 10 | SVM | 8 | 3 | 0.885 | 0.884 | 0.885 | 0.884 |
| 10 | NB | 6 | 2 | 0.870 | 0.873 | 0.869 | 0.869 |
| 10 | LDA | 4 | 7 | 0.890 | 0.881 | 0.892 | 0.881 |
| 20 | Logistic | 12 | 2 | 0.810 | 0.811 | 0.810 | 0.810 |
| 20 | SVM | 7 | 2 | 0.799 | 0.824 | 0.789 | 0.789 |
| 20 | NB | 4 | 2 | 0.805 | 0.804 | 0.805 | 0.804 |
| 20 | LDA | 9 | 3 | 0.798 | 0.844 | 0.780 | 0.780 |
| 30 | Logistic | 1 | 1 | 0.750 | 0.699 | 0.780 | 0.699 |
| 30 | SVM | 1 | 1 | 0.745 | 0.739 | 0.749 | 0.739 |
| 30 | NB | 1 | 1 | 0.721 | 0.715 | 0.725 | 0.715 |
| 30 | LDA | 1 | 1 | 0.747 | 0.715 | 0.765 | 0.715 |
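The performance columns in Table 4 can be derived from a binary confusion matrix, as sketched below. The sketch assumes that the positive class is a good-quality epoch, which the table does not state, and the counts are invented for illustration. Taking the minimum of sensitivity and specificity gives a single number that penalizes classifiers that do well on one class at the expense of the other.

```python
def sqa_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and their minimum from confusion counts."""
    sensitivity = tp / (tp + fn)    # good epochs correctly accepted (assumed positive class)
    specificity = tn / (tn + fp)    # poor epochs correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "min_sens_spec": min(sensitivity, specificity),
    }

# Hypothetical fold: 200 good-quality epochs and 100 poor-quality epochs.
m = sqa_metrics(tp=180, fn=20, tn=85, fp=15)
```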
Table 5. Average performance across signal-processing pipelines. Cells labeled “NA” indicate HRV metrics that are not available as outputs from Empatica E4 and therefore cannot be included in the comparison analysis.

| Method | Valid Epoch | HR Pearson r | HR Cliff’s δ | HR Bias | HRV Pearson r | HRV Cliff’s δ | HRV Bias |
|---|---|---|---|---|---|---|---|
| E4 HR | 100.00% | 0.83 | −0.01 | −0.77% | NA | NA | NA |
| E4 IBI | 22.70% | 1.00 | −0.03 | −0.37% | 0.85 | 0.25 | 47.43% |
| MSPTD | 94.70% | 0.98 | −0.01 | −0.25% | 0.31 | 0.58 | 205.47% |
| SQA MSPTD | 66.80% | 1.00 | −0.02 | −0.30% | 0.53 | 0.42 | 110.83% |
| SQA Recon MSPTD | 70.30% | 1.00 | −0.02 | −0.25% | 0.57 | 0.29 | 55.90% |
| MNDEI | 64.30% | 0.99 | −0.04 | −0.60% | 0.51 | 0.44 | 143.84% |
| SQA MNDEI | 60.90% | 1.00 | −0.03 | −0.49% | 0.64 | 0.33 | 93.44% |
| SQA Recon MNDEI | 69.50% | 1.00 | −0.02 | −0.34% | 0.79 | 0.21 | 41.66% |

Gao, R.; Miller, C.S.; Lin, B.T.W.; Schwarz, C.W.; Jones, M.L.H. Signal Quality Assessment and Reconstruction of PPG-Derived Signals for Heart Rate and Variability Estimation in In-Vehicle Applications: A Comparative Review and Empirical Validation. Sensors 2025, 25, 7556. https://doi.org/10.3390/s25247556
