1. Introduction
The respiratory rate (RR) is one of the most informative indicators of physiological state and a vital sign for the early detection of clinical deterioration. Continuous monitoring of respiration plays a crucial role in the diagnosis and management of a wide range of conditions. In healthy subjects at rest, normal RR ranges between 16 and 20 breaths per minute; an increased RR may indicate fever, dehydration, anxiety, mental stress, asthma, pulmonary diseases, or heart conditions [
1,
2].
Conversely, a decreased RR can be associated with drug use, metabolic abnormalities, sleep apnea events, and other disorders [
3]. In addition, the coupling between respiration and heart rate represents a key parameter both for the detection of pathological conditions [
4,
5,
6,
7] and for studying cardiorespiratory system interactions [
8].
In general hospital settings, respiration can be measured using both direct and indirect methods. Direct measurements involve assessing air flow, pressure, and temperature through the lungs using devices such as spirometers. Indirect measurements rely on thoracic volume changes, detected through transthoracic inductance or impedance plethysmography systems [
9].
However, these methods present several limitations—they are often time-consuming, costly, and require bulky equipment. Furthermore, contact-based sensors or masks may interfere with natural breathing patterns. The extraction of RR is also highly susceptible to motion artifacts, making these systems less suitable for applications that require continuous, unobtrusive monitoring, such as in polysomnography or stress assessment.
For these reasons, there is a clinical need for algorithms capable of extracting the RR from waveforms that are already collected in the hospital setting. One such approach involves estimating RR from electrocardiogram (ECG) and photoplethysmography (PPG) signals [
10,
11,
12,
13,
14]. Indeed, respiratory activity influences both ECG and PPG signals. In the case of the ECG signal, respiration affects heart rate variability as well as beat morphology. On the one hand, respiration modulates heart rate, which increases during inspiration and decreases during expiration, a phenomenon known as respiratory sinus arrhythmia (RSA) [
15,
16]. On the other hand, ECG morphology is also influenced by respiration due to relative movements of the electrodes with respect to the heart and changes in thoracic impedance caused by lung filling and emptying [
17]. As for PPG signals, the reduction in stroke volume during ventricular filling induces amplitude variations in the PPG signal, reflecting respiratory activity [
18].
Ultimately, both ECG and PPG signals are influenced by respiratory activity through three types of modulation—amplitude modulation (AM), baseline wander (BW), and frequency modulation (FM) [
19]. AM refers to variations in peak amplitude observed in both ECG and PPG signals. BW affects the low-frequency baseline of these signals, introducing slow fluctuations unrelated to physiological activity [
20,
21,
22]. Finally, FM appears as beat-to-beat variations in the duration of cardiac cycles, detectable in both ECG and PPG [
19,
23]. BW and AM in the ECG are also influenced by changes in the orientation of the heart’s electrical axis relative to the electrodes, as well as variations in thoracic impedance [
19]. BW in the PPG signal is caused by changes in tissue blood volume [
21], while AM in the PPG is associated with respiration-induced changes in stroke volume [
22]. FM in ECG and PPG is due to RSA, which causes the heart rate to increase during inspiration and decrease during exhalation [
19]. The intensity of each modulation can vary between subjects, depending on their physiological parameters and other characteristics such as gender, age, health condition, weight, and height [
19,
24].
Therefore, while ECG measures heart rhythm and electrical activity, and PPG is used for detecting blood volume changes in the microvascular bed via pulse oximetry, both signals are closely correlated with the RR and are routinely acquired in clinical settings. This makes them strong candidates for inexpensive and non-invasive RR monitoring. Over the past decade, Charlton et al. [
25] conducted a comprehensive comparison of over 100 algorithms for estimating RR from ECG and PPG signals. Their review covered a wide range of methods that use either ECG or PPG as input signals [
26,
27,
28,
29,
30]. Most of these algorithms focus on optimizing the filtering stages to enhance the signal-to-noise ratio, thereby improving the reconstruction of the respiratory signal [
31,
32,
33].
In contrast, the present study introduces a methodological pipeline designed to guide researchers in extracting RR from ECG or PPG signals while avoiding unnecessary processing steps. To ensure reproducibility and promote an adaptable workflow, we tested the proposed pipeline on two publicly available datasets: iAMwell [
34] and Capnobase [
35]. Specifically, we analyzed a wide range of morphological and temporal features derived from ECG signals (R-peak, QRS area, up-slope, down-slope) and PPG signals (frequency modulation and amplitude modulation) to identify and compare the most accurate and robust features for RR estimation, accounting for both intra- and inter-subject variability. The results provide valuable insights that can inform the development of efficient, non-invasive, and low-cost systems for continuous respiratory monitoring.
2. Materials and Methods
2.1. Dataset
We evaluated the proposed method on two publicly available datasets, the iAMwell open-source dataset [
34] and Capnobase [
35], to verify the generalization performance of the algorithm across different scenarios.
The iAMwell dataset contains simultaneously recorded ECG, PPG, and respiratory signals from athletes and non-athletes during an experimental protocol, including rest, exercise, and recovery phases. All signals were sampled at 2000 Hz. In this study, we focused on the acceleration phase, lasting approximately 8 min, characterized by a gradual increase in running speed and corresponding physiological changes.
The Capnobase dataset includes simultaneously recorded ECG, PPG, and respiratory signals from 42 subjects undergoing elective surgery and routine anesthesia. The recordings, lasting about 8 min, were sampled at 300 Hz. We considered only 19 spontaneous breathing cases for our analysis.
2.2. Signal Processing and Analysis
Before detailing the algorithms for ECG and PPG analysis, it is important to highlight that all records, including ECG, PPG, and respiratory signals, were divided into segments. To determine the optimal segment length for accurate RR prediction, we tested windows of 20, 30, and 60 s to allow a fair comparison. The 30-s window consistently provided the best accuracy in RR estimation, in agreement with previous literature, which demonstrated that longer windows tend to result in higher mean errors [
36,
37].
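The segmentation step can be sketched as follows. This is a minimal illustration, assuming consecutive, non-overlapping windows and discarding any incomplete trailing segment; the text does not state whether the windows overlap, and the helper name `split_into_windows` is hypothetical.

```python
def split_into_windows(signal, fs, window_s=30.0):
    """Split a 1-D sequence into consecutive, non-overlapping windows.

    Assumption: windows do not overlap and an incomplete trailing
    segment is dropped (the paper does not specify either choice).
    """
    n = int(round(fs * window_s))  # samples per window
    if n <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


# Toy example: 95 s of samples at 10 Hz gives three full 30-s windows.
toy_signal = list(range(950))
windows = split_into_windows(toy_signal, fs=10, window_s=30.0)
```

Running the same helper with `window_s` set to 20, 30, and 60 would give the three segmentations that were compared.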
2.2.1. ECG Pre-Processing and Analysis
The following flowchart (
Figure 1) illustrates the ECG analysis step-by-step.
As is common practice, the raw ECG signal was denoised to improve the signal-to-noise ratio (SNR), ensure accurate detection of the QRS complex, and enhance the quality of the extracted respiratory signal. Each step of the algorithm, shown in the workflow of
Figure 1, follows traditional signal processing methodologies widely discussed in the literature [
38,
39,
40]. The selection of parameters for each processing step was made empirically, based on the characteristics of the ECG signal and the specific requirements of the analysis.
In the preprocessing stage, a cascade of low-pass and high-pass filters was used to remove the P and T waves, as well as electromyography artifacts. The cut-off frequencies for the resulting band-pass filter were set to 5 Hz (high-pass) and 40 Hz (low-pass). The next step, QRS complex detection, can be challenging due to physiological variability and various types of noise present in the ECG signal. The Pan–Tompkins algorithm [
41], known for its robustness in detecting QRS complexes, was employed. The Q and S waves were identified by searching for local minima before and after the R-peak, as illustrated in
Figure 2.
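The search for Q and S around a detected R-peak can be sketched as below. This is only an illustration under stated assumptions: the R-peak index is taken as already known (for example from a Pan–Tompkins detector), the search stops at the first local minimum on each side, and the 80 ms search limit and the function names are hypothetical choices rather than values taken from the paper.

```python
def nearest_local_min(x, start, step, max_steps):
    """Walk from `start` in direction `step` (+1 or -1) while the signal keeps decreasing."""
    i = start
    for _ in range(max_steps):
        j = i + step
        if j < 0 or j >= len(x) or x[j] >= x[i]:
            break  # reached the border or the signal stopped decreasing
        i = j
    return i


def find_q_and_s(ecg, r_idx, fs, max_search_s=0.08):
    """Return the indices of the local minima before (Q) and after (S) the R-peak.

    Assumption: `max_search_s` (80 ms) is an illustrative search limit.
    """
    max_steps = int(max_search_s * fs)
    q_idx = nearest_local_min(ecg, r_idx, -1, max_steps)
    s_idx = nearest_local_min(ecg, r_idx, +1, max_steps)
    return q_idx, s_idx


# Toy beat sampled at 100 Hz with the R-peak at index 4.
beat = [0.0, 0.05, -0.1, 0.4, 1.0, 0.3, -0.3, -0.05, 0.0]
q_idx, s_idx = find_q_and_s(beat, r_idx=4, fs=100.0)
```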
For QRS feature extraction, four morphological features were computed: R-peak (amplitude), QRS slopes (up-slope and down-slope), and QRS area, as described in [42]. For the i-th QRS complex of the total ECG signal (Figure 2), the morphological features are computed as follows (Figure 3):
The area is calculated by approximating the integral with trapezoids. The following equation describes the total area used in this study:
Finally, based on these features, the ECG-derived respiration (EDR) signals were extracted. To reconstruct a continuous respiration waveform, cubic spline interpolation was applied to the detected peaks of these features. Since different QRS parameters are influenced by respiration in varying ways, a separate EDR signal was generated for each extracted QRS feature, enabling a more comprehensive representation of the respiratory pattern.
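The four QRS features named above can be sketched per beat as follows. This is not the authors' formulation: the slopes are taken here as straight-line slopes between Q and R and between R and S, and the area as the trapezoidal integral of the absolute signal from Q to S. Both definitions, and the function name `qrs_features`, are assumptions made for illustration.

```python
def qrs_features(ecg, q, r, s, fs):
    """Compute R-peak amplitude, up-slope, down-slope, and QRS area for one beat.

    Assumptions: slopes are two-point slopes (Q->R and R->S) and the area
    integrates |ecg| between Q and S with the trapezoidal rule.
    """
    dt = 1.0 / fs  # sampling interval in seconds
    up_slope = (ecg[r] - ecg[q]) / ((r - q) * dt)
    down_slope = (ecg[s] - ecg[r]) / ((s - r) * dt)
    area = sum(0.5 * (abs(ecg[k]) + abs(ecg[k + 1])) * dt for k in range(q, s))
    return {
        "r_peak": ecg[r],
        "up_slope": up_slope,
        "down_slope": down_slope,
        "area": area,
    }


# Toy beat sampled at 100 Hz with Q, R, and S at indices 2, 4, and 6.
beat = [0.0, 0.05, -0.1, 0.4, 1.0, 0.3, -0.3, -0.05, 0.0]
features = qrs_features(beat, q=2, r=4, s=6, fs=100.0)
```

Collecting one value per beat for each key gives the four beat-by-beat series from which the separate EDR signals are interpolated.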
2.2.2. PPG Pre-Processing and Analysis
Figure 4 illustrates the step-by-step processing of the PPG signal. Similar to the ECG processing, the methodologies for PPG signal processing were based on established techniques from the literature, and the parameters were set empirically.
The raw PPG signal was first filtered using an 8th-order Butterworth low-pass filter with a cut-off frequency of 10 Hz to remove high-frequency noise. Next, an 11-level wavelet decomposition was applied to further smooth the signal, as recommended in the literature for PPG preprocessing [
43,
44,
45,
46]. To enhance the extraction of respiratory-related variations, a Savitzky–Golay polynomial filter was employed [
47] for more precise detection of RR from the reference signals.
Subsequently, the systolic maximum and minimum values of the PPG waveform, along with the intervals between them, were detected using standard MATLAB toolboxes. These features were extracted from both the AM and FM components of the filtered PPG signals (
Figure 5). Finally, as with the ECG, cubic spline interpolation was used to reconstruct the PPG-derived respiration (PDR) signal.
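The interpolation step, which turns one feature value per beat into a continuous derived respiration signal, can be sketched with a small pure-Python cubic spline. The natural boundary condition (zero second derivative at both ends) is an assumption; the paper does not state which end conditions were used, and the toolbox routine applied by the authors may differ.

```python
from bisect import bisect_right


def natural_cubic_spline(x, y):
    """Return a function evaluating the natural cubic spline through (x, y).

    Assumption: natural end conditions; x must be strictly increasing.
    """
    n = len(x) - 1
    if n < 1:
        raise ValueError("need at least two knots")
    h = [x[i + 1] - x[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    # Forward sweep of the tridiagonal system for the second-derivative terms.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (x[i + 1] - x[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution for the polynomial coefficients of each interval.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(t):
        j = min(max(bisect_right(x, t) - 1, 0), n - 1)  # interval containing t
        dx = t - x[j]
        return y[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return evaluate


# Toy example: beat times (s) and a beat-by-beat feature (e.g., pulse amplitude).
beat_times = [0.0, 0.8, 1.7, 2.5, 3.4]
amplitudes = [1.0, 1.2, 0.9, 1.1, 1.0]
pdr = natural_cubic_spline(beat_times, amplitudes)
line = natural_cubic_spline(beat_times, [2.0 * t + 1.0 for t in beat_times])
```

Evaluating `pdr` on a uniform time grid would give the resampled respiration waveform on which peaks are then counted.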
2.3. Respiratory Rate Extraction
In each case, RR was estimated by counting the number of respiratory peaks within a fixed 30-s time window. RR was computed for each reference respiratory signal and for each signal derived from the individual ECG and PPG features, using the following formula:
$$\mathrm{RR} = \frac{N_{\mathrm{peaks}}}{T} \times 60$$
where $N_{\mathrm{peaks}}$ is the number of detected respiratory peaks in a window of duration $T = 30$ s. RR is expressed in breaths per minute (breaths/min).
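The peak-counting estimate can be illustrated as follows. This is a minimal sketch: the simple local-maximum rule and the function names are assumptions, and a practical detector would usually add amplitude or minimum-distance criteria that the paper does not describe.

```python
import math


def count_peaks(x):
    """Count local maxima (a sample higher than its left neighbor and not lower than its right)."""
    return sum(1 for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1])


def respiratory_rate(n_peaks, window_s=30.0):
    """Convert a peak count in a window of `window_s` seconds to breaths/min."""
    return 60.0 * n_peaks / window_s


# Toy respiration signal: 0.3 Hz sine sampled at 10 Hz for 30 s.
fs = 10
resp = [math.sin(2.0 * math.pi * 0.3 * k / fs) for k in range(30 * fs)]
n_peaks = count_peaks(resp)
rr = respiratory_rate(n_peaks, window_s=30.0)
```

With this toy input, nine breathing cycles fall inside the window, which corresponds to 18 breaths/min.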
2.4. Quantitative Analysis
A quantitative analysis was carried out using the following three performance metrics:
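Mean Absolute Error (MAE)
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|RR_{\mathrm{ref},i} - RR_{\mathrm{est},i}\right|$$
Mean Absolute Percentage Error (MAPE)
$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{RR_{\mathrm{ref},i} - RR_{\mathrm{est},i}}{RR_{\mathrm{ref},i}}\right|$$
Root Mean Square Error (RMSE)
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(RR_{\mathrm{ref},i} - RR_{\mathrm{est},i}\right)^{2}}$$
where $RR_{\mathrm{ref},i}$ and $RR_{\mathrm{est},i}$ represent the reference and estimated RRs, respectively, and $N$ is the number of observations.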
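The three error metrics can be computed as in the sketch below, which uses their standard definitions. The function names and the toy RR values are illustrative only and are not taken from the paper's data.

```python
import math


def mae(ref, est):
    """Mean absolute error in breaths/min."""
    return sum(abs(r - e) for r, e in zip(ref, est)) / len(ref)


def mape(ref, est):
    """Mean absolute percentage error (%); reference values must be non-zero."""
    return 100.0 * sum(abs((r - e) / r) for r, e in zip(ref, est)) / len(ref)


def rmse(ref, est):
    """Root mean square error in breaths/min."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))


# Toy reference and estimated RR values (breaths/min), one per window.
reference_rr = [16.0, 18.0, 20.0, 22.0]
estimated_rr = [15.0, 19.0, 20.0, 24.0]
mae_value = mae(reference_rr, estimated_rr)
mape_value = mape(reference_rr, estimated_rr)
rmse_value = rmse(reference_rr, estimated_rr)
```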
Additionally, the Pearson correlation coefficient ($r$) was computed to assess the linear relationship between derived and reference RR. Interpretation of $r$ typically follows these conventional thresholds: values below 0.3 indicate a weak correlation, values between 0.3 and 0.7 suggest a moderate correlation, and values above 0.7 are considered indicative of a strong correlation. This statistical measure complements the error-based metrics by capturing the degree of trend agreement between signals.
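A short sketch of the Pearson correlation coefficient and of the thresholds listed above is given here. Applying the thresholds to the absolute value of the coefficient, and assigning the boundary values 0.3 and 0.7 to the moderate class, are assumptions; the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Map a coefficient to the weak / moderate / strong labels (on |r|, by assumption)."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude <= 0.7:
        return "moderate"
    return "strong"


# Toy reference and estimated RR values (breaths/min).
reference_rr = [16.0, 18.0, 20.0, 22.0]
estimated_rr = [15.0, 19.0, 20.0, 24.0]
r_value = pearson_r(reference_rr, estimated_rr)
label = correlation_strength(r_value)
```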
The analysis was conducted on two levels. In the intra-subject analysis, the RR values of each subject were computed across all time windows (e.g., for subject 1, we considered all 15 windows), allowing us to assess how each subject’s respiratory rate varied over time. In the inter-subject analysis, the RR values in each time window were compared across all subjects (e.g., for time window 1, we considered the RR of all subjects), enabling us to evaluate variability between subjects. This dual approach allowed us to capture both intra-subject variability over time and differences in estimation performance across subjects. The evaluation metrics (MAE, RMSE, MAPE, and $r$) were calculated for the overall values, obtained as the average of the intra-subject means followed by the average of the inter-subject means.
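The two aggregation levels can be illustrated with a subjects-by-windows table of RR values. This is only a sketch of the averaging directions described above; the layout `rr[s][w]` and the function names are assumptions.

```python
from statistics import mean


def intra_subject_means(rr):
    """rr[s][w] is the RR of subject s in window w; return one mean per subject."""
    return [mean(row) for row in rr]


def inter_subject_means(rr):
    """Return one mean per time window, averaged across all subjects."""
    return [mean(column) for column in zip(*rr)]


# Toy table: 2 subjects x 3 windows (breaths/min).
rr_table = [
    [10.0, 12.0, 14.0],
    [20.0, 22.0, 24.0],
]
per_subject = intra_subject_means(rr_table)
per_window = inter_subject_means(rr_table)
```

Applying the same two functions to the derived and to the reference RR tables gives the paired averages that the within-subject and between-subject comparisons operate on.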
Furthermore, to assess and compare the accuracy and reliability of the two measurements, the Wilcoxon signed-rank test was performed in both the within-subject and between-subject analyses by comparing the average derived and reference RRs.
Moreover, Bland–Altman (BA) analysis within and between subjects was used to visually assess the agreement between the two measurements. In the graphical representation of BA analysis, the mean of the two measurements is plotted on the x-axis, while the difference between the measurements is plotted on the y-axis. To obtain the limits of agreement, defined as the mean of the differences ±1.96 times the standard deviation (i.e., the interval expected to contain 95% of the differences), the mean of differences (bias) and the standard deviation of differences were calculated. This method allows for evaluating errors, such as the distance of bias from zero. The overall statistical significance was set to .
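The Bland–Altman quantities can be computed as in the following sketch. The difference is taken as derived minus reference, so that a negative bias corresponds to underestimation by the derived method; using the sample standard deviation is an assumption, and the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(derived, reference):
    """Return per-pair means, differences, bias, and 95% limits of agreement.

    Assumptions: difference = derived - reference; sample standard deviation.
    """
    diffs = [d - r for d, r in zip(derived, reference)]
    means = [(d + r) / 2.0 for d, r in zip(derived, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits


# Toy derived and reference RR values (breaths/min).
derived_rr = [15.0, 19.0, 20.0, 24.0]
reference_rr = [16.0, 18.0, 20.0, 22.0]
ba_means, ba_diffs, ba_bias, ba_limits = bland_altman(derived_rr, reference_rr)
```

Plotting `ba_means` on the x-axis against `ba_diffs` on the y-axis, with horizontal lines at the bias and at the two limits, reproduces the layout of a BA plot.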
3. Results
In this section, we present both qualitative and quantitative evaluations of the ECG- and PPG-estimated signals compared to the reference signals.
Figure 6 illustrates the continuous EDR signal reconstructed using cubic spline interpolation, applied to QRS morphological features. To simplify the comparison between the reference and estimated signals, the amplitudes were normalized, with all plots displayed between 0 and 1. Specifically, the R-peak, QRS slope (both up-slope and down-slope), and QRS area were extracted, with each feature generating a separate EDR signal that reflects the respiratory variations associated with that particular morphological feature. Similarly,
Figure 7 shows the PDR signals derived from the AM and FM components of the PPG signal. Both features were analyzed, and cubic spline interpolation was used to reconstruct a continuous respiration signal derived from the PPG.
Table 1 shows the performance metrics for the ECG and PPG signals for both datasets. The results show that ECG signals outperform PPG signals in the accuracy of respiratory rate estimation, as reflected in consistently lower MAE, MAPE, and RMSE values for ECG-based methods. Specifically, ECG-based methods (such as the R-peak, QRS area, and QRS slope) showed lower and more consistent errors in both datasets. In the iAMwell dataset, MAE ranged from 0.99 to 1.04 breaths/min, while in the Capnobase dataset, it increased to between 3.07 and 3.74 breaths/min, still within a reasonably low error range. In contrast, PPG-based methods (FM and AM) showed markedly higher MAE values, ranging between 5.10 and 5.12 breaths/min in iAMwell and increasing substantially in Capnobase (10.66 and 13.90 breaths/min), indicating poorer performance. Looking at MAPE, the ECG methods maintained relatively low errors (9.45–9.94% in iAMwell and around 30.78% in Capnobase), while PPG showed substantially higher error rates: 26.65–27.17% in iAMwell and more than 100% in Capnobase, reflecting a much less reliable estimation. The RMSE values follow the same trend: the ECG resulted in 2.68–2.86 breaths/min in iAMwell and 11.71–13.46 breaths/min in Capnobase, while the PPG results were consistently worse in both datasets.
Regarding correlation, a moderate and statistically significant relationship was observed for ECG-based methods in both datasets. Specifically, for the R-peak–based method, the correlation coefficient was in the iAMwell dataset (p-value = 0.010) and in the Capnobase dataset (p-value = 0.015). For the QRS area method, the correlation was in iAMwell and in Capnobase (p-values = 0.021 and 0.016, respectively). The up-slope method yielded in iAMwell (p-value = 0.024), while the down-slope method showed a lower correlation of (p-value = 0.036). In Capnobase, the QRS slopes showed a significant correlation (p-value = 0.029, and p-value = 0.038, respectively), but with $r$ < 0.50. In contrast, PPG-based methods demonstrated weak correlations in both datasets, with all values below 0.5, indicating a limited association between PPG-derived estimates and the reference respiratory rate.
Table 2 and
Table 3 present the within-subject comparison between the derived and reference average RR values for both datasets, respectively.
Table 4 and
Table 5 present the between-subject comparison between the derived and reference average RR values for each dataset, respectively.
To assess the agreement between the estimated and reference RR, the Bland–Altman bias and limits of agreement for each method, for both ECG and PPG signals, and for both datasets (iAMwell and Capnobase), are summarized in
Table 6.
It is worth noting that ECG-based methods tend to underestimate the RR, whereas PPG-based methods generally overestimate it. Among all evaluated techniques, the R-peak–based method demonstrated the most stable and accurate performance, with consistently low bias values and performance metrics in both within-subject and between-subject analyses, and across the iAMwell and Capnobase datasets. Given its robustness, low systematic error, and consistent performance, the BA plots are reported only for the R-peak–based method, as a representative example of the best-performing approach.
For the iAMwell dataset, the BA analysis based on the agreement within subjects is shown in
Figure 8a. In
Figure 8b, the bias for each subject is reported.
For the Capnobase dataset, the BA analysis based on the agreement within subjects is shown in
Figure 9a. In
Figure 9b, the bias for each subject is reported.
For the iAMwell dataset, the BA analysis based on the agreement between subjects, which shows the average RR (derived and reference), is shown in
Figure 10a. In
Figure 10b, the bias for each window is reported.
For the Capnobase dataset, the BA analysis based on the agreement between subjects, which shows the average RR (derived and reference), is shown in
Figure 11a. In
Figure 11b, the bias for each window is reported.
4. Discussion
Numerous methods have been proposed in the literature for extracting EDR and PDR signals, aiming to enable automated and non-invasive monitoring of RR in both clinical and everyday settings. These techniques are mainly divided into two categories, namely, filter-based and feature-based approaches [
19,
25]. Filter-based methods apply band-pass filters directly to isolate respiratory frequency components [
48,
49], while feature-based approaches extract beat-by-beat characteristics (e.g., QRS area and duration, peak amplitude, PPG pulse width) that reflect respiratory modulation [
30,
50,
51,
52]. In 2016, Charlton et al. [
25] conducted a comprehensive comparison of various algorithms for estimating RR from ECG and PPG signals, under both ideal and real clinical conditions. Their findings showed that algorithms generally performed better when applied to ECG rather than PPG signals.
The contribution of the present work is a methodological pipeline for RR estimation, based on the extraction of morphological features from ECG and PPG signals and aimed at identifying the most accurate and robust features. Although we focus on a single algorithm and a limited set of six features, the proposed method is simple to implement, computationally efficient, and relies on key fiducial points commonly used for both ECG and PPG. To validate the pipeline, we used two publicly available datasets, which include both healthy and pathological subjects, allowing us to assess performance across different populations. Additionally, RR estimation was evaluated across all time windows of the signals, providing a realistic measure of performance across varying signal quality. In line with previous literature [
36,
37], results demonstrated that a 30-s time window yielded the highest accuracy for RR estimation. By contrast, although Charlton et al. [
25] also used 32-s windows, they excluded segments with poor signal quality, meaning their reported performance is based only on clean, high-quality windows. Our approach, therefore, offers a more representative evaluation under real-world conditions.
The results show that the derived and reference signals oscillate with a similar pattern, especially for the ECG-derived signals, which allowed the EDR signal to be reconstructed with good performance. In contrast, while the signal derived from the PPG also displayed a similar visual pattern, it exhibited a higher oscillatory rate. To examine this in more detail, we performed a quantitative evaluation by calculating error metrics for each method (
Table 1). The ECG feature-based methods showed lower error values than the PPG feature-based methods in both datasets. In accordance with previous findings reported in the literature [
25], the findings of the present study suggest that the algorithms for RR estimation based on ECG features outperform those relying on PPG characteristics. The ECG features are intentionally simple and interpretable, mainly based on amplitudes and slopes around the R-wave. This simplicity offers advantages, especially in terms of computational load and physiological interpretability, thereby facilitating simpler real-time applications. While our study did not directly explore the physiological mechanisms underlying these features, previous research has linked QRS slopes to cardiac ischemia biomarkers and identified variations in R-wave amplitude as relevant in contexts such as sleep apnea detection [
42]. These results are further supported by the statistical analyses performed. In the within-subject analysis, the Wilcoxon signed-rank test showed that the breathing rates derived from the ECG features are very close to the reference values, with few statistically significant differences (
Table 2 and
Table 3). A similar trend was observed in the between-subjects analysis (
Table 4 and
Table 5). In contrast, PPG-derived features exhibited a higher number of statistically significant differences, suggesting greater variability and reduced consistency in estimating RR both within and between subjects (
Table 4 and
Table 5).
Additionally, an agreement analysis was performed to evaluate the consistency between estimated and reference respiratory signals, offering a holistic view of each method’s reliability. ECG-derived features (R-peak, QRS area, up-slope, down-slope) exhibit relatively small bias values across both datasets and conditions, with most values close to or below ±1 breaths/min. Among the ECG features, the R-peak showed the lowest error rates, confirming its superior reliability for RR estimation. In particular, the RR predicted by the proposed algorithms showed promising results, aligning well with the reference respiratory signals. The BA plots based on inter-subject agreement show a mean difference (bias) of −0.97 breaths/min in iAMwell and −1.16 breaths/min in Capnobase, indicating a systematic underestimation by the derived method. All data points fall within the limits of agreement, suggesting good consistency between the two methods. No apparent trend is observed between the differences and the mean values. Similarly, the intra-subject BA analysis reveals a bias of −0.84 breaths/min in iAMwell and −1.22 breaths/min in Capnobase, again indicating systematic underestimation. In this case, all data points also lie within the agreement limits, further confirming the reliability of the derived method. In contrast, the PPG-derived features (FM and AM) show greater variability. On the iAMwell dataset, the AM and FM components present moderately higher bias; in particular, the AM values remain relatively stable between intra- and inter-subject evaluations. Notably, the Capnobase dataset reveals more pronounced fluctuations: FM exhibits a high positive bias in both the intra- and inter-subject analyses (13.45 and 19.83 breaths/min, respectively). Additionally, the limits of agreement are wider in the inter-subject analysis. This may be attributed to the greater variability among subjects, as also suggested by Pimentel et al. [
53]. Consistent with this trend, it is worth noting that ECG-based methods tend to underestimate the RR, whereas PPG-based methods generally overestimate it.
Although PPG is widely used in wearable devices due to its ease of acquisition and non-invasiveness, respiration signals derived from PPG showed lower performance compared to those obtained from ECG. This can be attributed to PPG’s greater sensitivity to motion artifacts, especially during physical activity, but also to vascular and tissue changes that may occur even in the absence of movement, such as in clinically ill patients. Factors like vasoconstriction, hypothermia, a deep gasp, or a yawn could degrade the signal and impact the accuracy of PPG-derived respiratory rate estimates [
54,
55,
56]. Considering our results, the respiratory signal in the Capnobase dataset is of the capnographic type, meaning it is measured based on CO₂ oscillations. This signal is sensitive to changes in pulmonary blood flow distribution and perfusion, and it may be less reliable than thoracic respiratory measurements. In contrast, the iAMwell dataset captures respiratory activity through variations in abdominal and thoracic circumference using inductive sensors. Considering that AM in the PPG signal is primarily driven by intrathoracic pressure changes [
19] and not directly influenced by ventilation, we hypothesize that the PPG signal may correlate more strongly with the inductive respiratory signal from the iAMwell dataset, based on thoracic and abdominal circumferences, than with the capnogram from the Capnobase dataset, which reflects CO₂ oscillations. This could explain the observed differences in performance between the two datasets.
FM, as mentioned in the introduction, originates from RSA, a physiological response in which respiration induces changes in intrathoracic pressure. For example, inhalation increases venous return, which stretches the sinoatrial node and increases heart rate. Since ECG directly captures the heart’s electrical activity, it provides a more direct and accurate measure of RSA-induced FM compared to PPG.
Finally, in the iAMwell dataset, the acceleration phase of the exercise protocol showed a clear increase in RR, consistent with the expected physiological response to exercise. Starting from a baseline of 19 breaths/min, RR gradually increased with treadmill speed, reaching up to 22 breaths/min. This trend reflects the body’s natural adjustment to physical exertion and indicates that the algorithm captures the expected changes in breathing rate during exercise, supporting its use for monitoring respiratory responses to varying intensities.
While only a limited number of studies have simultaneously leveraged both ECG and PPG signals [
26,
27,
28,
29,
30], the majority of existing research has primarily focused on the technical development of filters and signal reconstruction methods [
31,
32,
33], rather than on analyzing how different features extracted from source signals (i.e., ECG or PPG) impact the accuracy of RR estimation. As a result, direct comparison with existing literature is not straightforward.
A key strength of this study is the use of two complementary datasets, including both healthy and pathological subjects, which allows a more robust and clinically relevant evaluation of the proposed RR estimation methods. As a future development, working with more customized and clinically diverse datasets is still recommended, as this would allow for a more comprehensive validation and further optimization of the proposed approach. Future work will also explore the fusion of features from both ECG and PPG signals to improve estimation accuracy. Moreover, the influence of various technical (e.g., sampling frequency, measurement site, protocol), clinical (e.g., diseases, comorbidities), and socio-demographic (e.g., age, gender, body mass index) factors on signal quality and reliability will be thoroughly investigated. Such an approach could ultimately offer a low-cost and efficient tool for early screening of potential pathologies or serve as a set of biomarkers for unhealthy lifestyle detection.