Article

Comparative Evaluation of Deep Learning Models for Respiratory Rate Estimation Using PPG-Derived Numerical Features

Biomedical Engineering and Sciences Department, Florida Tech, 150 W University Blvd, Melbourne, FL 32901, USA
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(5), 1108; https://doi.org/10.3390/electronics15051108
Submission received: 12 February 2026 / Revised: 4 March 2026 / Accepted: 4 March 2026 / Published: 7 March 2026

Abstract

Respiratory rate (RR) is a critical vital sign for the early detection of hypoxia and respiratory deterioration, yet its continuous monitoring remains challenging in clinical environments. Photoplethysmography (PPG) provides a non-invasive source of physiological information from which respiratory dynamics can be inferred. In this study, numerical physiological features derived from PPG data were used to comparatively evaluate multiple deep learning models for respiratory rate estimation. Fixed-length sliding windows were constructed from the dataset and used to train five neural network architectures: a Deep Feedforward Neural Network (DFNN), unidirectional and bidirectional Recurrent Neural Networks (RNN, Bi-RNN), and unidirectional and bidirectional Long Short-Term Memory networks (LSTM, Bi-LSTM). Model performance was assessed using mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R2), and computational runtime. Results indicate that models incorporating temporal dependencies outperform the static feedforward baseline, achieving MAE values as low as 0.521 breaths/min, making them competitive with or lower than previously reported PPG-based approaches. These findings highlight the effectiveness of temporal deep learning models for respiratory rate estimation from PPG-derived numerical features and provide insight into accuracy–efficiency trade-offs relevant to real-time monitoring applications.

1. Introduction

Hypoxia, one of the most common physiological stressors, diminishes brain oxygenation and poses substantial risks in military aviation, where it interferes with personnel performance and can result in blackouts and death. Hypoxia arises from reduced oxygen availability, which lowers the oxygen concentration in circulating peripheral blood. These physiological changes can cause impaired cognition (decision-making), fatigue, visual problems, headaches, and poor motor function, all of which are critical components of aviation performance. During hypoxia, peripheral oxygen saturation (SpO2) falls below 90%, resulting in decreased brain perfusion, often accompanied by a change in breathing pattern [1]. Alterations in Respiratory Rate (RR) therefore frequently serve as an early and significant predictor that precedes life-threatening clinical symptoms; however, RR can be difficult to assess directly. As a result, researchers have progressively explored indirect techniques to determine RR from physiological signals such as the electrocardiogram (ECG), photoplethysmogram (PPG), impedance pneumography, and capnography [2,3], each capturing a different respiratory influence such as electrical heart-rhythm changes, blood volume fluctuations, chest wall movement, or exhaled CO2 concentration. Among these, PPG has emerged as a promising signal due to its ability to reflect respiratory activity through subtle variations in the waveform [4,5]. PPG sensors offer a practical solution for long-term RR monitoring due to their simplicity, comfort, and widespread adoption in consumer devices as well as in certain clinical settings, including the ICU [6,7,8].
Due to its respiratory sensitivity, the PPG signal demonstrates how breathing induces specific, measurable modulations in the waveform through the interaction of the respiratory and cardiovascular systems [2]. These modulations, known as Respiratory-Induced Variations (RIVs), provide a physiological basis for extracting respiratory information. There are three primary ways respiration modulates the PPG signal: Amplitude Modulation (RIAV), where the physical act of breathing changes the pressure within the chest, altering cardiac output and causing the amplitude of the PPG pulse to vary in sync with the breathing cycle; Frequency Modulation (RIFV), caused by Respiratory Sinus Arrhythmia (RSA), where the heart rate increases during inspiration and decreases during expiration, directly reflecting the time intervals between PPG pulses [9]; and respiratory-induced intensity variation (RIIV), where changes in chest pressure affect the baseline blood volume in the tissue, causing the entire baseline of the PPG signal to slowly drift with each breath [4]. Because the PPG signal contains these three distinct sources of respiratory information, it is a viable indicator for estimating RR [5].
PPG is often used in combination with other physiological signals such as ECG to improve respiratory rate (RR) estimation, particularly in conditions with motion artifacts or signal noise. Studies have demonstrated that integrating PPG and ECG signals through statistical methods and advanced fusion techniques can enhance accuracy and robustness in complex physiological or environmental settings [4,10,11]. In this study, we utilize the Beth Israel Deaconess Medical Centre (BIDMC) dataset, which contains numerical physiological features derived from both PPG and ECG signals [4,12]. Using these numerical values instead of raw waveforms allows faster processing, lowers computational cost, and aligns with the pattern by which vital signs are typically stored and accessed in clinical workflows. While other sensing modalities such as impedance pneumography, radar, accelerometry, and thermal imaging have been explored for RR estimation, PPG remains the most practical and widely adopted method for continuous monitoring [13,14,15]. Furthermore, in addition to RR estimation, research utilizing signal processing and ML techniques has shown that PPG may be utilized to determine heart rate, even under challenging circumstances, such as low sampling rates or intense physical activity [16,17]. This capability to capture multiple physiological parameters reduces the reliance on multiple sensing modalities for vital sign monitoring.
With the increasing availability of biosignal data, novel Artificial Intelligence (AI)-, ML-, and Deep Learning (DL)-based regression models are also employed to understand the complex relations between brain oxygenation and physiological responses such as heart rate, pulse rate, and other vital signs in order to predict RR [4,18,19]. DL methods to predict RR can be categorized into two broad classes: one employs a single type of DL method, and the other uses a combination of multiple types. Commonly used approaches include Convolutional Neural Networks (CNNs), Deep Feedforward Neural Networks (DFNNs), and Recurrent Neural Networks (RNNs) along with their specialized variants. CNNs have emerged as reliable tools to predict RR and extract breathing-pattern features from various data sources in real-world conditions, including signal images, smartwatches, and videos of people breathing [6,20,21]. CNNs are particularly effective at finding local features, whereas DFNNs are better suited when the data are entirely numeric, such as signals or macroscopic physiological parameters [22]. Researchers have demonstrated that DFNNs can successfully identify respiratory patterns from ECG and PPG signals and can be employed to determine RR with high accuracy when analyzed over short timeframes [23,24]. Additionally, specialized RNNs such as Long Short-Term Memory (LSTM) networks can preserve memory of past inputs, enabling them to manage variable-length sequences and excel in tasks like speech recognition, language translation, and time series prediction [25,26]. Given their strengths in modeling time series data, RNNs and LSTMs have been applied to both biosignals and video data for accurate RR prediction [23,24,27].
Previous related studies have attempted to combine multiple DL methods to estimate RR. Combined CNN-LSTM architectures have been applied to RR estimation from PPG signals and have demonstrated improved performance compared to standalone convolutional networks and classical signal-processing approaches [6]. RR and other vital signs have been estimated using advanced deep learning architectures such as Residual Neural Networks (ResNet) and Residual Recursive Wavelet Neural Networks (RRWaveNet), which are typically implemented as independent convolutional models incorporating residual connections [28]. These models achieved high accuracy, with mean absolute errors as low as 1.23 to 2.5 breaths/min [5,29,30]. A residual-squeeze-excitation-attention-based convolutional network (Res-SE-ConvNet) and a cycle generative adversarial network (CycleGAN) have also been used to extract ambulatory blood pressure (ABP) from PPG signals with twice the accuracy of earlier methods [31,32,33]. Dual-branch networks, consisting of an attention-based convolutional neural network (CNN) that assigns additional importance to specific components, a Transformer network (ACTNet), and LSTM variants such as Bi-LSTM, have also been proposed to predict respiratory rate from ECG, PPG, and surface electromyogram (sEMG) biosignals [23,34].
Recent advances in wearable and implantable bioelectronic systems have enabled continuous acquisition of PPG-derived physiological parameters for real-time health monitoring [35]. Deep learning-based respiratory rate estimation from PPG further demonstrates the feasibility of data-driven modeling in clinical datasets [6], while energy-efficient neural architectures have been proposed to support embedded implementations [36]. Hybrid modeling frameworks integrating artificial intelligence with mechanistic physiological models have been proposed to enhance interpretability and robustness in biomedical signal analysis [37]. Flexible optical sensors integrated with embedded machine learning now enable continuous, low-power monitoring of cardiovascular and respiratory parameters under real-world tissue-device interaction conditions, and deep learning approaches applied to PPG signals have demonstrated clinical feasibility for vital sign estimation within such smart monitoring ecosystems [33]. In this context, the present work aligns with data-driven bioelectronic monitoring paradigms by evaluating computationally efficient neural architectures suitable for near-real-time deployment. While hybrid approaches incorporate physical constraints into learning systems, the present study focuses on benchmarking purely data-driven neural architectures applied to clinically derived numerical parameters under controlled experimental conditions. Most prior approaches demonstrated considerable accuracy, but they relied on ECG, PPG, or other biosignals or images to estimate RR. Their principal constraints are computationally intensive architectures and a strong dependence on the time window; for example, smaller time windows showed higher accuracy than longer time windows [6,16].
In contrast, the present work utilizes macroscopic physiological parameters such as heart rate, pulse rate, SpO2 and age as input features to reduce computational complexity. To capture short-term temporal dynamics without modeling longer time windows, the numerical data were organized into fixed-length temporal windows, allowing the models to learn local sequential patterns in a controlled and consistent manner. Absolute time was not included as an input feature, and the windowed samples were randomly shuffled to prevent the learning of spurious temporal ordering. As a result, the proposed models focus on short-term physiological variations rather than long-term temporal trends. This formulation makes the models well suited for near-real-time respiratory rate estimation based on recent physiological history. The primary contribution of this study lies in a controlled architecture-level comparison of static and sequential deep learning models under identical dataset, feature, and evaluation conditions. By isolating architectural design while maintaining consistent preprocessing and training strategies, this study quantifies how temporal modeling capacity and bidirectionality influence respiratory rate estimation performance and computational runtime. This controlled approach provides practical insights into efficient real-time physiological monitoring systems.

2. Methods

2.1. Dataset and Feature Preparation

The Beth Israel Deaconess Medical Centre (BIDMC) PPG and respiration dataset, available through PhysioNet [4,12], was utilized for this study. The dataset contains physiological recordings from 53 adult patients, including synchronized photoplethysmography (PPG) signals, reference respiration measurements, and associated vital signs. The original waveform recordings are sampled at 125 Hz; however, the derived numerical physiological parameters used in this study are sampled at 1 Hz. Measurement accuracy and uncertainty associated with PPG, ECG, and respiratory signals are determined by the original clinical monitoring systems used during data acquisition. These uncertainties are inherent to the dataset and were not modified in the present modeling pipeline. The BIDMC dataset has been widely employed as a benchmark resource for evaluating respiratory rate (RR) estimation algorithms.
For algorithm development, numerical physiological features were used rather than raw waveform signals. Specifically, heart rate (HR), pulse rate, peripheral oxygen saturation (SpO2), and age were selected as model inputs. HR was derived from the ECG signal provided in the BIDMC dataset, while pulse rate and SpO2 were derived from photoplethysmography (PPG) recordings. HR and pulse capture cardiovascular activity influenced by respiration, whereas SpO2 provides indirect information on oxygen exchange related to breathing. Age was incorporated from the associated patient metadata files, enabling inclusion of demographic variations that may affect respiratory dynamics. These numerical features were already available in the dataset and were directly utilized without additional signal-level feature extraction or waveform processing. All physiological variables and metadata were merged to form a structured numerical dataset suitable for deep learning (DL) analysis. These four physiological markers, HR, pulse rate, SpO2, and age, were used consistently to predict respiratory rate using different deep learning techniques.
The workflow of this study followed a structured process beginning with data acquisition and preprocessing, followed by model development and optimization, model training and testing, performance evaluation, and final model selection. Five deep learning techniques were employed: DFNN, RNN, Bi-RNN, LSTM, and Bi-LSTM. Before training, the data were scaled using standardization, in which the mean value of each feature was subtracted from its data points and the result divided by the standard deviation. Standardization is a widely used preprocessing method in machine learning and deep learning, as it enhances model performance by improving the stability of gradient-based optimization [38]. The data were then separated into a training set containing 80% of the data and a test set containing the remaining 20%. The training dataset was utilized for model development, while the testing dataset was used for model evaluation. An overview of the workflow for respiratory rate estimation using numerical physiological data is shown in Figure 1.

2.2. Windowing Strategy and Data Partitioning

Raw PPG waveforms were not directly processed in this study. Instead, numerical physiological parameters derived from clinically validated monitoring systems were used. Signal-level preprocessing steps, including filtering and artefact mitigation, were performed during original data acquisition. Consequently, the present work focuses on modeling temporal relationships among physiological variables rather than implementing waveform-level feature extraction pipelines.
To enable learning of temporal dependencies, the continuous numerical data were segmented into fixed-length overlapping windows. The numerical physiological features were sampled at 1 Hz, and a window length of L = 10 corresponds to 10 s of physiological history. For each patient record, sliding windows of length L samples were extracted, where each window comprised sequential measurements of the selected input features. For every input window, the corresponding target value was defined as the respiratory rate measured at the subsequent time step following the window. This formulation frames the task as a short-horizon regression problem using recent physiological history. After window generation, all samples from all patients were pooled, and the resulting windowed dataset was randomly shuffled.
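The windowing scheme above can be sketched in a few lines of NumPy; the function name and toy record below are illustrative assumptions, not taken from the study's codebase:

```python
import numpy as np

def make_windows(features, rr, window_len=10):
    """Slide a fixed-length window over 1 Hz numerical features.

    features : (T, F) array of per-second values (e.g., HR, pulse rate, SpO2, age).
    rr       : (T,) array of reference respiratory rates.
    The target for each window is the RR at the time step
    immediately following the window, as described in the text.
    """
    X, y = [], []
    for start in range(len(features) - window_len):
        X.append(features[start:start + window_len])
        y.append(rr[start + window_len])
    return np.asarray(X), np.asarray(y)

# Toy record: 15 s of 4 features and a matching RR trace
feats = np.arange(15 * 4, dtype=float).reshape(15, 4)
rr = np.linspace(12.0, 18.0, 15)
X, y = make_windows(feats, rr, window_len=10)
print(X.shape, y.shape)  # (5, 10, 4) (5,)
```

After pooling windows from all patients, the resulting `X` and `y` arrays would then be shuffled jointly before the train/test split.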
The data were then partitioned at the window level, with 80% of the windows assigned to the training set and the remaining 20% reserved for testing. This random window-level split ensures that both training and testing sets contain diverse physiological patterns while maintaining consistency with prior deep learning studies using similar formulations.
Prior to model training, all input features were standardized using z-score normalization. The mean and standard deviation computed from the training set were used to scale the validation and testing data, ensuring consistent feature distributions across all model evaluations. Standardization improves numerical stability and accelerates convergence during gradient-based optimization [38,39].
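A minimal sketch of this train-set-only standardization, assuming NumPy feature matrices with rows as samples (the synthetic data here are purely illustrative):

```python
import numpy as np

# Hypothetical train/test feature matrices (rows = samples, cols = features)
rng = np.random.default_rng(0)
train = rng.normal(loc=70.0, scale=8.0, size=(80, 4))
test = rng.normal(loc=70.0, scale=8.0, size=(20, 4))

# Fit the statistics on the training set only, then reuse them for test data
mu = train.mean(axis=0)
sigma = train.std(axis=0)
train_std = (train - mu) / sigma
test_std = (test - mu) / sigma  # no leakage of test-set statistics

print(train_std.mean(axis=0).round(6))  # approximately 0 for every feature
```

Reusing `mu` and `sigma` from the training set is what keeps the test-set distribution honest; recomputing them on the test data would leak information into the evaluation.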
Regression is a fundamental supervised learning methodology employed in predictive analytics to establish the relationship between dependent and independent variables [22]. This relationship is found by fitting the line of best fit that minimizes the error between the actual and predicted values. Regression workflows involve data preparation and preprocessing, feature selection, model building, training, and evaluation. Once the model is built, its efficacy is assessed using performance metrics such as Mean Absolute Error (MAE) and the coefficient of determination R2. Regression offers computational efficiency and interpretability, which makes it well suited for prediction-based studies [38]. In this work, we used the DL models to perform regression on the input features and target variables discussed in Section 2.1.

2.3. Deep Learning Models

Five deep learning architectures were evaluated to estimate respiratory rate from the windowed numerical data: Deep Feedforward Neural Network (DFNN), Recurrent Neural Network (RNN), Bidirectional Recurrent Neural Network (Bi-RNN), Long Short-Term Memory network (LSTM), and Bidirectional LSTM (Bi-LSTM). All models were formulated as regression networks producing a single continuous RR estimate for each input window.
In all evaluated architectures, trainable parameters consist of weight matrices and bias vectors automatically initialized and updated during training. For feedforward layers, inputs are transformed through weighted linear combinations followed by nonlinear activation functions (ReLU in hidden layers and linear activation in the output layer).
In feedforward layers, each hidden layer performs a linear transformation followed by a nonlinear activation:
h = σ(W x + b)
where W and b denote the trainable weight matrix and bias vector, respectively, and σ(·) is the activation function.
For recurrent architectures, the hidden state update at time step t is governed by
h_t = ϕ(W_x x_t + W_h h_{t−1} + b)
where W_x and W_h represent the input and recurrent weight matrices.
For recurrent architectures (RNN and Bi-RNN), temporal dependencies are modeled through trainable input-to-hidden and hidden-to-hidden weight matrices that update hidden states sequentially [40,41].
LSTM layers extend this formulation by introducing gated memory mechanisms to regulate information flow across time steps [42,43].
Model parameters were optimized using the Adam optimizer to minimize mean squared error through backpropagation.
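The recurrent update above can be illustrated with a minimal NumPy forward pass over one window; the weight shapes, tanh activation, and linear readout are illustrative assumptions rather than the study's exact configuration:

```python
import numpy as np

def rnn_forward(x_seq, Wx, Wh, b):
    """Unidirectional RNN pass: h_t = tanh(Wx @ x_t + Wh @ h_{t-1} + b)."""
    h = np.zeros(Wh.shape[0])          # h_0 initialized to zeros
    for x_t in x_seq:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
    return h                           # final hidden state summarizes the window

rng = np.random.default_rng(1)
n_features, n_hidden, window_len = 4, 8, 10
Wx = rng.normal(scale=0.1, size=(n_hidden, n_features))  # input-to-hidden
Wh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))    # hidden-to-hidden
b = np.zeros(n_hidden)
x_seq = rng.normal(size=(window_len, n_features))        # one 10 s window

h_final = rnn_forward(x_seq, Wx, Wh, b)
# A linear readout maps the final hidden state to a single RR estimate
w_out = rng.normal(scale=0.1, size=n_hidden)
rr_hat = float(w_out @ h_final)
print(h_final.shape, rr_hat)
```

A bidirectional variant would run a second pass over the reversed window and concatenate the two final hidden states before the readout.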

2.3.1. Deep Feedforward Neural Network (DFNN)

A Deep Feedforward Neural Network is a neural network comprising an input layer, hidden layers, and an output layer, through which data flows only in the forward direction. Inputs entering each layer are multiplied by that layer's weights. DFNNs are commonly used for tasks such as classification, regression, and signal processing [25,44]. A DFNN builds higher-level features from lower-level ones and typically uses architectures with multiple hidden layers and hidden variables. Figure 2 represents the DFNN model [45].

2.3.2. Recurrent Neural Network (RNN)

Recurrent neural networks (RNNs) are highly effective models for sequential data. End-to-end training approaches allow RNNs to be trained for sequence labeling problems with uncertain input–output alignment. RNNs are naturally deep in time, as their hidden state is a function of all prior hidden states [40]. To absorb data more efficiently, network depth is increased by adding more recurrent and/or fully connected layers [46]. RNNs use a context layer to store information and propagate it into future states to handle future inputs. The context layer stores the output of state neurons from earlier time steps, making the architecture suitable for time-varying patterns in data [46]. RNNs are increasingly deployed as sophisticated sequence recognizers [42]. A modified version of the RNN, the bidirectional recurrent neural network (Bi-RNN), may be trained with all available input information from the past and future of a given time frame. This is achieved by training it concurrently in the forward and backward time directions. Bi-RNN structures outperform unidirectional structures because they can access information from both directions [41]. Figure 3 represents the recurrent neural network structure [47].

2.3.3. Long Short-Term Memory (LSTM)

Compared to RNNs, LSTM networks can learn long-term dependencies through an enhanced memory mechanism based on gated cell units [43]. This network, as shown in Figure 4, employs memory cells and gates [48] for much better capability in remembering long-term dependencies in temporal sequences [49]. The gates comprise input, forget, and output gates; the forget gate discards less important information. Bi-LSTM is an extension of LSTM that features hidden layers for processing bidirectional information [41]. Bidirectional LSTM is achieved by embedding two independent LSTM models to enable bidirectional flow of information, taking inputs from both the past and the future. Reversing the information flow allows future states to be preserved. Combining the two hidden states allows the network to store information from both the past and the future at any given time [49]. Figure 5 represents the structure of the Bi-LSTM [50].
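A single gated LSTM step, as described above, can be sketched in NumPy; the stacked-gate weight layout and toy dimensions are assumptions for illustration only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b, n_hidden):
    """One LSTM step with input (i), forget (f), and output (o) gates plus a
    candidate cell update (g), stacked into single weight matrices."""
    z = W @ x_t + U @ h_prev + b              # shape (4 * n_hidden,)
    i = sigmoid(z[0 * n_hidden:1 * n_hidden])  # input gate
    f = sigmoid(z[1 * n_hidden:2 * n_hidden])  # forget gate: drops stale info
    o = sigmoid(z[2 * n_hidden:3 * n_hidden])  # output gate
    g = np.tanh(z[3 * n_hidden:4 * n_hidden])  # candidate memory
    c = f * c_prev + i * g                     # updated cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

rng = np.random.default_rng(2)
n_features, n_hidden = 4, 6
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_features))
U = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(10, n_features)):  # one 10 s window
    h, c = lstm_step(x_t, h, c, W, U, b, n_hidden)
print(h.shape, c.shape)
```

The additive cell update `c = f * c_prev + i * g` is what lets gradients flow across many time steps, mitigating the vanishing-gradient problem of plain RNNs.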

2.4. Model Training

All models were trained using supervised regression with mean squared error (MSE) as the loss function. MSE was selected for its smooth differentiability and stable gradient properties during optimization. Alternative loss functions were not systematically evaluated in this study; MAE and RMSE were reported to provide complementary and clinically interpretable measures of prediction accuracy, with MAE serving as the primary performance metric due to its direct clinical interpretability in breaths per minute. Hyperparameter tuning was performed to optimize model performance, including evaluation of optimizer type (Adam and stochastic gradient descent), learning rate, batch size, number of hidden layers, number of neurons per layer, kernel initialization, dropout rate, and L2 regularization strength. Learning rates of 0.1, 0.001, and 0.0001 were explored, while batch sizes ranged from 50 to 500. Dense layer sizes varied between 256 and 1024 neurons for DFNN models, while recurrent layers in RNN and LSTM architectures were configured with up to 1024 units. Hyperparameter optimization was conducted using structured, iterative, grid-style exploration across learning rates, batch sizes, and layer configurations. While formal statistical Design of Experiments (DOE) methodologies, such as factorial or response surface designs [51], were not implemented, parameter variation was systematically controlled to ensure consistent comparison among architectures. This approach prioritized stability and reproducibility of training behavior while maintaining a uniform evaluation framework.
To mitigate overfitting, dropout regularization (0.05–0.20) and L2 weight penalties (1 × 10^-7 to 1 × 10^-10) were applied. Model training employed early stopping based on validation loss, with a patience parameter of 100 epochs. Training was terminated when no further improvement in validation performance was observed, and the best-performing model weights were restored for final evaluation.
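The patience-based early stopping described here can be illustrated with a small pure-Python sketch; the function and the toy loss curve are hypothetical, not the study's actual training loop:

```python
def early_stopping_run(val_losses, patience=100):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses,
    mimicking patience-based early stopping with best-weight restoration."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch   # checkpoint the best weights
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch              # stop; restore best_epoch weights
    return len(val_losses) - 1, best_epoch

# Toy loss curve: improves for 5 epochs, then plateaus
losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.5] * 150
stop, best = early_stopping_run(losses, patience=100)
print(stop, best)  # stops 100 epochs after the best epoch (epoch 4)
```

In a framework such as Keras, the same behavior is typically obtained with an early-stopping callback configured with `patience=100` and best-weight restoration enabled.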
Sliding windows were generated independently for each subject using fixed-length temporal sequences. Following sequence construction, samples were randomly partitioned into training (80%) and testing (20%) sets at the window level (L = 10). This strategy enables consistent architectural comparison under identical data exposure conditions.
Model performance was assessed using mean absolute error (MAE) and the coefficient of determination (R2), which together quantify absolute prediction accuracy and variance explanation. Computational runtime was also recorded to evaluate the trade-off between predictive performance and computational efficiency. The optimal hyperparameter configurations for each model are summarized in Table 1.
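These evaluation metrics can be computed directly from paired predictions; the following is a minimal NumPy sketch with made-up values, not the study's results:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and R2 as used to compare the evaluated architectures."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))                       # breaths/min, clinically interpretable
    rmse = np.sqrt(np.mean(err ** 2))                # penalizes large errors more heavily
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                       # fraction of variance explained
    return mae, rmse, r2

y_true = np.array([12.0, 14.0, 16.0, 18.0, 20.0])   # reference RR (breaths/min)
y_pred = np.array([12.5, 13.5, 16.0, 18.5, 19.5])   # hypothetical model output
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(round(mae, 3), round(rmse, 3), round(r2, 3))  # 0.4 0.447 0.975
```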
All models were trained using early stopping based on validation loss. The reported configurations correspond to the best-performing hyperparameter combinations identified during tuning.

3. Results

The performance results of the models are compared in Table 2.

3.1. Performance of DFNN Baseline

The Deep Feedforward Neural Network (DFNN) was evaluated as a baseline model using window-level numerical features without explicit temporal recurrence. As shown in Table 2, the DFNN achieved competitive performance (MAE = 0.659 breaths/min, R2 = 0.877), but remained inferior to architectures incorporating recurrent temporal modeling. Although DFNN demonstrated stable convergence and the lowest computational runtime among all evaluated models, its performance underscores the limitations of feedforward architectures in capturing temporally evolving physiological processes such as respiration.

3.2. Performance of Recurrent Neural Network Models

Recurrent neural network-based architectures generally outperformed the DFNN baseline. The unidirectional RNN demonstrated the benefit of incorporating sequential dependencies within the input windows, although its predictive performance remained slightly inferior to the DFNN in this dataset. The bidirectional RNN (Bi-RNN), however, achieved a higher coefficient of determination than both the DFNN and the unidirectional RNN (MAE = 0.668 breaths/min, R2 = 0.881), indicating that bidirectional temporal modeling enhances the extraction of short-term physiological dynamics relevant to respiratory rate estimation.

3.3. Performance of LSTM-Based Architectures

LSTM-based models achieved the strongest overall performance among the evaluated architectures. Compared with standard RNNs, LSTM variants exhibited lower prediction errors and improved robustness, reflecting their ability to capture temporal dependencies while mitigating vanishing gradient effects. The unidirectional LSTM achieved the lowest mean absolute error (MAE = 0.521 breaths/min) and demonstrated a favorable balance between predictive accuracy and computational efficiency.
The bidirectional LSTM (Bi-LSTM) achieved the highest coefficient of determination (R2 = 0.906) and the lowest RMSE (1.031 breaths/min), as reported in Table 2. However, this improvement was accompanied by a substantial increase in computational time (23,040 s), significantly exceeding that of the unidirectional LSTM and other architectures. These findings indicate that while bidirectional temporal modeling can enhance statistical performance, it introduces considerable computational cost. Overall, the LSTM architecture provides the most balanced trade-off between accuracy and efficiency for near-real-time respiratory rate estimation.
Runtime includes total training and inference time measured on the same computational platform for each model.

3.4. Comparative Analysis and Runtime Evaluation

A comparative summary of all evaluated models is presented in Table 2. Overall, models incorporating temporal recurrence substantially outperformed the static DFNN baseline, underscoring the importance of sequential modeling for respiratory rate estimation. While bidirectional architectures provided the highest accuracy, unidirectional recurrent models offered competitive performance with lower computational overhead.
Runtime analysis revealed that feedforward and unidirectional recurrent models exhibited faster training and inference times, making them more suitable for real-time or near-real-time monitoring applications. In contrast, bidirectional models, although more accurate, required additional computational resources. These findings highlight a clear trade-off between prediction accuracy and computational efficiency that must be considered when selecting models for practical deployment.
Reported runtime measurements reflect execution time within the experimental computing environment used for model development and are intended for relative architectural comparison. Deployment-level latency on embedded or edge hardware platforms was not directly benchmarked in this study. Dedicated hardware-level evaluation will be required to quantify real-time inference latency and energy efficiency in practical implementations.
Figure 6 illustrates the prediction performance of the evaluated deep learning models by comparing the predicted and true respiratory rate values using fixed-length windowed inputs. Across all models, a strong linear relationship is observed between predicted and reference measurements, as indicated by the clustering of data points around the identity line. The DFNN baseline exhibits greater dispersion, particularly at lower respiratory rate values, reflecting the limitations of static architectures in capturing temporal variability. Recurrent models demonstrate progressively improved alignment with the identity line, highlighting the benefit of incorporating sequential dependencies. LSTM-based architectures show the tightest clustering and the highest coefficients of determination, indicating enhanced stability and accuracy in respiratory rate estimation. The bidirectional LSTM model achieves the strongest agreement overall, consistent with its superior quantitative performance reported in Table 2, albeit with increased computational cost.

4. Discussion

This study comparatively evaluated multiple deep learning architectures for respiratory rate estimation using numerical physiological features derived from PPG data. The results demonstrate a clear performance progression as model capacity for temporal modeling increases, highlighting the importance of sequential information for capturing respiratory dynamics.
The DFNN baseline achieved reasonable predictive accuracy but exhibited greater dispersion in predicted values, particularly at lower respiratory rates, reflecting the limitations of static architectures. In contrast, recurrent architectures reduced mean absolute error by up to 21% relative to the feedforward baseline (from 0.659 to 0.521 breaths/min), demonstrating the importance of temporal modeling for respiratory dynamics. The bidirectional RNN further improved on its unidirectional counterpart, indicating that access to both forward and backward temporal context can be beneficial for short physiological sequences.
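The 21% figure follows directly from the tabulated point estimates; a quick arithmetic check:

```python
# Relative MAE reduction of the best temporal model vs. the DFNN baseline,
# using the point estimates reported in Table 2 (breaths/min).
mae_dfnn = 0.659
mae_lstm = 0.521
reduction_pct = (mae_dfnn - mae_lstm) / mae_dfnn * 100  # about 20.9%, i.e. ~21%
```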
LSTM-based architectures achieved the strongest overall performance, with reduced prediction error and improved agreement between predicted and reference respiratory rates. The unidirectional LSTM provided a favorable balance between accuracy and computational efficiency, whereas the bidirectional LSTM achieved the highest agreement at the cost of substantially increased runtime. This trade-off underscores the importance of considering computational constraints when selecting models for real-time or resource-limited applications.
Statistical significance testing between model performances was not conducted; reported differences reflect point-estimate comparisons, and future work will incorporate confidence interval estimation and paired statistical testing. Direct comparison with prior studies should be interpreted cautiously due to differences in datasets, preprocessing strategies, and evaluation protocols.
In addition to MAE, RMSE, and R2, agreement analysis was performed using the Bland–Altman methodology to quantify systematic bias and limits of agreement between predicted and reference respiratory rates. While these statistical measures provide insight into prediction accuracy and variability, predefined clinical acceptability thresholds were not explicitly enforced in this study. Future work will incorporate clinically defined error margins and subgroup-specific evaluation to align model assessment more closely with bedside monitoring requirements. The Bland–Altman plots in Figure 7 demonstrate minimal mean bias for all architectures, indicating the absence of systematic over- or underestimation. The DFNN and RNN models exhibit wider limits of agreement, reflecting greater variability in prediction errors. In contrast, LSTM-based models show narrower limits of agreement and tighter clustering of differences around zero, indicating improved stability and consistency. The bidirectional LSTM demonstrates the narrowest spread of differences, supporting its superior agreement with reference measurements and reinforcing the quantitative findings reported in Table 2.
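The bias and limits of agreement shown in such plots follow the standard Bland–Altman formulas (mean difference ± 1.96 × SD of the differences). A minimal sketch with illustrative arrays:

```python
import numpy as np

def bland_altman(y_pred, y_ref):
    """Return (mean bias, lower LoA, upper LoA) for predicted vs. reference RR.

    Limits of agreement (LoA) are bias +/- 1.96 * SD of the differences,
    the interval expected to contain ~95% of differences if they are
    approximately normally distributed.
    """
    diff = np.asarray(y_pred, float) - np.asarray(y_ref, float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Illustrative values only (breaths/min), not the study's data.
bias, lo, hi = bland_altman([12.4, 14.8, 18.5, 19.6], [12, 15, 18, 20])
```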
It should be noted that the evaluation was conducted using random window-level splitting rather than patient-wise partitioning. Temporal segments from the same subject may therefore appear in both the training and testing subsets, which can inflate performance estimates through shared physiological patterns; the reported results thus reflect architectural comparison under identical data exposure conditions rather than confirmed generalization to unseen subjects. Overall, the findings confirm that deep learning models leveraging temporal structure are effective for respiratory rate estimation from PPG-derived numerical features. Future work will implement subject-wise cross-validation and incorporate additional physiological features to further assess model robustness.
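A subject-wise split of the kind proposed for future work can be expressed as grouped partitioning, so that all windows from one subject fall on the same side of the split. A sketch assuming each window carries a subject identifier (the toy data are hypothetical):

```python
import random

def subject_wise_split(windows, subject_ids, test_fraction=0.2, seed=42):
    """Partition windows so that no subject appears in both train and test sets.

    `windows` and `subject_ids` are parallel lists; all windows belonging
    to a held-out subject are assigned to the test set together.
    """
    rng = random.Random(seed)
    subjects = sorted(set(subject_ids))
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [w for w, s in zip(windows, subject_ids) if s not in test_subjects]
    test = [w for w, s in zip(windows, subject_ids) if s in test_subjects]
    return train, test

# Hypothetical toy data: six windows drawn from three subjects.
wins = list(range(6))
subs = ["a", "a", "b", "b", "c", "c"]
train, test = subject_wise_split(wins, subs)
```

Libraries such as scikit-learn offer an equivalent grouped splitter (`GroupShuffleSplit`), which would be the idiomatic choice in a full pipeline.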
This study focused on comparative analysis among deep learning architectures under identical feature conditions. Classical signal-processing approaches, such as spectral estimation or modulation-based methods commonly used for RR extraction from ECG and PPG signals, were not included as baseline models. The primary objective was to isolate the impact of temporal modeling capacity and bidirectionality within neural architectures. Incorporating traditional signal-processing baselines would provide additional comparative context and represents an important direction for future research [52].
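For context, a classical spectral baseline of the kind cited typically band-limits the respiratory-modulated signal, takes its power spectrum, and reads off the dominant frequency within a plausible breathing band. The band (0.1–0.5 Hz, i.e., 6–30 breaths/min) and the synthetic input below are illustrative assumptions, not the method evaluated in this study:

```python
import numpy as np

def spectral_rr(signal, fs, band=(0.1, 0.5)):
    """Estimate respiratory rate (breaths/min) as the dominant FFT peak
    within a plausible breathing band. A classical baseline sketch."""
    x = np.asarray(signal, float) - np.mean(signal)   # remove DC component
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    mask = (freqs >= band[0]) & (freqs <= band[1])    # restrict to breathing band
    peak = freqs[mask][np.argmax(power[mask])]
    return peak * 60.0                                # Hz -> breaths/min

# Synthetic respiratory-like signal: 0.25 Hz (15 breaths/min), 60 s at 4 Hz.
fs = 4.0
t = np.arange(0, 60, 1.0 / fs)
rr = spectral_rr(np.sin(2 * np.pi * 0.25 * t), fs)
```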
Interpretability analyses such as SHAP or permutation-based feature importance were not performed in this study. The focus was on comparative architectural evaluation rather than feature attribution. Future work will incorporate explainability techniques to quantify feature contributions.
The dataset was used as provided, without explicit stratification according to motion artefacts, signal noise levels, or respiratory pathology subgroups. Although the ICU recordings reflect realistic clinical variability, robustness under controlled perturbations was not separately quantified. Future investigations will incorporate artefact-aware segmentation, synthetic noise augmentation, and subgroup analyses to evaluate performance stability under challenging physiological conditions.
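The synthetic noise augmentation proposed here can be sketched as additive white Gaussian perturbation at a controlled signal-to-noise ratio; the SNR level and test signal below are illustrative assumptions:

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB SNR.

    Noise power is set relative to the signal power so that model robustness
    can be probed at controlled degradation levels.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(signal, float)
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise

# Illustrative use: perturb a clean sinusoid at 10 dB SNR.
t = np.linspace(0, 1, 256)
clean = np.sin(2 * np.pi * 5 * t)
noisy = add_gaussian_noise(clean, snr_db=10, rng=0)
```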
Unlike prior studies that typically propose a single modeling framework or focus on waveform-level end-to-end pipelines, this work systematically evaluates multiple neural architectures using clinically derived numerical parameters under a unified experimental framework. This design enables clearer attribution of performance differences to architectural structure rather than to variations in preprocessing or feature extraction pipelines.

5. Conclusions

This study demonstrated that incorporating temporal modeling improves respiratory rate estimation from PPG-derived numerical features. Compared to the feedforward baseline (MAE = 0.659 breaths/min), the unidirectional LSTM reduced mean absolute error by 21% (MAE = 0.521 breaths/min), while the bidirectional LSTM achieved the highest coefficient of determination (R2 = 0.906) with increased computational time. These results indicate that sequential architectures better capture short-term physiological dependencies than static feedforward models.
Differences in runtime across architectures highlight the need to consider both predictive performance and computational cost when selecting models for practical implementation. The evaluation was conducted using window-level partitioning rather than strict patient-wise separation; therefore, future work will incorporate subject-wise validation to assess generalization to unseen individuals. Additional studies will include comparison with classical signal-processing methods and robustness analysis under varying physiological conditions.
Overall, this work provides a structured architectural comparison framework and quantitative benchmark results for respiratory rate estimation using PPG-derived numerical data.

Author Contributions

Conceptualization, S.M.H. and K.M.; methodology, S.M.H. and M.G.S.R.; analysis, S.M.H. and M.G.S.R.; writing, S.M.H., M.G.S.R. and K.M.; supervision, K.M. All authors have read and agreed to the published version of the manuscript.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Addison, P.S.; Smit, P.; Jacquel, D.; Borg, U.R. Continuous Respiratory Rate Monitoring during an Acute Hypoxic Challenge Using a Depth Sensing Camera. J. Clin. Monit. Comput. 2020, 34, 1025–1033. [Google Scholar] [CrossRef] [PubMed]
  2. Charlton, P.H.; Birrenkott, D.A.; Bonnici, T.; Pimentel, M.A.F.; Johnson, A.E.W.; Alastruey, J.; Tarassenko, L.; Watkinson, P.J.; Beale, R.; Clifton, D.A. Breathing Rate Estimation from the Electrocardiogram and Photoplethysmogram: A Review. IEEE Rev. Biomed. Eng. 2018, 11, 2–20. [Google Scholar] [CrossRef] [PubMed]
  3. Bawua, L.K.; Miaskowski, C.; Hu, X.; Rodway, G.W.; Pelter, M.M. A Review of the Literature on the Accuracy, Strengths, and Limitations of Visual, Thoracic Impedance, and Electrocardiographic Methods Used to Measure Respiratory Rate in Hospitalized Patients. Ann. Noninvasive Electrocardiol. 2021, 26, e12885. [Google Scholar] [CrossRef] [PubMed]
  4. Pimentel, M.A.F.; Johnson, A.E.W.; Charlton, P.H.; Birrenkott, D.; Watkinson, P.J.; Tarassenko, L.; Clifton, D.A. Toward a Robust Estimation of Respiratory Rate from Pulse Oximeters. IEEE Trans. Biomed. Eng. 2017, 64, 1914–1923. [Google Scholar] [CrossRef]
  5. Bian, D.; Mehta, P.; Selvaraj, N. Respiratory Rate Estimation Using PPG: A Deep Learning Approach. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; IEEE: New York, NY, USA, 2020; pp. 5948–5952. [Google Scholar]
  6. Chin, W.J.; Kwan, B.-H.; Lim, W.Y.; Tee, Y.K.; Darmaraju, S.; Liu, H.; Goh, C.-H. A Novel Respiratory Rate Estimation Algorithm from Photoplethysmogram Using Deep Learning Model. Diagnostics 2024, 14, 284. [Google Scholar] [CrossRef]
  7. Karlen, W.; Raman, S.; Ansermino, J.M.; Dumont, G.A. Multiparameter Respiratory Rate Estimation from the Photoplethysmogram. IEEE Trans. Biomed. Eng. 2013, 60, 1946–1953. [Google Scholar] [CrossRef]
  8. Addison, P.S.; Watson, J.N.; Mestek, M.L.; Mecca, R.S. Developing an Algorithm for Pulse Oximetry Derived Respiratory Rate (RRoxi): A Healthy Volunteer Study. J. Clin. Monit. Comput. 2012, 26, 45–51. [Google Scholar] [CrossRef]
  9. Xiao, S.; Yang, P.; Liu, L.; Zhang, Z.; Wu, J. Extraction of Respiratory Signals and Respiratory Rates from the Photoplethysmogram. In Body Area Networks. Smart IoT and Big Data for Intelligent Health; Alam, M.M., Hämäläinen, M., Mucchi, L., Niazi, I.K., Le Moullec, Y., Eds.; Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Springer International Publishing: Cham, Switzerland, 2020; Volume 330, pp. 184–198. ISBN 978-3-030-64990-6. [Google Scholar]
  10. Baker, S.; Xiang, W.; Atkinson, I. Determining Respiratory Rate from Photoplethysmogram and Electrocardiogram Signals Using Respiratory Quality Indices and Neural Networks. PLoS ONE 2021, 16, e0249843. [Google Scholar] [CrossRef]
  11. Lin, Y.; Song, X.; Zhao, Y.; Zhang, C.; Ding, X. Continuous Respiratory Rate Monitoring through Temporal Fusion of ECG and PPG Signals. PLoS ONE 2025, 20, e0325307. [Google Scholar] [CrossRef]
  12. Goldberger, A.L.; Amaral, L.A.N.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.-K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 2000, 101, e215–e220. [Google Scholar] [CrossRef]
  13. Hostrup, M.C.F.; Sofie Nielsen, A.; Sørensen, F.E.; Kragballe, J.O.; Østergaard, M.U.; Korsgaard, E.; Schmidt, S.E.; Karbing, D.S. Accelerometer-Based Estimation of Respiratory Rate Using Principal Component Analysis and Autocorrelation. Physiol. Meas. 2025, 46, 035005. [Google Scholar] [CrossRef] [PubMed]
  14. Li, C.; Lubecke, V.M.; Boric-Lubecke, O.; Lin, J. A Review on Recent Advances in Doppler Radar Sensors for Noncontact Healthcare Monitoring. IEEE Trans. Microw. Theory Tech. 2013, 61, 2046–2060. [Google Scholar] [CrossRef]
  15. Takahashi, Y.; Gu, Y.; Nakada, T.; Abe, R.; Nakaguchi, T. Estimation of Respiratory Rate from Thermography Using Respiratory Likelihood Index. Sensors 2021, 21, 4406. [Google Scholar] [CrossRef] [PubMed]
  16. Zhang, Y.; Xu, J.; Xie, M.; Wang, W.; Ye, K.; Wang, J.; Zhu, D. PPG-Based Heart Rate Estimation with Efficient Sensor Sampling and Learning Models. In Proceedings of the 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), Chengdu, China, 18–21 December 2022; IEEE: New York, NY, USA, 2023. [Google Scholar]
  17. Zhang, Z.; Pi, Z.; Liu, B. TROIKA: A General Framework for Heart Rate Monitoring Using Wrist-Type Photoplethysmographic Signals During Intensive Physical Exercise. IEEE Trans. Biomed. Eng. 2015, 62, 522–531. [Google Scholar] [CrossRef]
  18. Lampier, L.C.; Coelho, Y.L.; Caldeira, E.M.O.; Bastos-Filho, T. A Deep Learning Approach to Estimate the Respiratory Rate from Photoplethysmogram. Ingenius 2021, 27, 96–104. [Google Scholar] [CrossRef]
  19. Zhang, X.; Zhang, Y.; Si, Y.; Gao, N.; Zhang, H.; Yang, H. A High Altitude Respiration and SpO2 Dataset for Assessing the Human Response to Hypoxia. Sci. Data 2024, 11, 248. [Google Scholar] [CrossRef]
  20. Liaqat, D.; Abdalla, M.; Abed-Esfahani, P.; Gabel, M.; Son, T.; Wu, R.; Gershon, A.; Rudzicz, F.; Lara, E.D. WearBreathing: Real World Respiratory Rate Monitoring Using Smartwatches. In Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies; Association for Computing Machinery: New York, NY, USA, 2019; Volume 3, pp. 1–22. [Google Scholar] [CrossRef]
  21. Hwang, H.; Lee, K.; Lee, E.C. A Real-Time Remote Respiration Measurement Method with Improved Robustness Based on a CNN Model. Appl. Sci. 2022, 12, 11603. [Google Scholar] [CrossRef]
  22. Shah, A.; Shah, M.; Pandya, A.; Sushra, R.; Sushra, R.; Mehta, M.; Patel, K.; Patel, K. A Comprehensive Study on Skin Cancer Detection Using Artificial Neural Network (ANN) and Convolutional Neural Network (CNN). Clin. eHealth 2023, 6, 76–84. [Google Scholar] [CrossRef]
  23. Kumar, A.K.; Ritam, M.; Han, L.; Guo, S.; Chandra, R. Deep Learning for Predicting Respiratory Rate from Biosignals. Comput. Biol. Med. 2022, 144, 105338. [Google Scholar] [CrossRef]
  24. Kwasniewska, A.; Ruminski, J.; Szankin, M. Improving Accuracy of Contactless Respiratory Rate Estimation by Enhancing Thermal Sequences with Deep Neural Networks. Appl. Sci. 2019, 9, 4405. [Google Scholar] [CrossRef]
  25. Dutta, S.; Jha, S.; Sankaranarayanan, S.; Tiwari, A. Output Range Analysis for Deep Feedforward Neural Networks. In NASA Formal Methods; Dutle, A., Muñoz, C., Narkawicz, A., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 10811, pp. 121–138. ISBN 978-3-319-77934-8. [Google Scholar]
  26. Schmidhuber, J. Deep Learning in Neural Networks: An Overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [CrossRef]
  27. Zhao, Q.; Liu, F.; Song, Y.; Fan, X.; Wang, Y.; Yao, Y.; Mao, Q.; Zhao, Z. Predicting Respiratory Rate from Electrocardiogram and Photoplethysmogram Using a Transformer-Based Model. Bioengineering 2023, 10, 1024. [Google Scholar] [CrossRef]
  28. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: New York, NY, USA, 2016; pp. 770–778. [Google Scholar]
  29. Ravichandran, V.; Murugesan, B.; Balakarthikeyan, V.; Shankaranarayana, S.M.; Ram, K.; Sp, P.; Joseph, J.; Sivaprakasam, M. RespNet: A Deep Learning Model for Extraction of Respiration from Photoplethysmogram. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; IEEE: New York, NY, USA, 2019. [Google Scholar]
  30. Samavati, T.; Farvardin, M.; Ghaffari, A. Efficient Deep Learning-Based Estimation of the Vital Signs on Smartphones. arXiv 2022, arXiv:2204.08989. [Google Scholar]
  31. Mehrabadi, M.A.; Aqajari, S.A.H.; Zargari, A.H.A.; Dutt, N.; Rahmani, A.M. Novel Blood Pressure Waveform Reconstruction from Photoplethysmography Using Cycle Generative Adversarial Networks. In Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, UK, 11–15 July 2022; IEEE: New York, NY, USA, 2022; pp. 1906–1909. [Google Scholar]
  32. Osathitporn, P.; Sawadwuthikul, G.; Thuwajit, P.; Ueafuea, K.; Mateepithaktham, T.; Kunaseth, N.; Choksatchawathi, T.; Punyabukkana, P.; Mignot, E.; Wilaiprasitporn, T. RRWaveNet: A Compact End-to-End Multi-Scale Residual CNN for Robust PPG Respiratory Rate Estimation. IEEE Internet Things J. 2023, 10, 15943–15952. [Google Scholar] [CrossRef]
  33. Mahmud, T.I.; Imran, S.A.; Shahnaz, C. Res-SE-ConvNet: A Deep Neural Network for Hypoxemia Severity Prediction for Hospital In-Patients Using Photoplethysmograph Signal. IEEE J. Transl. Eng. Health Med. 2022, 10, 1–9. [Google Scholar] [CrossRef] [PubMed]
  34. Chen, H.; Zhang, X.; Guo, Z.; Ying, N.; Yang, M.; Guo, C. ACTNet: Attention Based CNN and Transformer Network for Respiratory Rate Estimation. Biomed. Signal Process. Control 2024, 96, 106497. [Google Scholar] [CrossRef]
  35. Heikenfeld, J.; Jajack, A.; Rogers, J.; Gutruf, P.; Tian, L.; Pan, T.; Li, R.; Khine, M.; Kim, J.; Wang, J.; et al. Wearable Sensors: Modalities, Challenges, and Prospects. Lab A Chip 2018, 18, 217–248. [Google Scholar] [CrossRef]
  36. Yang, G.; Kang, Y.; Charlton, P.H.; Kyriacou, P.A.; Kim, K.K.; Li, L.; Park, C. Energy-Efficient PPG-Based Respiratory Rate Estimation Using Spiking Neural Networks. Sensors 2024, 24, 3980. [Google Scholar] [CrossRef]
  37. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-Informed Machine Learning. Nat. Rev. Phys. 2021, 3, 422–440. [Google Scholar] [CrossRef]
  38. Huang, X. Predictive Models: Regression, Decision Trees, and Clustering. Appl. Comput. Eng. 2024, 79, 124–133. [Google Scholar] [CrossRef]
  39. Joshi, A.; Guevara, D.; Earles, M. Standardizing and Centralizing Datasets for Efficient Training of Agricultural Deep Learning Models. Plant Phenomics 2023, 5, 0084. [Google Scholar] [CrossRef] [PubMed]
  40. Graves, A.; Mohamed, A.; Hinton, G. Speech Recognition with Deep Recurrent Neural Networks. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; IEEE: New York, NY, USA, 2013; pp. 6645–6649. [Google Scholar]
  41. Schuster, M.; Paliwal, K.K. Bidirectional Recurrent Neural Networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. [Google Scholar] [CrossRef]
  42. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  43. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  44. Gupta, T.K.; Raza, K. Optimizing Deep Feedforward Neural Network Architecture: A Tabu Search Based Approach. Neural Process. Lett. 2020, 51, 2855–2870. [Google Scholar] [CrossRef]
  45. Rengasamy, D.; Jafari, M.; Rothwell, B.; Chen, X.; Figueredo, G.P. Deep Learning with Dynamically Weighted Loss Function for Sensor-Based Prognostics and Health Management. Sensors 2020, 20, 723. [Google Scholar] [CrossRef]
  46. Liu, B.; Dai, X.; Gong, H.; Guo, Z.; Liu, N.; Wang, X.; Liu, M. Deep Learning versus Professional Healthcare Equipment: A Fine-Grained Breathing Rate Monitoring Model. Mob. Inf. Syst. 2018, 2018, 5214067. [Google Scholar] [CrossRef]
  47. Pekel, E.; Kara, S. A Comprehensive Review for Artificial Neural Network Application to Public Transportation. Sigma J. Eng. Nat. Sci. 2017, 35, 157–179. [Google Scholar]
  48. Yang, C.; Zhai, J.; Tao, G. Deep Learning for Price Movement Prediction Using Convolutional Neural Network and Long Short-Term Memory. Math. Probl. Eng. 2020, 2020, 2746845. [Google Scholar] [CrossRef]
  49. Chandra, R.; Goyal, S.; Gupta, R. Evaluation of Deep Learning Models for Multi-Step Ahead Time Series Prediction. IEEE Access 2021, 9, 83105–83123. [Google Scholar] [CrossRef]
  50. Su, T.; Sun, H.; Zhu, J.; Wang, S.; Li, Y. BAT: Deep Learning Methods on Network Intrusion Detection Using NSL-KDD Dataset. IEEE Access 2020, 8, 29575–29585. [Google Scholar] [CrossRef]
  51. Laganà, F.; Faccì, A.R. Parametric Optimisation of a Pulmonary Ventilator Using the Taguchi Method. J. Electr. Eng. 2025, 76, 265–274. [Google Scholar] [CrossRef]
  52. Charlton, P.H.; Bonnici, T.; Tarassenko, L.; Clifton, D.A.; Beale, R.; Watkinson, P.J. An Assessment of Algorithms to Estimate Respiratory Rate from the Electrocardiogram and Photoplethysmogram. Physiol. Meas. 2016, 37, 610–626. [Google Scholar] [CrossRef]
Figure 1. Workflow model for prediction of respiration rate.
Figure 2. Deep Feedforward Neural Network structure. Adapted from [45].
Figure 3. Recurrent Neural Network structure. Adapted from [47].
Figure 4. Long Short-Term Memory structure. Adapted from [48].
Figure 5. Bi-directional Long Short-Term Memory structure. Adapted from [50].
Figure 6. Prediction performance of the DL models.
Figure 7. Bland–Altman plots comparing predicted and reference respiratory rates for all evaluated DL models.
Table 1. Best model with tuned hyperparameters.

| Model Name | Optimizer (Learning Rate) | Dense Layers (Units) | RNN Layers (Units) | LSTM Layers (Units) | Activation Function | Batch Size | Kernel Initialization | Kernel Regularization | Dropout |
|---|---|---|---|---|---|---|---|---|---|
| DFNN | Adam (0.001) | 3 (1024, 512, 512) | — | — | ReLU (Dense), Linear (Output) | 200 | Mean = 0.0, Std = 0.02 | L2 = 1 × 10⁻⁸ | 0.05 |
| Simple RNN | Adam (0.001) | 1 (256) | 3 (256, 256, 256) | — | ReLU (Dense & RNN), Linear (Output) | 100 | Mean = 0.0, Std = 0.02 | L2 = 1 × 10⁻⁷ | 0.05 |
| Bi-Simple RNN | Adam (0.001) | 1 (512) | 3 (256, 256, 256) | — | ReLU (Dense & RNN), Linear (Output) | 200 | Mean = 0.0, Std = 0.02 | L2 = 1 × 10⁻⁷ | 0.05 |
| LSTM | Adam (0.001) | 1 (512) | — | 3 (512, 512, 512) | ReLU (Dense & LSTM), Linear (Output) | 200 | Mean = 0.0, Std = 0.02 | L2 = 1 × 10⁻⁸ | 0.05 |
| Bi-LSTM | Adam (0.001) | 1 (256) | — | 3 (1024, 512, 512) | ReLU (Dense & LSTM), Linear (Output) | — | Mean = 0.0, Std = 0.02 | L2 = 1 × 10⁻⁸ | 0.05 |
Table 2. Results of Deep Learning techniques in numeric data.

| Approach | Method | MAE | RMSE | R² | Runtime (s) |
|---|---|---|---|---|---|
| Real Data | DFNN | 0.659 | 1.177 | 0.877 | 654.85 |
| | RNN | 0.691 | 1.200 | 0.872 | 992.61 |
| | Bi-RNN | 0.668 | 1.158 | 0.881 | 1198.84 |
| | LSTM | 0.521 | 1.074 | 0.898 | 7422.66 |
| | Bi-LSTM | 0.545 | 1.031 | 0.906 | 23,040.05 |

Share and Cite

Hasan, S.M.; Raj, M.G.S.; Mitra, K. Comparative Evaluation of Deep Learning Models for Respiratory Rate Estimation Using PPG-Derived Numerical Features. Electronics 2026, 15, 1108. https://doi.org/10.3390/electronics15051108
