Highlights
What are the main findings?
- A seven-dimensional data preprocessing technique based on physiological signal priors enhances cross-dataset prediction accuracy. Expanding the raw one-dimensional signal into seven channels (raw signal, amplitude, pulse width, rise time, fall time, first derivative, second derivative) significantly improves waveform prediction performance. For example, on the CHARIS dataset, CBAnet reduced RMSE and MAE by approximately 45% and 50%, respectively, compared to the second-best model, while R2 improved by around 39%.
- CBAnet achieves inter-waveform prediction by unifying local morphology and long-range dependencies. This design achieves the best overall RMSE/MAE/R2 metrics among BiLSTM, CNN-LSTM, Transformer, and Wave-U-Net, with significantly superior peak/phase fidelity and temporal consistency compared to other models. In training on the CHARIS dataset, CBAnet achieved an RMSE of 0.4903 and an R2 of 0.8451, surpassing the performance of other models.
What are the implications of the main findings?
- Advancing real-time non-invasive monitoring. CBAnet’s moderate parameter count and efficient inference enable continuous bedside/edge deployment as a viable solution for intracranial pressure/blood pressure waveform estimation.
- Cross-dataset generalization and clinical utility. CBAnet achieved consistent improvements across the GBIT-ABP, CHARIS, and PPG-HAF datasets.
Abstract
The real-time, precise monitoring of physiological signals such as intracranial pressure (ICP) and arterial blood pressure (BP) holds significant clinical importance. However, traditional methods such as invasive ICP monitoring and invasive arterial blood pressure measurement present challenges including complex procedures, high infection risks, and difficulties in continuous measurement. Consequently, learning-based prediction from observable signals (e.g., BP/pulse waves) has emerged as a crucial alternative approach. Existing models struggle to simultaneously capture multi-scale local features and long-range temporal dependencies, while their computational complexity remains prohibitively high for real-time clinical demands. To address these issues, this paper proposes a physiological signal prediction method that combines composite feature preprocessing with multiscale modeling. First, a seven-dimensional feature matrix is constructed based on physiological prior knowledge to enhance feature discriminative power and mitigate phase mismatch issues. Second, a CNN-LSTM-Attention network architecture (CBAnet) integrating multiscale convolutions, long short-term memory (LSTM), and attention mechanisms is designed to effectively capture both local waveform details and long-range temporal dependencies, thereby improving waveform prediction accuracy and temporal consistency. Experiments on GBIT-ABP, CHARIS, and our self-built PPG-HAF dataset show that CBAnet achieves competitive performance relative to bidirectional long short-term memory (BiLSTM), convolutional neural network-long short-term memory (CNN-LSTM), Transformer, and Wave-U-Net baselines across Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R2). This study provides a promising, efficient approach for non-invasive, continuous physiological parameter prediction.
1. Introduction
Real-time monitoring of physiological signals holds significant clinical importance. Through real-time monitoring systems, nursing staff or clinicians can continuously track patient conditions even when located in different rooms or floors, enabling swift intervention when conditions deteriorate. In intracranial pressure (ICP) management, continuous remote observation and timely alerts are particularly critical for preventing secondary brain injury. Recent studies indicate that wearable and environmental sensing technologies enable stable cross-spatial tracking, providing a vital technical foundation for clinical telemetry systems [1,2,3]. However, due to technical limitations of traditional methods, continuous and non-invasive measurement of certain critical vital signs remains challenging. Kamanditya et al. noted that patient motion artifacts significantly compromise the accuracy of blood pressure (BP) readings, particularly in settings requiring continuous monitoring such as intensive care units. Even minimal movement during measurement can introduce errors in blood pressure estimates when conventional methods are used. These motion-induced inaccuracies are especially pronounced in intensive care or prolonged monitoring environments, where patient activity is unavoidable and traditional methods often fail to account for such interference [4,5,6,7]. The gold standard for ICP monitoring involves inserting a pressure catheter into the ventricles. This invasive procedure carries high risks, is time-consuming, and may result in infection rates of 5–14% and bleeding rates of 5–7% [8,9,10,11]. Photoplethysmography (PPG) has been widely adopted as a non-invasive alternative for monitoring blood circulation. However, PPG waveforms obtained from different body locations exhibit variations in amplitude and morphology. For instance, fingertip PPG signals offer the highest quality, whereas signals from areas such as the forehead demonstrate smaller amplitudes and slightly different waveforms [12,13,14]. These factors all increase the difficulty of accurately predicting physiological signals.
In recent years, deep learning has provided new approaches to address the aforementioned challenges by predicting signals that are difficult to obtain directly through traditional methods, using readily accessible physiological indicators. Convolutional neural networks (CNNs) excel at extracting local features, making them well-suited for processing detailed patterns in waveforms such as PPG [15]. Long Short-Term Memory (LSTM) networks excel at capturing temporal dependencies and demonstrate outstanding performance in processing time-series signals such as heartbeat sequences and blood pressure fluctuations [16,17]. Attention mechanisms can highlight key features within vast amounts of information, enhancing the model’s focus on crucial temporal segments [18]. Transformers, built on self-attention, are well suited to extracting long-range spatiotemporal dependencies [19]. Wave-U-Net bypasses traditional time-frequency transformations by directly processing raw time-domain waveforms, thereby better preserving signal details and phase information, which makes it well suited to end-to-end waveform generation [20]. Many researchers have conducted studies on this topic, such as using optical PPG signals to estimate continuous blood pressure [21,22]. Alternatively, predicting ICP from BP measurements enables early warning of intracranial hypertension [11,23].
However, these methods currently have several limitations: many existing studies focus on predicting blood pressure values rather than actual waveforms, thereby losing the temporal characteristics of physiological signals. Furthermore, physiological signals vary even for the same event across different individuals or during different physiological stages of the same individual, making accurate prediction extremely challenging. Therefore, designing a high-precision fusion architecture to address the aforementioned issues is crucial. Research in many fields has also begun to employ multiple neural networks in parallel [24,25].
Based on the aforementioned motivations, this paper proposes the physiological signal prediction model CBAnet. Compared with existing methods, the innovations of this work are primarily reflected in the following three aspects. First, unlike traditional methods that rely solely on raw PPG waveforms, we introduce a composite feature construction method based on physiological signal priors, extracting more discriminative input representations from multiple dimensions such as waveform morphology and trend changes. Second, overcoming the limitations of single-architecture approaches, CBAnet integrates multi-scale convolutions, temporal modeling, and attention mechanisms to achieve a more refined characterization of waveform dynamics. Additionally, addressing the lack of concurrently collected head and fingertip PPG signals in existing public datasets, we supplement our analysis with the self-built dataset PPG-HAF. The main contributions are as follows:
- (1)
- We propose a data preprocessing method based on physiological signal priors. By integrating physiological signal features, it transforms one-dimensional signals into a 7-dimensional feature matrix, effectively enhancing training efficiency and accuracy. For instance, on the CHARIS dataset, CBAnet’s RMSE and MAE were reduced by approximately 45% and 50%, respectively, whilst the R2 improved by around 39%.
- (2)
- We design the high-precision fusion framework CBAnet, which unifies multi-scale local features with long-range dependency modeling. Attention mechanisms enable global reweighting and alignment, reducing errors in non-autoregressive end-to-end training while improving waveform morphology and phase fidelity. On the CHARIS dataset, CBAnet achieved an RMSE of 0.4903 and an R2 of 0.8451, outperforming the other models.
- (3)
- We establish a real-time PPG acquisition system, alongside the head-finger synchronized dataset PPG-HAF, forming a complete acquisition-denoising-output workflow that provides high-quality data support for model training and evaluation.
The remainder of this paper is organized as follows. In Section 2, a brief review of relevant prior work on the methodology is provided. In Section 3, the materials and datasets employed in this study are introduced. In Section 4, the proposed methodology is elaborated on in detail. In Section 5, the experimental setup and results are presented, followed by an in-depth discussion in Section 6. Finally, Section 7 concludes the entire paper and highlights potential directions for future research.
2. Related Works
In recent years, deep learning techniques have driven rapid advances in physiological signal processing and medical prediction tasks. This section reviews deep learning-based research on physiological signal prediction, focusing on the evolution of model architectures from early single networks to complex hybrid models, whilst analyzing their contributions and limitations.
The inception of deep learning applications in physiological signal prediction can be traced back to researchers transferring established computer vision and natural language processing models to the medical domain. Early work aimed to utilize deep learning for the automatic extraction of features from raw signals such as PPG and electrocardiography (ECG), thereby replacing laborious manual feature engineering reliant on expert knowledge. For instance, Schlesinger et al. utilized the MIMIC-III database to explore methods combining CNNs with Siamese networks, offering novel approaches for achieving individualized calibration. These pioneering studies demonstrate the immense potential of data-driven methods in physiological signal modelling [26]. Early research predominantly focused on exploring the efficacy of single, mature deep learning architectures in predicting physiological signals. CNNs were widely employed owing to their robust local feature extraction capabilities. Wang et al. proposed an end-to-end framework combining a CNN with a gated recurrent unit (GRU), which takes a single normalized PPG waveform as input and can simultaneously model local features and temporal dependencies [27]. Wang and Ji further demonstrated the potential of pure CNN models in blood pressure prediction, with the advantage of eliminating cumbersome manual feature extraction. However, the drawbacks of such pure CNN models lie in their reliance solely on local convolutions, which inadequately capture long-range dependencies in the signal, and their requirement for large-scale data to ensure generalization capability [28].
To more effectively model the temporal dependencies of physiological signals, Long Short-Term Memory (LSTM) networks and their variants have been introduced. Zhang et al. proposed a method employing bidirectional LSTMs (BiLSTM) combined with demographic information, demonstrating good accuracy across different cardiovascular disease cohorts [29]. However, the generalizability of this approach may be constrained by dataset scale and diversity. The U-Net architecture, with its encoder–decoder structure and skip connections, has demonstrated strong performance in signal waveform reconstruction tasks. Athaya and Choi pioneered the application of U-Net for non-invasive arterial blood pressure waveform estimation, demonstrating accuracy comparable to conventional cuff-based methods [30]. A limitation lies in the model’s potentially poor adaptability across different patients, with constrained robustness in complex clinical scenarios such as acute condition changes. Subsequent research, such as the Wave-U-Net model proposed by Lei et al., has successfully been employed to reconstruct intracranial pressure waveforms from arterial blood pressure signals, validating the architecture’s advantages in integrating multi-scale contextual information with local details [31]. In recent years, Transformer models have garnered attention for their robust global context capture capabilities. Ma et al. employed a Transformer architecture combined with knowledge distillation techniques, achieving excellent results in cuffless blood pressure estimation [32]. Ma et al. further innovated by employing a Transformer-based transfer learning approach, demonstrating robust performance across diverse datasets [33]. However, Transformer-based methods typically rely on substantial training datasets, and their transfer learning strategies impose stringent requirements on data quality and features, rendering their application to small-sample datasets challenging.
To overcome the limitations of single models, researchers have turned to hybrid models that combine the strengths of multiple networks and explore multimodal signal fusion. The fusion of CNNs and RNNs represents one of the most common and effective hybrid strategies, with CNNs responsible for spatial/morphological feature extraction and RNNs (such as LSTMs or GRUs) handling temporal modelling. Panwar et al. designed long-term recurrent convolutional networks for multi-task prediction [34]. Jeong and Lim employed CNN-LSTM for multi-task learning [35]. Zhang et al. achieved high standards in blood pressure estimation using a CNN-LSTM model, demonstrating excellent accuracy [36]. A drawback of such models lies in their substantial computational resource and data volume requirements. The BP-CRNN (Convolutional Recurrent Neural Network) proposed by Leitner et al. and the dual-stream CNN-LSTM architecture introduced by Shaikh and Forouzanfar further validate the hybrid paradigm’s advantages in feature extraction completeness and sequence modelling capability [37,38]. Attention mechanisms were introduced to enhance the model’s ability to focus on critical signal segments. El-Hajj and Kyriacou constructed a model incorporating bidirectional recurrent units and attention layers, significantly enhancing the stability and robustness of predictions [39]. Aguirre et al. proposed a framework based on a sequence-to-sequence architecture integrating attention mechanisms, directly converting PPG signals into blood pressure waveforms [40]. Xiang et al. proposed the McBP-Net model (based on CNN-LSTM) for multimodal signal fusion, demonstrating remarkable performance in dynamic blood pressure estimation [41]. However, its reliance on multiple sensors increases equipment costs and operational complexity. Tian et al. proposed the Parallel Convolutional and Transformer Network (PCTN), which employs dual branches to extract local and global features, respectively [42]. Their attention fusion module significantly enhances the capture of complex blood pressure fluctuation patterns. Zhu et al. introduced a multidimensional Transformer-LSTM-GRU fusion model for epilepsy prediction, offering novel insights for physiological signal forecasting [19]. The 1-D SENet-LSTM approach proposed by Gengjia Zhang et al. further enhances prediction accuracy by strengthening feature learning through a channel attention mechanism [43]. Research has also explored integrating deep learning with traditional machine learning approaches. Rastegar et al. proposed a hybrid model combining CNNs and Support Vector Regression (SVR), capable of automatically extracting features from ECG and PPG signals to achieve high-precision predictions. However, its computational complexity may limit real-time applications [44].
In summary, research into deep learning-based physiological signal prediction has evolved from the application of single models to complex, sophisticated hybrid architectures. As highlighted in reviews by Le et al. and Pilz et al., integrating physiological prior knowledge with deep learning represents a crucial future direction [45,46]. Although existing models have achieved remarkable results on specific datasets, they still face common challenges. These include the models’ generalization capabilities across different populations (such as the issues of signal quality and individual variation highlighted by Mejía-Mejía et al.), and their dependence on computational resources. These shortcomings represent the key points this study aims to address and improve [47].
3. Materials
This paper utilizes one self-built dataset and two public datasets as experimental data. The following sections will provide detailed descriptions of the datasets and experimental settings.
3.1. Self-Built Dataset
To capture a broader range of physiological signals, this paper designed a real-time PPG monitoring system that simultaneously acquires PPG signals from both the head and fingertip, establishing a dataset (PPG HAF) containing synchronized PPG signals from these two locations. The system principle is illustrated in Figure 1. The device comprises two independent photodetection units, fixed to the head and fingertip, respectively. Each detection unit incorporates an 850 nm central wavelength light-emitting diode (LED) light source and an avalanche photodiode (APD) detector, enabling non-invasive, continuous, and synchronous acquisition of PPG waveforms from both the head and fingertip. Signal acquisition is performed using a Keysight high-speed oscilloscope (Keysight, Santa Rosa, CA, USA) at a sampling frequency of 100,000 Hz.
Figure 1.
Signal Acquisition Schematic Diagram and Experimental Setup.
Participants were healthy adult volunteers with no history of cardiovascular disease who had not recently consumed medications or alcohol that could affect physiological signals. All subjects received detailed explanations of the experimental procedures and signed informed consent forms. Prior to the experiment, participants rested quietly in comfortable chairs for at least 10 min to stabilize resting heart rate and circulatory status. Each experimental recording lasted approximately 20 s to obtain sufficient signal duration for subsequent analysis. Specifically, this study included 8 subjects; detailed demographic information for all participants is provided in Table A1. All experiments were conducted in a temperature-controlled dark room to ensure environmental consistency.
After the experiment, the raw data underwent rigorous noise-reduction filtering preprocessing comprising three steps. First, a cubic spline interpolation method was employed to eliminate abrupt noise spikes caused by body movement or poor device contact. Second, morphological operations effectively removed baseline drift from the signals. Finally, a Butterworth filter (passband range approximately 0.5–10 Hz) was applied to suppress high-frequency electronic noise interference. Figure 2 shows a comparison of the raw signal before and after noise filtering. These preprocessing measures significantly improved the signal-to-noise ratio of the PPG signal, providing reliable data support for subsequent deep learning model training.
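For illustration, a minimal Python sketch of this three-step pipeline is given below. The SciPy-based functions are real, but the spike criterion, the one-second opening window, and the filter order are assumptions for demonstration rather than the exact settings used in this study.

```python
# Sketch of the three-step denoising described above for a PPG trace
# sampled at fs Hz; thresholds and window lengths are illustrative.
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.ndimage import grey_opening
from scipy.signal import butter, sosfiltfilt

def denoise_ppg(sig, fs, spike_z=5.0, baseline_win_s=1.0, band=(0.5, 10.0)):
    sig = np.asarray(sig, dtype=float)
    t = np.arange(sig.size)

    # 1) Spike removal: flag outliers on the first difference and
    #    re-estimate them with a cubic spline over the clean samples.
    d = np.diff(sig, prepend=sig[0])
    bad = np.abs(d - d.mean()) > spike_z * d.std()
    if bad.any() and (~bad).sum() > 3:
        sig = CubicSpline(t[~bad], sig[~bad])(t)

    # 2) Morphological opening estimates the slowly varying baseline,
    #    which is subtracted to correct drift (window > one cardiac cycle).
    baseline = grey_opening(sig, size=int(baseline_win_s * fs))
    sig = sig - baseline

    # 3) Zero-phase Butterworth band-pass (~0.5-10 Hz) suppresses
    #    high-frequency electronic noise.
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, sig)
```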
Figure 2.
Comparison of the Signal Before and After Noise Filtering.
3.2. Public Data
To validate the model’s effectiveness, in addition to the self-built PPG HAF dataset, this paper also selected two public datasets. The first public dataset originates from an open database [48] (GBIT-ABP) that enables continuous cuffless monitoring of arterial blood pressure via a graphene bioimpedance electronic tattoo. This dataset comprises raw time series of four-channel bioimpedance signals, along with corresponding BP (sampled at 200 Hz) and PPG (sampled at 75 Hz) signals. This paper selected simultaneously recorded fingertip PPG and continuous blood pressure signals from this dataset to train and evaluate the model’s predictive capability for blood pressure waveforms.
The second set of publicly available data originates from the CHARIS database [49], which collects multi-channel physiological signal recordings from multiple patients diagnosed with traumatic brain injury (TBI), including ECG, blood pressure, and intracranial pressure. The acquisition system recorded signals from each channel at a sampling rate of 50 Hz, with high-frequency interference above 25 Hz filtered out. This study utilizes the simultaneously recorded BP and ICP waveform data from this database to evaluate the model’s ability to predict changes in intracranial pressure from arterial blood pressure waveforms.
3.3. Experimental Setup
All models were trained and tested under a unified protocol on the same datasets. The final 10,000 data points were reserved as the validation set, with the remaining data split into training and testing sets in an 8:2 ratio. Training samples were constructed using a sliding window approach with a window size of 700, a stride of 100, and a batch size of 32. This selects shorter time windows from longer sequences for end-to-end real-time prediction; such sliding-window processing effectively captures local signal features, ensuring the model can predict physiological data in real time over brief intervals in practical applications. The input comprises preprocessed seven-channel physiological signals, and the output is a one-dimensional physiological signal sequence matching the input length. The model employs Mean Squared Error loss (MSELoss) as its objective function and the AdamW optimizer, with each dataset trained for 100 epochs. All experiments were implemented in the PyTorch 2.6.0 framework on an NVIDIA RTX 4070 Ti GPU.
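A minimal PyTorch sketch of this setup is shown below. The window length, stride, batch size, loss, optimizer, and epoch count follow the settings above, while the model argument and learning rate are placeholders (any network with matching input/output shapes).

```python
# Sliding-window dataset and training loop matching the setup in Sec. 3.3;
# the concrete model and learning rate are assumed for illustration.
import torch
from torch.utils.data import Dataset, DataLoader

class SlidingWindowSet(Dataset):
    """Cuts aligned (7-channel input, 1-channel target) series into windows."""
    def __init__(self, x, y, window=700, stride=100):
        # x: (L, 7) preprocessed features, y: (L,) target waveform
        self.x = torch.as_tensor(x).float()
        self.y = torch.as_tensor(y).float()
        self.starts = range(0, len(self.y) - window + 1, stride)
        self.window = window

    def __len__(self):
        return len(self.starts)

    def __getitem__(self, i):
        s = self.starts[i]
        return self.x[s:s + self.window], self.y[s:s + self.window]

def train(model, x, y, epochs=100, device="cuda"):
    loader = DataLoader(SlidingWindowSet(x, y), batch_size=32, shuffle=True)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()
    model.to(device).train()
    for _ in range(epochs):
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)   # (B, 700) vs (B, 700)
            loss.backward()
            opt.step()
    return model
```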
4. Methods
Generally speaking, physiological signals exhibit pronounced continuity: their values are influenced not only by the current moment but also by coupling across multiple preceding and subsequent cardiac cycles, alongside low-frequency modulations such as respiration and autonomic nervous activity. Consequently, deep learning prediction faces significant challenges including strong nonlinearity, cross-periodic dependencies, inter-individual variability, and variable time lags (phase mismatch). We must simultaneously account for the relationship between local features, such as rising edges and dicrotic notches, and the overall low-frequency background, making accurate physiological signal prediction highly challenging.
To address the aforementioned challenges, this paper proposes CBAnet, an end-to-end prediction framework that integrates composite feature preprocessing with multi-scale modeling. The overall framework, illustrated in Figure 3, comprises three components: data preprocessing, deep temporal modeling, and result output.
Figure 3.
CBAnet Overall algorithm flow diagram.
First, data preprocessing is performed on the input side by introducing a 7-dimensional composite feature set based on physiological signal characteristics (raw waveform, amplitude, pulse width, rise time, fall time, first derivative, second derivative). This approach preserves the original information while incorporating prior features to reduce learning complexity and enhance computational efficiency.
Subsequently, one-dimensional convolutions are employed to extract multi-scale local features while suppressing noise. Bidirectional long short-term memory networks are introduced to model long-term dependencies and sequential biases. Combined with multi-head self-attention, this approach redistributes information across positions and performs adaptive alignment throughout the entire time window. This highlights key segments and mitigates misalignment, balancing amplitude accuracy with dynamic consistency while enhancing robustness.
Finally, data integration and denormalization mapping achieve high-precision long-term regression prediction.
This design advances feature learning by incorporating prior features, thereby reducing parameter complexity and optimization difficulty. Concurrently, the synergy between multi-scale local feature extraction and long-term temporal modelling, coupled with the dynamic alignment mechanism of attention, significantly enhances prediction accuracy for complex nonlinear time series regression tasks.
4.1. Overview
Specifically, we denote the raw one-dimensional physiological signal as $x \in \mathbb{R}^{L}$. First, we perform data preprocessing on the raw signal to obtain the aligned seven-channel feature matrix $F \in \mathbb{R}^{L \times 7}$. We then slice $F$ into samples based on window length $T$ and stride $S$, yielding $\{X_i\}_{i=1}^{N}$ with $X_i = F[(i-1)S + 1 : (i-1)S + T,\, :] \in \mathbb{R}^{T \times 7}$.
Inputting the processed sample $X_i$ into the network, our architecture comprises three sub-networks: the local time-domain feature subnet $f_{\mathrm{LFE}}$, the long-term dependency representation subnet $f_{\mathrm{TRN}}$, and the global reweighting and alignment subnet $f_{\mathrm{GRN}}$. Building upon this foundation, the attention-reweighted representation is mapped to a prediction sequence through the time-step-by-step linear mapping module $g$, forming the overall inference architecture of the network: $\hat{y}_i = \left(g \circ f_{\mathrm{GRN}} \circ f_{\mathrm{TRN}} \circ f_{\mathrm{LFE}}\right)(X_i)$.
Here, $\circ$ denotes function composition, and $\hat{y}_i \in \mathbb{R}^{T}$ represents the predicted physiological signal sequence, matching the input in length. By combining these three subnetworks, we achieve progressive modeling through a “local-global-alignment” approach: the convolutional layer focuses on fine-grained features such as rising edges and dicrotic notches; the bidirectional long short-term memory module provides sequential and long-term memory priors; and the self-attention mechanism retrieves the most relevant evidence fragments globally at each time step. Experiments demonstrate that our composite architecture effectively mitigates individual variations, noise interference, and temporal mismatches.
During the training phase, mean squared error is employed as the primary loss function for end-to-end optimization of the network parameters $\theta$: $\mathcal{L}_{\mathrm{MSE}}(\theta) = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\left(y_{i,t} - \hat{y}_{i,t}\right)^{2}$.
Here, $y_i$ denotes the actual physiological signal sequence to be predicted, while $f_{\theta}$ represents the model with parameters $\theta$. This objective balances local morphological fidelity with global trend consistency whilst maintaining controllable parameterization and training stability, thereby providing a structural foundation for subsequent analysis.
We present more details of our CBAnet in Algorithm 1:
| Algorithm 1: CBAnet |
| Step 1: Train the network. 1: Data preprocessing: expand the raw signal into the seven-channel feature matrix and slice it into windows of length $T$ with stride $S$. 2: Forward pass through the LFE, TRN, and GRN subnets and the linear readout. 3: Compute the MSE loss and update the parameters $\theta$ with AdamW. 4: Repeat until convergence. Step 2: Process the prediction results. 5: Obtain the prediction result for each window. 6: Read out linearly along the stride to obtain the complete prediction matching the original input signal length. 7: Apply inverse normalization and output the evaluation metrics (RMSE/MAE/R2). Output: 1-dimensional prediction results with the same length as the original input. |
4.2. Data Preprocessing
Previous studies have demonstrated significant temporal lags and phase shifts between different physiological signals [50], with substantial variations in heart rate and phase differences across subjects. This readily introduces inter-period mismatches [51,52,53] and generates annotation noise [54,55]. Therefore, to enhance model performance, preprocessing of the raw signal sequence based on cardiac cycle features is required prior to training. This aims to simultaneously preserve both periodic structure and transient dynamic information, thereby improving training speed and accuracy.
Abay et al. employed features such as PPG signal amplitude, pulse width, and rise/fall times to predict intracranial pressure [56]. This demonstrates that these characteristics are closely correlated with intracranial pressure, hence we have selected the same features. Considering that these features reflect periodic variations in the time domain, we further introduced first-order and second-order derivatives as additional feature values. First, the entire sequence was segmented according to cardiac cycles. Four key temporal parameters were extracted from each cycle: amplitude (peak-to-trough, reflecting pulsatile intensity), pulse width (interval between adjacent peaks/troughs, reflecting cycle length), rise time (trough to peak, reflecting contraction dynamics), and fall time (peak to next trough, reflecting diastolic decay). Additionally, we computed first- and second-order derivatives at each sampling point. The first derivative characterizes the rate of change in the signal, while the second derivative captures curvature information, highlighting points of abrupt change. These parameters effectively characterize key features of the periodic motion inherent in physiological signals. The overall workflow is illustrated in Figure 4.
Figure 4.
Data Preprocessing Flowchart.
These feature values are expanded to match the sampling length of the original signal, forming a point-by-point aligned feature matrix alongside the original signal as input for downstream models. Specifically, the trough and peak positions of the $i$-th cycle of signal $x$ are denoted as $t_i$ and $p_i$, respectively, while the boundary point of the $i$-th cycle is denoted as $b_i$. The specific feature values are shown in Table 1 below:
Table 1.
Features obtained from preprocessing.
This preprocessing not only precisely characterizes the periodicity and dynamic features of signals along the time axis but also preserves information about both macro-level rhythms (periodic structures) and micro-level morphology (instantaneous derivatives). This enables the model to better capture the intrinsic patterns of signals during physiological signal prediction, facilitating stable convergence and improved generalization in subsequent learning, thereby achieving more accurate prediction results.
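The sketch below illustrates one possible construction of the seven-channel feature matrix. The peak/trough detection parameters are illustrative assumptions, and the per-cycle scalars are broadcast over each cycle so that every channel matches the original sampling length.

```python
# Illustrative construction of the 7-channel feature matrix (raw signal,
# amplitude, pulse width, rise time, fall time, 1st and 2nd derivatives).
import numpy as np
from scipy.signal import find_peaks

def build_feature_matrix(sig, fs):
    sig = np.asarray(sig, dtype=float)
    peaks, _ = find_peaks(sig, distance=int(0.4 * fs))     # systolic peaks
    troughs, _ = find_peaks(-sig, distance=int(0.4 * fs))  # cycle boundaries

    amp = np.zeros_like(sig)      # peak-to-trough amplitude
    width = np.zeros_like(sig)    # cycle length (s)
    rise = np.zeros_like(sig)     # trough -> peak time (s)
    fall = np.zeros_like(sig)     # peak -> next trough time (s)

    for i in range(len(troughs) - 1):
        a, b = troughs[i], troughs[i + 1]          # one cardiac cycle
        in_cycle = peaks[(peaks > a) & (peaks < b)]
        if in_cycle.size == 0:
            continue
        p = in_cycle[0]
        amp[a:b] = sig[p] - sig[a]
        width[a:b] = (b - a) / fs
        rise[a:b] = (p - a) / fs
        fall[a:b] = (b - p) / fs

    d1 = np.gradient(sig)         # first derivative (rate of change)
    d2 = np.gradient(d1)          # second derivative (curvature)
    return np.stack([sig, amp, width, rise, fall, d1, d2], axis=1)  # (L, 7)
```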
4.3. Deep Temporal Modeling Network
The schematic diagram of the deep temporal modeling network is shown in Figure 5; it is primarily divided into the following sections: the local time-domain feature subnet (LFE), the long-term dependency representation subnet (TRN), and the global reweighting and alignment subnet (GRN). The LFE aims to stably extract low-level features characterizing waveform geometric structures without sacrificing temporal resolution. To this end, this paper designs the two-layer lightweight feature extraction network LFE. The first layer can be understood as a combined approximation of the first-order local difference and smoothing of the input. The three-point neighborhood suppresses high-frequency random noise while preserving the local gradient information of rising/falling edges, as shown below: $h^{(1)}_t = \mathrm{ReLU}\left(\sum_{k=-1}^{1} W^{(1)}_{k}\, x_{t+k} + b^{(1)}\right)$.
Figure 5.
Schematic Diagram of Deep Temporal Modeling Network.
The second layer performs nonlinear combinations while maintaining the sequence length, re-encoding “local geometric features” to enhance the distinctiveness of features at each time step, as shown below: $h^{(2)}_t = \mathrm{ReLU}\left(\sum_{k=-1}^{1} W^{(2)}_{k}\, h^{(1)}_{t+k} + b^{(2)}\right)$.
To mitigate scale differences between features from diverse sources, this paper employs ReLU as the activation function. This ensures gradient stability while enhancing the separability of low-amplitude events. This high temporal fidelity strategy aligns closely with the critical discernment points of physiological waveforms. Consequently, without subsampling, we aggregate local neighborhood information using short kernel convolutions, suppressing high-frequency noise while preserving the waveform’s local geometric structure and phase information. Conversely, stride subsampling or pooling, while expanding receptive fields, tends to average out details and introduce phase errors. LFE amplifies small-scale geometric differences into discriminative features through its “short kernel-equal length-shallow layer” combination, providing morphologically stable inputs for subsequent processing.
Local convolutions alone cannot capture slow-varying modulations and variable time lags spanning extended periods. For instance, respiratory rhythms, changes in vascular compliance, and neurohumoral regulation exert slow, persistent effects on waveforms across multiple cardiac cycles. Furthermore, even identical events exhibit variable temporal locations across different individuals or physiological states within the same subject. Therefore, building upon the features extracted by the LFE, we stack bidirectional LSTM layers to form the TRN architecture. As shown in the equations, let $z_t = h^{(2)}_t$ denote the LFE output at time step $t$; then: $\overrightarrow{h}_t = \overrightarrow{\mathrm{LSTM}}\left(z_t, \overrightarrow{h}_{t-1}\right), \quad \overleftarrow{h}_t = \overleftarrow{\mathrm{LSTM}}\left(z_t, \overleftarrow{h}_{t+1}\right), \quad h_t = \left[\overrightarrow{h}_t; \overleftarrow{h}_t\right].$
The forward pass captures causal dependencies (from past to present), while the backward pass supplements complementary cues (from future to present); their combination yields a symmetric description of the present moment.
Here, $d_h$ represents the unidirectional hidden dimension, and bidirectional concatenation yields a $2d_h$-dimensional representation. Unlike fixed receptive fields achieved solely through convolution, the TRN can better capture cross-period long-range dependencies without compromising temporal resolution.
Although the TRN can already model longer time spans, it primarily relies on recursion between adjacent positions. In scenarios involving phase jitter, irregular periods, or rhythm drift, relying solely on recursion may introduce temporal step errors. To address this, we introduce an attention mechanism to construct the GRN. This enables features from any two time steps within a sequence to be directly “visible” to each other, better aligning with the physical characteristics of temporal signals. As shown in the equation: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$.
In GRN, global soft alignment is achieved through attention weights normalized by softmax, which aggregates the most relevant temporal segments into the current representation via convex combination.
Here, $d_k$ represents the single-head dimension, and $\mathrm{softmax}\left(QK^{\top}/\sqrt{d_k}\right)$ denotes the attention weight. Through the GRN, the model performs reweighting and soft alignment across the entire window, thereby mitigating amplitude and phase errors caused by cross-period mismatch and enhancing sensitivity to boundary moments.
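A structural sketch of the three subnets plus the point-wise readout of Section 4.4 is given below. Layer widths, kernel sizes, and the number of attention heads are illustrative assumptions rather than the exact published configuration.

```python
# Structural sketch of CBAnet's subnets (LFE -> TRN -> GRN) and readout,
# assuming 7 input channels; hyperparameters are illustrative.
import torch
import torch.nn as nn

class CBAnet(nn.Module):
    def __init__(self, in_ch=7, conv_ch=64, hidden=128, heads=4):
        super().__init__()
        # LFE: two equal-length, short-kernel convolutions (padding keeps T)
        self.lfe = nn.Sequential(
            nn.Conv1d(in_ch, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(conv_ch, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
        )
        # TRN: bidirectional LSTM for long-range temporal dependencies
        self.trn = nn.LSTM(conv_ch, hidden, batch_first=True, bidirectional=True)
        # GRN: multi-head self-attention for global reweighting/alignment
        self.grn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        # Time-step-wise linear readout to the 1-D target waveform
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                                # x: (B, T, 7)
        z = self.lfe(x.transpose(1, 2)).transpose(1, 2)  # (B, T, conv_ch)
        h, _ = self.trn(z)                               # (B, T, 2*hidden)
        a, _ = self.grn(h, h, h)                         # (B, T, 2*hidden)
        return self.head(a).squeeze(-1)                  # (B, T)
```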
4.4. Pointwise Regression Output
After completing global alignment, we employ a time-step-by-time-step linear readout to map the high-dimensional representation to target values. This approach preserves the interpretability of the upstream representation while minimizing parameters to reduce overfitting risk. The final output is obtained through the time-step-by-time-step linear mapping $g$: $\hat{y}_t = W_o\, h^{\prime}_t + b_o$, where $h^{\prime}_t$ is the attention-reweighted representation at time step $t$.
The loss function primarily employs mean squared error to ensure consistent amplitude at each point. Simultaneously, it incorporates weakly weighted shape and spectral consistency constraints to impose gentle limitations on slope and frequency-band distribution, thereby preventing scenarios where similar amplitudes result in distorted shapes, as shown in the following equation: $\mathcal{L} = \mathcal{L}_{\mathrm{MSE}} + \lambda_{1}\,\mathcal{L}_{\mathrm{diff}} + \lambda_{2}\,\mathcal{L}_{\mathrm{spec}} + \lambda_{3}\,\mathcal{L}_{\mathrm{corr}}$.
Here, $\mathcal{L}_{\mathrm{diff}}$ represents the first-order difference loss with weight $\lambda_{1}$, $\mathcal{L}_{\mathrm{spec}}$ signifies the spectral consistency loss with weight $\lambda_{2}$, and $\mathcal{L}_{\mathrm{corr}}$ denotes the correlation term with weight $\lambda_{3}$.
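A hedged sketch of this composite objective is shown below; the exact forms of the auxiliary terms and the weights λ1-λ3 are assumptions, since only their roles (slope, spectrum, correlation) are described above.

```python
# Sketch of MSE plus weakly weighted first-difference, spectral-magnitude,
# and correlation terms; the weights are placeholders, not published values.
import torch

def composite_loss(pred, target, l1=0.1, l2=0.1, l3=0.1):
    mse = torch.mean((pred - target) ** 2)

    # Shape term: match first-order differences (local slope).
    diff = torch.mean((torch.diff(pred, dim=-1) - torch.diff(target, dim=-1)) ** 2)

    # Spectral term: match magnitude spectra of the two waveforms.
    spec = torch.mean((torch.abs(torch.fft.rfft(pred, dim=-1))
                       - torch.abs(torch.fft.rfft(target, dim=-1))) ** 2)

    # Correlation term: encourage high Pearson correlation per window.
    p = pred - pred.mean(dim=-1, keepdim=True)
    t = target - target.mean(dim=-1, keepdim=True)
    corr = (p * t).sum(-1) / (p.norm(dim=-1) * t.norm(dim=-1) + 1e-8)
    corr_loss = torch.mean(1.0 - corr)

    return mse + l1 * diff + l2 * spec + l3 * corr_loss
```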
Both training and inference employ a sliding window strategy with fixed window length: during training, a fixed window slides across long sequences to prevent information leakage. During inference, overlapping regions undergo smooth fusion to ensure the concatenated prediction sequence is strictly aligned with the ground truth sequence along the temporal axis.
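The following sketch illustrates one way to implement such overlap-smoothed inference, averaging overlapping window predictions by per-sample coverage; it assumes the window and stride of Section 3.3.

```python
# Overlap-averaged inference: accumulate window predictions and divide by
# per-sample coverage so the stitched output stays aligned with the ground
# truth along the time axis.
import numpy as np
import torch

@torch.no_grad()
def predict_long_sequence(model, x, window=700, stride=100, device="cuda"):
    # x: (L, 7) preprocessed features for one long recording
    model.eval().to(device)
    L = x.shape[0]
    acc = np.zeros(L)
    cover = np.zeros(L)
    for s in range(0, L - window + 1, stride):
        xb = torch.as_tensor(x[s:s + window]).float().unsqueeze(0).to(device)
        yb = model(xb).squeeze(0).cpu().numpy()   # (window,)
        acc[s:s + window] += yb
        cover[s:s + window] += 1.0
    cover[cover == 0] = 1.0   # avoid division by zero for any uncovered tail
    return acc / cover
```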
4.5. Implementation Details
To validate the effectiveness of the proposed model, we compare it with four commonly used models: BiLSTM [57], CNN-LSTM [4], Transformer [22], and Wave-U-Net [58]. BiLSTM employs bidirectional recurrent layers to simultaneously model past and future dependencies in the temporal dimension, representing a classic sequence modeling approach; it is particularly effective at capturing rhythmic information such as the morphology of cardiac cycles and respiratory modulation. CNN-LSTM first extracts local features (e.g., wave peaks, rising edges) via front-end convolutions, then aggregates global temporal patterns through long short-term memory networks, enabling better utilization of local feature characteristics than a pure LSTM. Transformers establish long-range dependencies through self-attention, making them better suited for modeling correlations spanning multiple cardiac cycles. Wave-U-Net employs a multi-scale encoder-decoder architecture to achieve end-to-end “waveform-to-waveform” mapping, simultaneously capturing local transients and global trends, making it well suited for end-to-end modeling of physiological signals.
To evaluate the performance of different methods in predicting physiological signals, this paper selected three commonly used evaluation metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of Determination (R2). These metrics measure the deviation between model predictions and actual values, as well as the model’s goodness of fit, from different perspectives. They are widely used in evaluating the effectiveness of time series forecasting models.
- 1. RMSE is defined as the square root of the average of the squared deviations between the predicted value and the true value, reflecting the average level of deviation between the predicted and true values: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^{2}}$.
- 2. MAE is defined as the average of the absolute values of the prediction errors; it measures the average magnitude of the error between the model’s predicted values and the actual values: $\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right|$.
- 3. R2 measures the proportion of variance in the dependent variable explained by the model and is commonly used to assess the goodness of fit of a regression model. It is defined as one minus the ratio of the residual sum of squares to the total sum of squares: $R^{2} = 1 - \frac{\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^{2}}{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)^{2}}$.
Since physiological signal prediction constitutes a typical regression problem, we employ RMSE and MAE to quantify the magnitude deviation between model predictions and actual values, while using R2 to measure the model’s ability to fit the trend of the target signal. In prediction tasks such as continuous blood pressure or intracranial pressure monitoring, these metrics provide an intuitive reflection of the model’s error magnitude and reliability, serving as common standards for evaluating algorithm performance in relevant fields.
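For reference, the three metrics can be computed directly as in the short NumPy sketch below, following the standard definitions above.

```python
# Direct NumPy implementation of RMSE, MAE, and R2.
import numpy as np

def rmse(y, y_hat):
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    return float(np.mean(np.abs(y - y_hat)))

def r2(y, y_hat):
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    ss_res = np.sum((y - y_hat) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)
```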
5. Experiment Results
Training on the three datasets using the aforementioned method yielded the following results.
5.1. Data Preprocessing Results
To validate the necessity of data preprocessing, we conducted experiments on three datasets (PPG-HAF, GBIT-ABP, and CHARIS) using CBAnet and the baseline models, with both unprocessed and preprocessed inputs. We define the unprocessed one-dimensional input as $D_{\mathrm{raw}}$ and the seven-channel preprocessed input as $D_{\mathrm{pre}}$.
Table 2 summarizes the prediction performance of five models across three datasets, comparing results between the raw data $D_{\mathrm{raw}}$ and the preprocessed data $D_{\mathrm{pre}}$. Experimental results demonstrate that data preprocessing effectively enhances the performance of all five model types. Post-preprocessing models exhibit consistent reductions in RMSE and MAE alongside sustained increases in R2, with CBAnet delivering particularly outstanding results on the CHARIS dataset: RMSE decreased by 45.9%, MAE by 50.3%, and R2 rose to 0.9345. These findings indicate that preprocessing mitigates phase mismatch and amplitude drift issues by explicitly extracting morphological and trend features related to the cardiac cycle. This enables models to simultaneously capture both local structural details and low-frequency background information, thereby significantly enhancing their generalization capabilities.
Table 2.
Parameters such as RMSE, MAE, and R2 for different networks across three datasets.
5.2. Five-Fold Cross-Validation
To enhance the model’s generalization capabilities and achieve more accurate results, this section employs five-fold cross-validation for dataset training. By averaging the results across the folds, a stable assessment of the model’s overall performance is obtained. Subsequently, the optimal parameters derived from training were applied to predict values for a pre-segmented validation set comprising 10,000 points. This ultimately yielded the comprehensive training outcomes for each model. Partial results are shown in Table 3 below; the complete results can be found in Table A2:
Table 3.
The results of the validation sets obtained from the three datasets.
Experimental results across the three datasets demonstrate that CBAnet achieves optimal performance. Specifically, on the PPG-HAF dataset, CBAnet achieves a validation set RMSE of 3.4976 and an R2 of 0.8212, outperforming all comparison models. Wave-U-Net and CNN-LSTM followed, while Transformer slightly underperformed CBAnet across multiple metrics. On the GBIT-ABP dataset, CBAnet achieved the best validation RMSE of 7.1992 and R2 of 0.8938. BiLSTM and Wave-U-Net yielded relatively good results, but CNN-LSTM and Transformer exhibited significant errors. On the CHARIS dataset, CBAnet again demonstrated outstanding performance, achieving a validation set RMSE of 0.4903 and an R2 of 0.8451, comprehensively outperforming all other models. Combining the validation set metric comparisons shown in Figure 6 and Figure 7 with the visualized prediction waveforms, CBAnet demonstrates superior signal fitting capability and stability across all datasets, and its prediction curves closely match the true values. Based on the comprehensive evaluation metrics, CBAnet exhibits a clear advantage in the physiological waveform reconstruction task.
Figure 6.
Bar charts of RMSE, MAE, and R2 parameters for five networks across three datasets.
Figure 7.
Comparison of actual and predicted signals across three datasets: (a) GBIT-ABP, (b) CHARIS, (c) PPG HAF.
In contrast, the CNN-LSTM model exhibits elevated peak amplitudes in its prediction curve. This stems from the limited expressive capability of convolutions in capturing sharp points under subsampling and fixed receptive fields. While BiLSTM leverages bidirectional long short-term memory to capture both forward and backward temporal information, it lacks the detailed extraction capabilities of convolutions, resulting in slight deficiencies in reconstructing subtle waveforms. Transformers, as a popular recent architecture for sequence modeling, excel at capturing long-range dependencies. However, they inadequately represent local details during training, leading to slightly elevated peaks; this may stem from their reliance on large datasets and reduced sensitivity to minute morphological variations compared to local convolutions and recurrent memory structures. The Wave-U-Net model extracts multi-scale features through its encoder-decoder convolutional structure. However, the multi-layer down- and up-sampling process causes partial loss of high-frequency detail information, resulting in insufficient output waveform detail. Additionally, this architecture lacks explicit temporal sequence modeling, exhibits significant deficiencies in time alignment, and suffers from slow inference speeds, making it unsuitable for real-time deployment.
5.3. Bland–Altman Analysis
To further evaluate the consistency between model predictions and reference signals, this study conducted Bland–Altman analyses on three datasets: GBIT-ABP, CHARIS, and PPG-HAF. Systematic bias and residual dispersion were quantified by calculating the mean bias and 95% limits of agreement (LoA), as shown in Table 4, Table 5 and Table 6. Figure 8, Figure 9 and Figure 10 present the corresponding scatter plots and error histograms, illustrating the distribution of differences between predicted and actual values [59].
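The statistics reported in these tables can be reproduced from paired prediction/reference sequences as in the brief sketch below: mean bias, standard deviation of the differences, and bias ± 1.96·SD as the 95% limits of agreement.

```python
# Minimal Bland-Altman statistics for a predicted vs. reference waveform.
import numpy as np

def bland_altman(pred, ref):
    diff = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)
    bias = diff.mean()                       # systematic bias
    sd = diff.std(ddof=1)                    # dispersion of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement
    return bias, sd, loa
```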
Table 4.
Bland–Altman analysis of the GBIT ABP database under different networks.
Table 5.
Bland–Altman analysis of the CHARIS database under different networks.
Table 6.
Bland–Altman analysis of the PPG HAF database under different networks.
Figure 8.
Bland–Altman analysis on the GBIT ABP dataset for five networks.
Figure 9.
Bland–Altman analysis on the CHARIS dataset for five networks.
Figure 10.
Bland–Altman analysis on the PPG HAF dataset for five networks.
Results show that on the GBIT-ABP dataset, CBAnet exhibits the narrowest consistency interval and smallest bias, with bias = −0.127 mmHg, SD = 7.2 mmHg, and a 95% LoA of [−14.24, 13.98] mmHg. In contrast, CNN-LSTM and BiLSTM exhibited larger consistency intervals of approximately ±22 mmHg with greater variability.
On the CHARIS dataset, CBAnet again demonstrated optimal performance with bias = −0.532 mmHg, SD = 0.69 mmHg, and a 95% LoA of [−1.88, 0.82] mmHg—the narrowest among all models. In contrast, Wave-U-Net and CNN-LSTM exhibited larger biases and asymmetric distributions, while CBAnet’s error points were uniformly distributed around zero with no significant proportional bias.
On the PPG-HAF dataset, CBAnet similarly maintained the smallest bias (bias = −0.038 mV) and narrowest consistency interval ([−6.89, 6.82] mV), with error distribution approximating normal and concentrated near zero; other models generally exhibited error ranges exceeding ±7 mV, featuring broader and more skewed distributions.
In summary, CBAnet demonstrated the smallest bias, smallest standard deviation (SD), and narrowest consistency limits across all three datasets, indicating its predictions closely align with actual signals. These results validate CBAnet’s capability to minimize systematic errors and its cross-scenario stability, establishing a reliable foundation for its clinical application in non-invasive physiological signal estimation.
CBAnet combines convolutional operations for detail extraction, bidirectional capture of long-term dependencies, and attention mechanisms focused on critical time points, effectively integrating local and global features. Its moderate parameter size enables more efficient training and facilitates robust generalization under limited data conditions. As a result, CBAnet balances waveform details with overall trends, producing prediction curves that closely match the actual curves, and its error metrics significantly outperform those of the other models. For instance, on the CHARIS dataset, CBAnet reduces RMSE and MAE by approximately 45% and 50%, respectively, compared to the next-best model, while increasing R2 by about 39%, demonstrating substantial accuracy improvements. From an engineering and clinical usability perspective, CBAnet exhibits strong robustness, balancing waveform details with global trends to maintain stable outputs across diverse physiological signals. It also offers good interpretability, with visualizable attention weights aiding result interpretation and anomaly localization. Its moderate computational overhead makes it deployment-friendly and facilitates real-time applications.
6. Discussion
The proposed method combining composite feature preprocessing with multiscale modeling demonstrates favorable performance in end-to-end modeling and prediction of physiological signals such as intracranial pressure under the evaluated datasets and short-time window settings. First, composite feature preprocessing based on physiological prior knowledge significantly enhances model performance. We expanded the original one-dimensional waveform into a seven-dimensional feature matrix encompassing amplitude, pulse width, rise time, fall time, first-order derivative, and second-order derivative. This provides richer morphological and trend information from the input side. Experimental results indicate that introducing composite feature preprocessing significantly reduces model prediction errors, with decreases in both RMSE and MAE, and an increase in R2. This suggests that the preprocessing method enhances model learning efficiency and short-term prediction accuracy.
Secondly, the multi-scale modeling framework CBAnet designed in this paper can simultaneously capture both the local geometric details (such as steep rising edges, peaks, and troughs) and the overall low-frequency trends in BP/ICP waveforms. Across the three tested datasets, CBAnet achieved superior overall error and correlation metrics.
During the data processing phase of our self-built dataset, we employ cubic spline interpolation to eliminate spike noise caused by motion artifacts or poor sensor contact. Morphological filtering is then applied to remove baseline drift, followed by bandpass filtering at appropriate frequencies to suppress high-frequency electronic noise. These steps enhance the signal-to-noise ratio to a certain extent, contributing to the stability of model training and inference.
Despite the positive outcomes of this study, several limitations remain. Given the inherent challenges of non-invasive intracranial pressure measurement, overall accuracy remains constrained, and consistency with reference waveforms varies across different datasets. Furthermore, the current model has been validated using limited data from a single cohort, and experiments have focused on short-term monitoring. However, stability and accuracy during long-term monitoring (≥1 h) are critical. While model runtime indicates near-real-time processing potential, additional validation is required before clinical application—particularly for extended recording durations, diverse cohorts, and consistency analysis—to assess clinical applicability. Secondly, this study primarily addresses single signal prediction. Future research may explore multi-parameter cascade prediction frameworks, such as sequential prediction chains like “head PPG → fingertip PPG → blood pressure → intracranial pressure.” As high-quality, diverse datasets continue to accumulate, model architectures will be optimized to reduce dependence on specific data distributions. This will enhance prediction accuracy while further balancing model complexity and interpretability.
7. Conclusions
Physiological signals exhibit distinct continuity and cross-period coupling, while being influenced by low-frequency modulation, individual variations, and variable time delays. However, existing deep learning approaches struggle to simultaneously capture both local features, such as rising edges and dicrotic notches, and the global consistency of low-frequency backgrounds. Therefore, addressing the clinical challenge of ‘inferring hard-to-measure indicators from easily accessible signals,’ this paper proposes CBAnet, a fusion architecture for end-to-end continuous prediction of physiological waveforms. First, a 7-dimensional composite feature preprocessing stage serves as the input gateway, extracting features relevant to physiological signal prediction: amplitude, pulse width, rise time, fall time, first derivative, and second derivative. This stage performs denoising and dimensionality expansion to reduce learning complexity. Subsequently, a local temporal feature extraction module extracts stable local pattern features across multiple scales, highlighting peaks/troughs and rising/falling edges. Contextual dependencies are established by integrating short- and long-term dependency representation networks, proving particularly effective for targets with temporal delay characteristics. The global reweighting and alignment module employs multi-head self-attention to explicitly weight and align data within time windows, amplifying the contribution of segments containing sudden events (e.g., blood pressure surges, ICP spikes). Experimental results indicate that the composite feature preprocessing contributes to improving model accuracy and performance. Furthermore, across the three preprocessed datasets, CBAnet demonstrates overall superiority over traditional methods in metrics such as RMSE, MAE, and R2, highlighting its feasibility and potential value in end-to-end prediction.
Author Contributions
Conceptualization, P.C.; methodology, P.C.; software, P.C.; validation, P.C.; formal analysis, P.C.; investigation, P.C.; resources, P.C.; data curation, P.C.; writing—original draft preparation, P.C.; writing—review and editing, P.C.; visualization, J.L.; supervision, B.P.; project administration, Z.L.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Natural Science Basic Research Program of Shaanxi (2024JC-YBQN-0707).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The data supporting the results of this study are available upon reasonable request from the corresponding author.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| BP | Blood Pressure |
| ICP | Intracranial Pressure |
| PPG | Photoplethysmography |
| CNN | Convolutional neural networks |
| LSTM | Long Short-Term Memory |
| LFE | Local time-domain feature subnet |
| TRN | Long-Term Dependency Representation Subnet |
| GRN | Global reweighting and alignment |
| PPG HAF | Self-built dataset of synchronized head and fingertip PPG signals |
| LED | Light-emitting diode |
| RMSE | Root Mean Square Error |
| MAE | Mean Absolute Error |
| R2 | Coefficient of Determination |
| TBI | Traumatic Brain Injury |
| ECG | Electrocardiogram |
| CBAnet | CNN-LSTM-Attention |
| APD | Avalanche photodiode |
Appendix A
Table A1.
Detailed demographic information for all participants.
Table A1.
Detailed demographic information for all participants.
| Number | Gender | Age |
|---|---|---|
| Participant 1 | F | 27 |
| Participant 2 | F | 26 |
| Participant 3 | F | 40 |
| Participant 4 | F | 31 |
| Participant 5 | M | 29 |
| Participant 6 | M | 28 |
| Participant 7 | M | 29 |
| Participant 8 | M | 27 |
Table A2.
Five-fold cross-validation results across three datasets.
Table A2.
Five-fold cross-validation results across three datasets.
| Dataset: GBIT-ABP | |||
|---|---|---|---|
| Five-Fold Cross | RMSE | MAE | R2 |
| Model: CNN-LSTM | |||
| 1 | 19.6481 | 11.2642 | 0.3592 |
| 2 | 11.9563 | 8.5788 | 0.723 |
| 3 | 9.6708 | 6.5005 | 0.8061 |
| 4 | 17.6997 | 13.261 | 0.481 |
| 5 | 13.2958 | 7.9929 | 0.6468 |
| Average | 14.4541 ± 3.6871 | 9.5195 ± 2.4237 | 0.6032 ± 0.1624 |
| Validation | 11.3574 | 7.1874 | 0.7358 |
| Model: BiLSTM | |||
| 1 | 11.1869 | 6.5369 | 0.7911 |
| 2 | 7.9683 | 5.5917 | 0.8804 |
| 3 | 6.7019 | 5.4797 | 0.9057 |
| 4 | 1.0542 | 0.7833 | 0.998 |
| 5 | 8.159 | 5.3139 | 0.8679 |
| Average | 9.1169 ± 1.9165 | 6.3587 ± 1.3264 | 0.8447 ± 0.0506 |
| Validation | 8.1941 | 4.7464 | 0.8621 |
| Model: Transformer | |||
| 1 | 29.4893 | 13.2017 | −0.4434 |
| 2 | 12.0207 | 8.1412 | 0.72 |
| 3 | 9.8083 | 6.563 | 0.8005 |
| 4 | 15.4696 | 10.8204 | 0.6036 |
| 5 | 14.4795 | 9.1956 | 0.5811 |
| Average | 16.2535 ± 6.9052 | 9.5844 ± 2.2793 | 0.4525 ± 0.4549 |
| Validation | 11.8288 | 8.3201 | 0.7134 |
| Model: Wave-U-Net | |||
| 1 | 14.4539 | 7.755 | 0.6532 |
| 2 | 9.5356 | 7.1013 | 0.8238 |
| 3 | 6.8389 | 5.4155 | 0.903 |
| 4 | 15.7793 | 10.8239 | 0.5875 |
| 5 | 6.6259 | 4.9264 | 0.9123 |
| Average | 10.6467 ± 3.8142 | 7.2044 ± 2.0882 | 0.7760 ± 0.1324 |
| Validation | 11.4566 | 8.5814 | 0.7311 |
| Model: CBAnet | |||
| 1 | 9.841 | 6.4277 | 0.8393 |
| 2 | 8.1247 | 5.6151 | 0.8721 |
| 3 | 5.9226 | 4.5294 | 0.9273 |
| 4 | 13.5471 | 10.7827 | 0.696 |
| 5 | 7.9829 | 4.9686 | 0.8727 |
| Average | 9.0837 ± 2.5542 | 6.4647 ± 2.2518 | 0.8415 ± 0.0780 |
| Validation | 7.1992 | 4.7387 | 0.8938 |
| Dataset: CHARIS | |||
| Five-fold cross | RMSE | MAE | R2 |
| Model: CNN-LSTM | |||
| 1 | 0.6884 | 0.5566 | 0.7449 |
| 2 | 0.8496 | 0.6748 | 0.6084 |
| 3 | 0.6452 | 0.5259 | 0.7612 |
| 4 | 0.8183 | 0.6397 | 0.6291 |
| 5 | 0.7308 | 0.5585 | 0.7008 |
| Average | 0.7464 ± 0.0770 | 0.5911 ± 0.0563 | 0.6889 ± 0.0609 |
| Validation | 0.8602 | 0.7342 | 0.5233 |
| Model: BiLSTM | |||
| 1 | 0.6942 | 0.5621 | 0.7406 |
| 2 | 0.9114 | 0.7264 | 0.5494 |
| 3 | 0.6605 | 0.5384 | 0.7498 |
| 4 | 0.9092 | 0.7166 | 0.5421 |
| 5 | 0.7988 | 0.6061 | 0.6425 |
| Average | 0.7948 ± 0.1048 | 0.6299 ± 0.0779 | 0.6448 ± 0.0893 |
| Validation | 0.9924 | 0.8638 | 0.3655 |
| Model: Transformer | |||
| 1 | 0.7252 | 0.5811 | 0.7169 |
| 2 | 0.7781 | 0.6123 | 0.6715 |
| 3 | 0.6364 | 0.515 | 0.7677 |
| 4 | 0.8821 | 0.6926 | 0.5689 |
| 5 | 0.8384 | 0.6444 | 0.6061 |
| Average | 0.7721 ± 0.0862 | 0.6091 ± 0.0598 | 0.6662 ± 0.0721 |
| Validation | 0.9457 | 0.8174 | 0.4237 |
| Model: Wave-U-Net | |||
| 1 | 0.7086 | 0.5642 | 0.7297 |
| 2 | 0.8938 | 0.7191 | 0.5666 |
| 3 | 0.5845 | 0.4716 | 0.804 |
| 4 | 0.8683 | 0.677 | 0.5823 |
| 5 | 0.698 | 0.5421 | 0.727 |
| Average | 0.7506 ± 0.1153 | 0.5948 ± 0.0907 | 0.6819 ± 0.0921 |
| Validation | 1.1179 | 0.9741 | 0.1949 |
| Model: CBAnet | |||
| 1 | 0.6556 | 0.5174 | 0.7686 |
| 2 | 0.9657 | 0.734 | 0.4941 |
| 3 | 0.4608 | 0.3648 | 0.8782 |
| 4 | 0.6781 | 0.5422 | 0.7453 |
| 5 | 0.6674 | 0.5301 | 0.7504 |
| Average | 0.6855 ± 0.1614 | 0.5377 ± 0.1174 | 0.7273 ± 0.1363 |
| Validation | 0.4903 | 0.3997 | 0.8451 |

Dataset: PPG-HAF

| Five-Fold Cross | RMSE | MAE | R2 |
|---|---|---|---|
| Model: CNN-LSTM | |||
| 1 | 2.6433 | 1.9519 | 0.8025 |
| 2 | 2.2626 | 1.7423 | 0.8763 |
| 3 | 3.4935 | 2.6785 | 0.2403 |
| 4 | 3.3659 | 2.4883 | 0.6842 |
| 5 | 3.9997 | 3.1093 | 0.7107 |
| Average | 3.2372 ± 0.5144 | 2.4356 ± 0.4440 | 0.6230 ± 0.1965 |
| Validation | 3.8573 | 3.0476 | 0.7825 |
| Model: BiLSTM | |||
| 1 | 2.4171 | 1.819 | 0.8349 |
| 2 | 2.5687 | 1.889 | 0.7043 |
| 3 | 2.475 | 1.8614 | 0.6187 |
| 4 | 3.2477 | 2.441 | 0.706 |
| 5 | 4.0038 | 3.113 | 0.7101 |
| Average | 2.9424 ± 0.6089 | 2.2247 ± 0.4990 | 0.7148 ± 0.0691 |
| Validation | 3.9085 | 3.0642 | 0.7767 |
| Model: Transformer | |||
| 1 | 2.7632 | 2.1046 | 0.7842 |
| 2 | 2.6901 | 2.0463 | 0.6757 |
| 3 | 2.9591 | 2.2824 | 0.4549 |
| 4 | 3.4361 | 2.5889 | 0.6709 |
| 5 | 4.3171 | 3.4096 | 0.6629 |
| Average | 3.2331 ± 0.6012 | 2.4864 ± 0.4988 | 0.6497 ± 0.1071 |
| Validation | 3.9093 | 3.0106 | 0.7766 |
| Model: Wave-U-Net | |||
| 1 | 3.1245 | 2.4861 | 0.7256 |
| 2 | 2.445 | 1.8338 | 0.7326 |
| 3 | 2.8796 | 2.2751 | 0.4837 |
| 4 | 2.9414 | 2.261 | 0.7599 |
| 5 | 3.6587 | 2.8771 | 0.7574 |
| Average | 3.0099 ± 0.3937 | 2.3466 ± 0.3395 | 0.6919 ± 0.1049 |
| Validation | 3.6797 | 2.8806 | 0.806 |
| Model: CBAnet | |||
| 1 | 2.808 | 2.1624 | 0.7771 |
| 2 | 2.5971 | 1.9673 | 0.6977 |
| 3 | 2.4858 | 1.9141 | 0.6153 |
| 4 | 3.4169 | 2.614 | 0.6746 |
| 5 | 3.7576 | 2.9494 | 0.7446 |
| Average | 3.0131 ± 0.4920 | 2.3215 ± 0.3992 | 0.7019 ± 0.0561 |
| Validation | 3.4976 | 2.7237 | 0.8212 |
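
For completeness, the short sketch below illustrates how the per-fold summaries in Table A2 can be reproduced. It is a minimal example rather than the authors' evaluation code: it assumes NumPy arrays of target and predicted waveform samples, and it summarizes the five folds with the mean and population standard deviation (ddof = 0), which matches reported entries such as 14.4541 ± 3.6871 for the CNN-LSTM RMSE on GBIT-ABP.

```python
import numpy as np


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root Mean Square Error between target and predicted waveforms."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Coefficient of Determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def summarize_folds(fold_values) -> tuple[float, float]:
    """Mean and population standard deviation (ddof=0) across folds."""
    v = np.asarray(fold_values, dtype=float)
    return float(v.mean()), float(v.std(ddof=0))


# Example: CNN-LSTM RMSE folds on GBIT-ABP (first block of Table A2).
mean, std = summarize_folds([19.6481, 11.9563, 9.6708, 17.6997, 13.2958])
print(f"{mean:.4f} ± {std:.4f}")  # -> 14.4541 ± 3.6871
```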
References
- Meng, Z.; Zhang, M.; Guo, C.; Fan, Q.; Zhang, H.; Gao, N.; Zhang, Z. Recent Progress in Sensing and Computing Techniques for Human Activity Recognition and Motion Analysis. Electronics 2020, 9, 1357. [Google Scholar] [CrossRef]
- Suglia, V.; Palazzo, L.; Bevilacqua, V.; Passantino, A.; Pagano, G.; D’Addio, G. A Novel Framework Based on Deep Learning Architecture for Continuous Human Activity Recognition with Inertial Sensors. Sensors 2024, 24, 2199. [Google Scholar] [CrossRef]
- Palazzo, L.; Suglia, V.; Grieco, S.; Buongiorno, D.; Brunetti, A.; Carnimeo, L.; Amitrano, F.; Coccia, A.; Pagano, G.; D’Addio, G.; et al. A Deep Learning-Based Framework Oriented to Pathological Gait Recognition with Inertial Sensors. Sensors 2025, 25, 260. [Google Scholar] [CrossRef]
- Kamanditya, B.; Fuadah, Y.N.; Mahardika, T.N.Q.; Lim, K.M. Continuous Blood Pressure Prediction System Using Conv-LSTM Network on Hybrid Latent Features of Photoplethysmogram (PPG) and Electrocardiogram (ECG) Signals. Sci. Rep. 2024, 14, 16450. [Google Scholar] [CrossRef]
- Mohammadi, H.; Tarvirdizadeh, B.; Alipour, K.; Ghamari, M. Cuff-Less Blood Pressure Monitoring via PPG Signals Using a Hybrid CNN-BiLSTM Deep Learning Model with Attention Mechanism. Sci. Rep. 2025, 15, 22229. [Google Scholar] [CrossRef]
- Jilek, J. Electronic Sphygmomanometers: The Problems and Some Suggestions. Biomed. Instrum. Technol. 2003, 37, 231–233. [Google Scholar]
- Kirkendall, W.M.; Burton, A.C.; Epstein, F.H.; Freis, E.D. Recommendations for Human Blood Pressure Determination by Sphygmomanometers. Circulation 1967, 36, 980–988. [Google Scholar] [CrossRef]
- Shen, L.; Wang, Z.; Su, Z.; Qiu, S.; Xu, J.; Zhou, Y.; Yan, A.; Yin, R.; Lu, B.; Nie, X.; et al. Effects of Intracranial Pressure Monitoring on Mortality in Patients with Severe Traumatic Brain Injury: A Meta-Analysis. PLoS ONE 2016, 11, e0168901. [Google Scholar] [CrossRef]
- Zhang, X.; Medow, J.E.; Iskandar, B.J.; Wang, F.; Shokoueinejad, M.; Koueik, J.; Webster, J.G. Invasive and Noninvasive Means of Measuring Intracranial Pressure: A Review. Physiol. Meas. 2017, 38, R143. [Google Scholar] [CrossRef]
- Rangel-Castillo, L.; Gopinath, S.; Robertson, C.S. Management of Intracranial Hypertension. Neurol. Clin. 2008, 26, 521–541. [Google Scholar] [CrossRef]
- Nair, S.S.; Guo, A.; Boen, J.; Aggarwal, A.; Chahal, O.; Tandon, A.; Patel, M.; Sankararaman, S.; Stevens, R.D. A Real-Time Deep Learning Approach for Inferring Intracranial Pressure from Routinely Measured Extracranial Waveforms in the Intensive Care Unit. medRxiv 2023. medRxiv:2023.05.16.23289747. [Google Scholar] [CrossRef]
- Hartmann, V.; Liu, H.; Chen, F.; Qiu, Q.; Hughes, S.; Zheng, D. Quantitative Comparison of Photoplethysmographic Waveform Characteristics: Effect of Measurement Site. Front. Physiol. 2019, 10, 198. [Google Scholar] [CrossRef]
- Sharkey, E.J.; Di Maria, C.; Klinge, A.; Murray, A.; Zheng, D.; O’Sullivan, J.; Allen, J. Innovative Multi-Site Photoplethysmography Measurement and Analysis Demonstrating Increased Arterial Stiffness in Paediatric Heart Transplant Recipients. Physiol. Meas. 2018, 39, 074007. [Google Scholar] [CrossRef]
- Sun, Y.; Thakor, N. Photoplethysmography Revisited: From Contact to Noncontact, From Point to Imaging. IEEE Trans. Biomed. Eng. 2016, 63, 463–477. [Google Scholar] [CrossRef]
- Ezzat, A.; Omer, O.A.; Mohamed, U.S.; Mubarak, A.S. ECG Signal Reconstruction from PPG Using a Hybrid Attention-Based Deep Learning Network. EURASIP J. Adv. Signal Process. 2024, 2024, 95. [Google Scholar] [CrossRef]
- Ghadekar, P.; Bongulwar, A.; Jadhav, A.; Ahire, R.; Dumbre, A.; Ali, S. Bi-LSTM Based Interdependent Prediction of Physiological Signals. In Proceedings of the 2023 International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, 1–3 March 2023; pp. 1–6. [Google Scholar]
- Cui, M.; Dong, X.; Zhuang, Y.; Li, S.; Yin, S.; Chen, Z.; Liang, Y. ACNN-BiLSTM: A Deep Learning Approach for Continuous Noninvasive Blood Pressure Measurement Using Multi-Wavelength PPG Fusion. Bioengineering 2024, 11, 306. [Google Scholar] [CrossRef] [PubMed]
- Vo, K.; El-Khamy, M.; Choi, Y. PPG-to-ECG Signal Translation for Continuous Atrial Fibrillation Detection via Attention-Based Deep State-Space Modeling. In Proceedings of the 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 15–19 July 2024; pp. 1–7. [Google Scholar]
- Zhu, R.; Pan, W.; Liu, J.; Shang, J. Epileptic Seizure Prediction via Multidimensional Transformer and Recurrent Neural Network Fusion. J. Transl. Med. 2024, 22, 895. [Google Scholar] [CrossRef]
- Cheng, J.; Xu, Y.; Song, R.; Liu, Y.; Li, C.; Chen, X. Prediction of Arterial Blood Pressure Waveforms from Photoplethysmogram Signals via Fully Convolutional Neural Networks. Comput. Biol. Med. 2021, 138, 104877. [Google Scholar] [CrossRef]
- Mahardika, T.N.Q.; Fuadah, Y.N.; Jeong, D.U.; Lim, K.M. PPG Signals-Based Blood-Pressure Estimation Using Grid Search in Hyperparameter Optimization of CNN–LSTM. Diagnostics 2023, 13, 2566. [Google Scholar] [CrossRef]
- Chu, Y.; Tang, K.; Hsu, Y.-C.; Huang, T.; Wang, D.; Li, W.; Savitz, S.I.; Jiang, X.; Shams, S. Non-Invasive Arterial Blood Pressure Measurement and SpO2 Estimation Using PPG Signal: A Deep Learning Framework. BMC Med. Inf. Decis. Mak. 2023, 23, 131. [Google Scholar] [CrossRef] [PubMed]
- Evensen, K.B.; O’Rourke, M.; Prieur, F.; Holm, S.; Eide, P.K. Non-Invasive Estimation of the Intracranial Pressure Waveform from the Central Arterial Blood Pressure Waveform in Idiopathic Normal Pressure Hydrocephalus Patients. Sci. Rep. 2018, 8, 4714. [Google Scholar] [CrossRef] [PubMed]
- Najia, M.; Faouzi, B. An Enhanced Hybrid Model Combining CNN, BiLSTM, and Attention Mechanism for ECG Segment Classification. Biomed. Eng. Comput. Biol. 2025, 16, 11795972251341051. [Google Scholar] [CrossRef]
- Dai, W.; Han, H.; Wang, J.; Xiao, X.; Li, D.; Chen, C.; Wang, L. Enhanced CNN-BiLSTM-Attention Model for High-Precision Integrated Navigation During GNSS Outages. Remote Sens. 2025, 17, 1542. [Google Scholar] [CrossRef]
- Schlesinger, O.; Vigderhouse, N.; Eytan, D.; Moshe, Y. Blood Pressure Estimation From PPG Signals Using Convolutional Neural Networks And Siamese Network. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 1135–1139. [Google Scholar]
- Wang, C.; Yang, F.; Yuan, X.; Zhang, Y.; Chang, K.; Li, Z. An End-to-End Neural Network Model for Blood Pressure Estimation Using PPG Signal. In Artificial Intelligence in China, Proceedings of the International Conference on Artificial Intelligence in China; Liang, Q., Wang, W., Mu, J., Liu, X., Na, Z., Chen, B., Eds.; Springer: Singapore, 2020; pp. 262–272. [Google Scholar]
- Wang, M.; Ji, X. Design of Deep Learning-Based PPG Blood Pressure Prediction Model. In Proceedings of the 2024 5th International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE), Wenzhou, China, 20–22 September 2024; pp. 414–418. [Google Scholar]
- Zhang, Y.; Ren, X.; Liang, X.; Ye, X.; Zhou, C. A Refined Blood Pressure Estimation Model Based on Single Channel Photoplethysmography. IEEE J. Biomed. Health Inform. 2022, 26, 5907–5917. [Google Scholar] [CrossRef]
- Athaya, T.; Choi, S. An Estimation Method of Continuous Non-Invasive Arterial Blood Pressure Waveform Using Photoplethysmography: A U-Net Architecture-Based Approach. Sensors 2021, 21, 1867. [Google Scholar] [CrossRef]
- Lei, X.; Pan, F.; Liu, H.; He, P.; Zheng, D.; Feng, J. An End-to-End Deep Learning Framework for Accurate Estimation of Intracranial Pressure Waveform Characteristics. Eng. Appl. Artif. Intell. 2024, 130, 107686. [Google Scholar] [CrossRef]
- Ma, C.; Zhang, P.; Song, F.; Sun, Y.; Fan, G.; Zhang, T.; Feng, Y.; Zhang, G. KD-Informer: A Cuff-Less Continuous Blood Pressure Waveform Estimation Approach Based on Single Photoplethysmography. IEEE J. Biomed. Health Inform. 2023, 27, 2219–2230. [Google Scholar] [CrossRef] [PubMed]
- Ma, C.; Zhang, P.; Zhang, H.; Liu, Z.; Song, F.; He, Y.; Zhang, G. STP: Self-Supervised Transfer Learning Based on Transformer for Noninvasive Blood Pressure Estimation Using Photoplethysmography. Expert. Syst. Appl. 2024, 249, 123809. [Google Scholar] [CrossRef]
- Panwar, M.; Gautam, A.; Biswas, D.; Acharyya, A. PP-Net: A Deep Learning Framework for PPG-Based Blood Pressure and Heart Rate Estimation. IEEE Sens. J. 2020, 20, 10000–10011. [Google Scholar] [CrossRef]
- Jeong, D.U.; Lim, K.M. Combined Deep CNN–LSTM Network-Based Multitasking Learning Architecture for Noninvasive Continuous Blood Pressure Estimation Using Difference in ECG-PPG Features. Sci. Rep. 2021, 11, 13539. [Google Scholar] [CrossRef] [PubMed]
- Zhang, G.; Choi, D.; Shin, S.; Jung, J. Cuff-Less Blood Pressure Estimation from ECG and PPG Using CNN-LSTM Algorithms. In Proceedings of the 2023 IEEE 2nd International Conference on AI in Cybersecurity (ICAIC), Houston, TX, USA, 7–9 February 2023; pp. 1–4. [Google Scholar]
- Leitner, J.; Chiang, P.-H.; Dey, S. Personalized Blood Pressure Estimation Using Photoplethysmography: A Transfer Learning Approach. IEEE J. Biomed. Health Inform. 2022, 26, 218–228. [Google Scholar] [CrossRef]
- Shaikh, M.R.; Forouzanfar, M. Dual-Stream CNN-LSTM Architecture for Cuffless Blood Pressure Estimation From PPG and ECG Signals: A PulseDB Study. IEEE Sens. J. 2025, 25, 4006–4014. [Google Scholar] [CrossRef]
- El-Hajj, C.; Kyriacou, P.A. Cuffless Blood Pressure Estimation from PPG Signals and Its Derivatives Using Deep Learning Models. Biomed. Signal Process. Control 2021, 70, 102984. [Google Scholar] [CrossRef]
- Aguirre, N.; Grall-Maës, E.; Cymberknop, L.J.; Armentano, R.L. Blood Pressure Morphology Assessment from Photoplethysmogram and Demographic Information Using Deep Learning with Attention Mechanism. Sensors 2021, 21, 2167. [Google Scholar] [CrossRef] [PubMed]
- Xiang, T.; Jin, Y.; Liu, Z.; Clifton, L.; Clifton, D.A.; Zhang, Y.; Zhang, Q.; Ji, N.; Zhang, Y. Dynamic Beat-to-Beat Measurements of Blood Pressure Using Multimodal Physiological Signals and a Hybrid CNN-LSTM Model. IEEE J. Biomed. Health Inform. 2025, 29, 5438–5451. [Google Scholar] [CrossRef] [PubMed]
- Tian, Z.; Liu, A.; Zhu, G.; Chen, X. A Paralleled CNN and Transformer Network for PPG-Based Cuff-Less Blood Pressure Estimation. Biomed. Signal Process. Control 2025, 99, 106741. [Google Scholar] [CrossRef]
- Zhang, G.; Choi, D.; Jung, J. Development of Continuous Cuffless Blood Pressure Prediction Platform Using Enhanced 1-D SENet–LSTM. Expert Syst. Appl. 2024, 242, 122812. [Google Scholar] [CrossRef]
- Rastegar, S.; Gholam Hosseini, H.; Lowe, A. Hybrid CNN-SVR Blood Pressure Estimation Model Using ECG and PPG Signals. Sensors 2023, 23, 1259. [Google Scholar] [CrossRef] [PubMed]
- Le, T.; Ellington, F.; Lee, T.-Y.; Vo, K.; Khine, M.; Krishnan, S.K.; Dutt, N.; Cao, H. Continuous Non-Invasive Blood Pressure Monitoring: A Methodological Review on Measurement Techniques. IEEE Access 2020, 8, 212478–212498. [Google Scholar] [CrossRef]
- Pilz, N.; Patzak, A.; Bothe, T.L. Continuous Cuffless and Non-Invasive Measurement of Arterial Blood Pressure—Concepts and Future Perspectives. Blood Press. 2022, 31, 254–269. [Google Scholar] [CrossRef]
- Mejía-Mejía, E.; May, J.M.; Elgendi, M.; Kyriacou, P.A. Classification of Blood Pressure in Critically Ill Patients Using Photoplethysmography and Machine Learning. Comput. Methods Programs Biomed. 2021, 208, 106222. [Google Scholar] [CrossRef]
- Ibrahim, B.; Kireev, D.; Sel, K.; Kumar, N.; Akbari, A.; Jafari, R.; Akinwande, D. Continuous Cuffless Monitoring of Arterial Blood Pressure via Graphene Bioimpedance Tattoos. Nat. Nanotechnol. 2022, 17, 864–870. [Google Scholar] [CrossRef]
- Kim, N.; Krasner, A.; Kosinski, C.; Wininger, M.; Qadri, M.; Kappus, Z.; Danish, S.; Craelius, W. Trending Autoregulatory Indices during Treatment for Traumatic Brain Injury. J. Clin. Monit. Comput. 2016, 30, 821–831. [Google Scholar] [CrossRef]
- Lewis, P.M.; Rosenfeld, J.V.; Diehl, R.R.; Mehdorn, H.M.; Lang, E.W. Phase Shift and Correlation Coefficient Measurement of Cerebral Autoregulation during Deep Breathing in Traumatic Brain Injury (TBI). Acta Neurochir. 2008, 150, 139–147. [Google Scholar] [CrossRef]
- Fanelli, A.; Vonberg, F.W.; LaRovere, K.L.; Walsh, B.K.; Smith, E.R.; Robinson, S.; Tasker, R.C.; Heldt, T. Fully Automated, Real-Time, Calibration-Free, Continuous Noninvasive Estimation of Intracranial Pressure in Children. J. Neurosurg. Pediatr. 2019, 24, 509–519. [Google Scholar] [CrossRef]
- Steinmeier, R.; Bauhuf, C.; Hübner, U.; Bauer, R.D.; Fahlbusch, R.; Laumer, R.; Bondar, I. Slow Rhythmic Oscillations of Blood Pressure, Intracranial Pressure, Microcirculation, and Cerebral Oxygenation. Stroke 1996, 27, 2236–2243. [Google Scholar] [CrossRef] [PubMed]
- Steinmeier, R.; Bauhuf, C.; Hübner, U.; Hofmann, R.P.; Fahlbusch, R. Continuous Cerebral Autoregulation Monitoring by Cross-Correlation Analysis: Evaluation in Healthy Volunteers. Crit. Care Med. 2002, 30, 1969. [Google Scholar] [CrossRef]
- Karimi, D.; Dou, H.; Warfield, S.K.; Gholipour, A. Deep Learning with Noisy Labels: Exploring Techniques and Remedies in Medical Image Analysis. Med. Image Anal. 2020, 65, 101759. [Google Scholar] [CrossRef] [PubMed]
- Wei, Y.; Deng, Y.; Sun, C.; Lin, M.; Jiang, H.; Peng, Y. Deep Learning with Noisy Labels in Medical Prediction Problems: A Scoping Review. J. Am. Med. Inform. Assoc. 2024, 31, 1596. [Google Scholar] [CrossRef] [PubMed]
- Roldan, M.; Abay, T.Y.; Uff, C.; Kyriacou, P.A. A Pilot Clinical Study to Estimate Intracranial Pressure Utilising Cerebral Photoplethysmograms in Traumatic Brain Injury Patients. Acta Neurochir. 2024, 166, 109. [Google Scholar] [CrossRef] [PubMed]
- Ye, G.; Balasubramanian, V.; Li, J.K.-J.; Kaya, M. Machine Learning-Based Continuous Intracranial Pressure Prediction for Traumatic Injury Patients. IEEE J. Transl. Eng. Health Med. 2022, 10, 4901008. [Google Scholar] [CrossRef] [PubMed]
- Ibtehaz, N.; Mahmud, S.; Chowdhury, M.E.H.; Khandakar, A.; Salman Khan, M.; Ayari, M.A.; Tahir, A.M.; Rahman, M.S. PPG2ABP: Translating Photoplethysmogram (PPG) Signals to Arterial Blood Pressure (ABP) Waveforms. Bioengineering 2022, 9, 692. [Google Scholar] [CrossRef] [PubMed]
- Kashif, F.M.; Verghese, G.C.; Novak, V.; Czosnyka, M.; Heldt, T. Model-Based Noninvasive Estimation of Intracranial Pressure from Cerebral Blood Flow Velocity and Arterial Pressure. Sci. Transl. Med. 2012, 4, 129ra44. [Google Scholar] [CrossRef] [PubMed]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).