1. Introduction
Blood pressure (BP) is one of the most important parameters for monitoring the status of the body and can be used in the diagnosis and treatment of many diseases. Unfortunately, in recent years, the number of hypertension (HPT) patients has continued to increase, and HPT has become an urgent global health problem [1]. HPT increases the pressure of blood on the walls of the arteries, which gradually affects several organs, including the heart, brain, and kidneys. HPT can even cause the blockage or rupture of blood vessels that supply oxygen to the brain, which greatly increases the risk of cerebrovascular accidents [2,3,4]. BP regulation is influenced by a variety of factors, such as cardiac ejection, peripheral resistance, and vessel wall elasticity, and is also related to mood and age. As HPT is a chronic disease that requires long-term monitoring, a single BP measurement cannot provide an accurate view of the body’s condition. Therefore, it is very important to develop a noninvasive continuous BP estimation method that can reflect the physical condition of the human body in real time and help reduce the incidence of cardiovascular diseases [5].
The mainstream cuffless BP measurement methods analyze the features of physiological signals such as the photoplethysmogram (PPG) [6], electrocardiogram (ECG) [7], and ballistocardiogram (BCG) [8], and predict BP by machine learning (ML). The key technologies are physiological signal acquisition [9] and feature extraction [10,11]. In the last decade, with the development of semiconductor technology and the popularity of wearable health monitoring devices, such devices have been used to acquire a variety of physiological signals without interfering with normal human activities. The quality of the signals directly affects the accuracy of prediction. This poses a challenge to the design of acquisition hardware but also makes continuous BP monitoring possible. In [12], Rachim et al. designed a multimodal biosensor to measure PPG and impedance plethysmography (IPG) from the participant’s wrist. Then, 14 PTT features were calculated between the IPG peak point and the PPG. The comparative experiment showed that the extracted PTT features were correlated with BP, but their correlation coefficient for SBP was weaker than that of the conventional PAT, which is located using the ECG. In addition, the wrist moves more than the fingertip, which degraded the collected PPG signals. Therefore, the authors analyzed only the time characteristics and did not use amplitude characteristics. Bui et al. [13] chose the ear as the measuring location to minimize motion interference and ensure comfort. Benefiting from the good contact between the ear and the sensor, the authors collected high-quality PPG signals. However, without ECG as a reference, the features provided by PPG alone are still limited. Multiple types of signals are conducive to obtaining richer features and improving the accuracy of results. Hence, researchers must consider comfort, signal diversity, and signal quality when using wearable devices to collect physiological signals.
For feature extraction, scholars try to synthesize multiple physiological signals and select information in multiple dimensions, such as the time domain and the frequency domain. Among these features, pulse transit time (PTT) [14], pulse arrival time (PAT) [15], and pulse wave velocity (PWV) [16] were proven to be effective. Chen et al. [17] found that PTT was highly correlated with the high-frequency components of BP. Through calibration, more accurate BP estimation can be achieved. Meanwhile, age, gender, cardiovascular disease, and other factors determine arterial stiffness, which is significantly correlated with PTT [18]. Zhang et al. [19] considered how the autonomic nervous system affects the heart and blood vessels. They extracted nine features related to heart rate variability (HRV) and combined them with other PPG features. The results demonstrated the effectiveness of the HRV features. Geerthy et al. [20] extracted informative features such as SDI, Womersley, and QRS from PPG and ECG signals and used a genetic algorithm (GA) to select features. The optimal feature set reduced the mean absolute error (MAE) from 13.20 mmHg to 9.54 mmHg for systolic BP (SBP) and from 9.91 mmHg to 5.48 mmHg for diastolic BP (DBP). Wang et al. [21] collected 30 sets of PPG and ECG signals with a Finometer. Then, 39 features were extracted, and 10 features were finally retained by comparing their correlations with BP. In [22], 32 features were extracted from PPG to estimate BP. These features covered information at multiple scales, such as amplitude, time, and frequency. Shuo et al. [23] introduced the mean impact value (MIV) to investigate the impact of each feature and used a GA to optimize the parameters. After optimization, the MAEs of SBP and DBP reached 3.27 mmHg and 1.16 mmHg, respectively. Through the joint analysis of the PPG signal and its derivatives, Chowdhury et al. [24] extracted up to 101 features, among which 75 were from the time domain, 16 were from the frequency domain, and the remaining 10 were computed using statistical analysis. As the number of features extracted from the signal increases, both the workload of feature extraction and the difficulty of model training grow. Beyond feature extraction itself, how to screen out the optimal subset from many features has become another research issue.
Different from traditional ML methods, deep learning (DL) methods avoid the drawbacks of manual feature extraction. They can automatically learn more abstract and high-dimensional features from the input signal, which gives them stronger adaptability to complex nonlinear systems [11]. Benefiting from its powerful data mining capabilities, machine learning can be combined with biological signal processing and has achieved good performance [25]. Yu et al. [26] introduced an attention-based residual block into U-Net to predict BP. The results showed that combining the raw PPG signal with its first and second derivatives as the model input helped the network extract more information. Wang et al. [27] proposed an end-to-end model to measure BP. The model consisted of one-dimensional convolutional layers, depthwise separable convolutional layers, and a gated recurrent unit (GRU). The MAE was 3.95 mmHg for SBP and 2.14 mmHg for DBP, which met the international standard. Senturk et al. [28] analyzed the performance of dynamic learning methods, such as recurrent neural networks (RNN), nonlinear autoregressive networks with exogenous inputs (NARX-NN), and long short-term memory neural networks (LSTM-NN). The experiment showed that NARX had the most potential.
Despite its strong performance in BP prediction, DL still has the following shortcomings. On the one hand, it relies on a large amount of training data, and it is difficult to achieve good performance on small datasets. On the other hand, neural network features are not easily understood at the semantic level, which makes it more challenging for scholars to interpret the experimental results.
In summary, as ECG and PPG signals play an irreplaceable role in BP assessment, we designed a wearable physiological signal acquisition device that can continuously collect and transmit both signals in real time. To make the acquisition process comfortable, we adopted a separate design for the host and the sensor. Meanwhile, combining individual features with waveform features, we proposed a multi-parameter fusion method for noninvasive continuous BP estimation and introduced mutual information (MI) [29] for feature selection. The main contributions of this paper are as follows:
- (1)
We used a split design in which the acquisition host and the sensor were connected via a Type-C interface. This not only ensured the quality of the collected signal but also made the sensor easy to replace. The host can either transmit data wirelessly in real time or store data locally as a backup. The whole host measured only 42 mm × 29 mm × 13 mm and placed little burden on the human body.
- (2)
We constructed a private database from self-collected data. Then, the model was trained on the MIMIC III dataset [30] (accessed on 10 January 2020) and tested on the private dataset, which avoided data leakage. After training, we calibrated the model with a quarter of the records in the testing set.
- (3)
Gaussian copula MI (GCMI) was used to rank the initial 25 features. Then, 11 and 15 features were retained for SBP and DBP prediction, respectively. The results showed that the optimal feature set improved performance.
3. Proposed Method
In this section, we will introduce the process of feature extraction, selection, and BP prediction.
Figure 4 demonstrates the block diagram of the proposed method.
3.1. Feature Extraction
We extracted 3 individual features and 22 waveform features.
Table 1 lists the abbreviation and definition of features.
It was proved that age and gender were correlated with arterial stiffness and PWV [35,36], so we took individual information into account. During the collection process, we recorded the gender, age, and weight of each volunteer. The same information can also be obtained by querying the MIMIC-III clinical database.
- 2.
PAT
PAT is defined as the time required for blood to flow from the beginning of the electrical activation of the heart to the distal point [15]. As it contains the PTT feature, PAT is often used as a critical parameter for BP estimation [14]. Generally, the R-peak of the ECG signal is taken as the starting point, and a feature point of the PPG signal is taken as the ending point to calculate PAT. We extracted PATp, PATf, and PATd, which are the time intervals from the R-peak of the ECG signal to the peak, the valley, and the maximum slope point of the PPG signal, respectively, as shown in Figure 3.
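The three PAT features can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that the ECG and PPG share one sampling rate `fs`, that R-peak indices are already known, and that within one R-R window the PPG maximum is the systolic peak. The function name `pat_features` and its arguments are hypothetical.

```python
from typing import Dict, Sequence


def pat_features(ppg: Sequence[float], fs: float,
                 r_idx: int, next_r_idx: int) -> Dict[str, float]:
    """Return PATp, PATf, and PATd (in seconds) for one beat.

    Assumption: ppg[r_idx:next_r_idx] holds exactly one pulse, and the
    ECG R-peak sits at sample r_idx on the same time axis.
    """
    beat = list(ppg[r_idx:next_r_idx])
    # Peak: largest sample in the window (assumed systolic peak).
    peak = max(range(len(beat)), key=beat.__getitem__)
    # Valley: smallest sample between the R-peak and the PPG peak.
    valley = min(range(peak + 1), key=beat.__getitem__)
    # Maximum slope point: start of the steepest sample-to-sample rise
    # on the ascending branch.
    if peak > valley:
        md = max(range(valley, peak), key=lambda i: beat[i + 1] - beat[i])
    else:
        md = valley
    return {"PATp": peak / fs, "PATf": valley / fs, "PATd": md / fs}
```

With a real recording, the window boundaries would come from an R-peak detector, and the PPG would normally be filtered before the first difference is taken, since differencing amplifies noise.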
- 3.
Other time-related features
Heart rate (HR): HR reflects the cardiac cycle and can be calculated from the RR interval of the ECG signal. Cardiac output can be correlated with PTT through HR, so there is also a correlation between HR and BP. HR can be calculated by the following formula:
$$\mathrm{HR} = \frac{60}{\mathrm{RR}} \qquad (1)$$
where RR is the RR interval in seconds and HR is expressed in beats per minute.
The peak-to-peak interval of the PPG signal (PP): PP is the time interval between two adjacent peaks. Studies found that participants with HPT or arteriosclerosis had a longer PP interval than healthy participants [37].
The PPG waveform consists of ascending branches and descending branches. The ascending branch time (AT) is the time from the valley to the peak of the PPG signal, which was proven to be a useful feature for classifying the PWV. The descending branch time (DT) is the time from the peak to the next valley of the PPG signal.
Peak-to-peak time (PPT) is defined as the time between the first peak and the second peak or inflection point of the PPG signal. Its definition depends on the contour of the PPG waveform [38]. The second peak or inflection point is generated by reflected waves and is related to the time required for the pulse wave to travel from the heart to the periphery and back, so it can be used to evaluate arterial stiffness and the PWV.
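As a rough sketch of the time-related features, the snippet below computes HR, PP, AT, and DT from fiducial-point times that are assumed to have been detected already. It takes HR as 60 divided by the RR interval in seconds, which is the usual conversion to beats per minute. PPT is left out because it needs a detector for the second peak or inflection point. The function and argument names are hypothetical.

```python
from typing import Dict


def time_features(rr: float, t_valley: float, t_peak: float,
                  t_next_valley: float, t_next_peak: float) -> Dict[str, float]:
    """Time-related features of one beat; all inputs are in seconds.

    rr            -- RR interval of the ECG
    t_valley      -- time of the PPG valley that starts the beat
    t_peak        -- time of the PPG peak of the beat
    t_next_valley -- time of the following PPG valley
    t_next_peak   -- time of the following PPG peak
    """
    return {
        "HR": 60.0 / rr,              # beats per minute
        "PP": t_next_peak - t_peak,   # peak-to-peak interval
        "AT": t_peak - t_valley,      # ascending branch time
        "DT": t_next_valley - t_peak, # descending branch time
    }
```

For example, an RR interval of 0.8 s corresponds to an HR of 75 beats per minute.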
- 4.
Intensity-related features
From the perspective of the formation of BP, it is mainly affected by five factors: cardiac output, peripheral resistance, arterial wall elasticity, circulating blood volume, and blood volume ratio. The PPG intensity ratio (PIR) is related to changes in arterial diameter, which is the main cause of peripheral resistance and blood volume. PIR can be used to evaluate the smooth muscle tension that regulates arterial BP in the low-frequency range and to improve the accuracy of BP estimation [39,40]. We selected the ratio of peak intensity to valley intensity (PIRp) and the ratio of maximum slope point intensity to valley intensity (PIRmd), which can be calculated by Formula (2):
$$\mathrm{PIR}_x = \frac{\mathrm{PI}_x}{\mathrm{PI}_v} \qquad (2)$$
where PIRx is the PPG intensity ratio, PIx is the PPG intensity of the peak or the maximum slope point, and PIv is the PPG intensity of the valley.
To reflect the change in the PPG intensity, we also added the statistical features of the PPG intensity, including the average value of the PPG intensity (PIavg), the standard deviation of the PPG intensity (PIsd), the maximum value of the PPG intensity (PImax), and the minimum value of the PPG intensity (PImin).
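The intensity-related features can be illustrated as follows. This is a sketch under stated assumptions: the beat is given as raw PPG intensities whose valley value is positive (so the ratios are defined), the fiducial indices are already known, and the population standard deviation is used for PIsd, since the text does not say which estimator was applied. The function name `intensity_features` is hypothetical.

```python
import statistics
from typing import Dict, Sequence


def intensity_features(beat: Sequence[float], valley_idx: int,
                       peak_idx: int, md_idx: int) -> Dict[str, float]:
    """PIR ratios and intensity statistics for one PPG beat.

    md_idx is the index of the maximum slope point.
    """
    pi_v = beat[valley_idx]  # valley intensity, assumed > 0
    return {
        "PIRp": beat[peak_idx] / pi_v,      # peak / valley
        "PIRmd": beat[md_idx] / pi_v,       # max-slope point / valley
        "PIavg": statistics.fmean(beat),
        "PIsd": statistics.pstdev(beat),    # population SD (assumption)
        "PImax": max(beat),
        "PImin": min(beat),
    }
```

If the PPG has been high-pass filtered so that it oscillates around zero, the ratios lose their meaning, which is why the sketch assumes intensities that still include the baseline.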
- 5.
K value
The K value can reflect the physiological factors of the human cardiovascular system, such as vascular peripheral resistance, vascular wall elasticity, and blood viscosity. It is an important physiological index for the clinical examination of cardiovascular disease and has important clinical application value. It was calculated according to Formula (3):
$$K = \frac{P_m - \mathrm{PI}_v}{\mathrm{PI}_p - \mathrm{PI}_v} \qquad (3)$$
where PIp and PIv are the PPG intensities of the peak and the valley, and Pm is the average intensity of the PPG signal in one cardiac cycle, which was calculated by Formula (4):
$$P_m = \frac{1}{T}\int_0^T \mathrm{PI}(t)\,dt \qquad (4)$$
where PI(t) is the intensity of the PPG signal at time t, and T is the duration of one cycle of the PPG signal.
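A numerical version of the K value might look like the sketch below. It assumes the commonly used pulse-wave definition, in which K is the mean intensity minus the valley intensity, divided by the peak-to-valley range, and it approximates the mean over one cycle with the trapezoidal rule on uniformly sampled data. Taking the beat's maximum and minimum as peak and valley is a further assumption, and the name `k_value` is hypothetical.

```python
from typing import Sequence


def k_value(beat: Sequence[float], fs: float) -> float:
    """Pulse-wave K value of one uniformly sampled PPG cycle.

    beat -- samples covering exactly one cycle (valley to next valley)
    fs   -- sampling rate in Hz
    """
    dt = 1.0 / fs
    duration = (len(beat) - 1) * dt  # T, length of the cycle in seconds
    # Trapezoidal approximation of the integral of PI(t) over the cycle.
    area = sum((beat[i] + beat[i + 1]) / 2.0 * dt for i in range(len(beat) - 1))
    p_m = area / duration            # mean intensity over the cycle
    pi_p, pi_v = max(beat), min(beat)
    return (p_m - pi_v) / (pi_p - pi_v)
```

For a symmetric triangular pulse the mean lies halfway between valley and peak, so K is 0.5. Because K is a ratio of differences, it does not change when the signal is scaled or offset.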
- 6.
Other waveform features
The slope of ascending branch and descending branch can be calculated by PIp, AT, and DT. The area of ascending branch and descending branch can be obtained by integrating the PPG intensity. We added seven features to the model, including the ascending slope of the PPG signal (AS), the descending slope of the PPG signal (DS), the sum of ascending branch value of the PPG signal (SA), the average of ascending branch value of the PPG signal (AA), the sum of descending branch value of the PPG signal (SD), the average of descending branch value of the PPG signal (AD), and the sum ratio of the ascending branch value to the descending branch value of the PPG signal (SR).
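The seven branch features can be sketched as below. Several details are assumptions, because the text does not give the exact formulas: the slopes are taken as PIp divided by AT and DT, the branch "sums" are plain sums of the samples rather than time-scaled integrals, and the peak sample is counted in the ascending branch only. The function name `branch_features` is hypothetical.

```python
from typing import Dict, Sequence


def branch_features(beat: Sequence[float], fs: float, valley_idx: int,
                    peak_idx: int, next_valley_idx: int) -> Dict[str, float]:
    """Slope, sum, and average features of the two PPG branches."""
    asc = beat[valley_idx:peak_idx + 1]             # valley .. peak
    desc = beat[peak_idx + 1:next_valley_idx + 1]   # after peak .. next valley
    at = (peak_idx - valley_idx) / fs               # ascending time (s)
    dt = (next_valley_idx - peak_idx) / fs          # descending time (s)
    pi_p = beat[peak_idx]
    sa, sd = sum(asc), sum(desc)
    return {
        "AS": pi_p / at,         # ascending slope (assumed PIp / AT)
        "DS": pi_p / dt,         # descending slope (assumed PIp / DT)
        "SA": sa,
        "AA": sa / len(asc),
        "SD": sd,
        "AD": sd / len(desc),
        "SR": sa / sd,
    }
```

If the slopes are instead meant as rise over run, the numerator would be the peak-to-valley difference rather than PIp alone.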
3.2. Regression Model
Random forest (RF) [41] uses decision trees as weak learners and combines multiple decision trees, each trained on a random sample drawn with replacement, to make predictions. It can be used in classification and regression tasks: the final result is obtained by voting in classification and by averaging in regression. Even if there is a nonlinear relationship between the input and the label, it can still maintain good performance. Each constructed decision tree is different, which reduces the bias and variance of the prediction results. Because the prediction results of many trees are combined, RF is not sensitive to outliers, resists overfitting, and is stable.
Compared with random search, grid search is time-consuming and prone to the curse of dimensionality, so we used the bootstrap method to train the random forest. The number of decision trees was set to 100, the maximum depth of the trees was set to 50, min_samples_split was set to 2, and min_samples_leaf was set to 1.
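The hyperparameters above map directly onto a regressor configuration. The sketch below assumes scikit-learn, which is suggested by the parameter names min_samples_split and min_samples_leaf but is not named in the text. The `RF_PARAMS` dictionary, the `build_rf` helper, and the fixed `random_state` are illustrative choices.

```python
# Hyperparameters as stated in the text; bootstrap sampling is enabled.
RF_PARAMS = {
    "n_estimators": 100,      # number of decision trees
    "max_depth": 50,          # maximum depth of each tree
    "min_samples_split": 2,
    "min_samples_leaf": 1,
    "bootstrap": True,        # sample training records with replacement
}


def build_rf(random_state: int = 0):
    """Create an untrained random forest regressor (assumes scikit-learn)."""
    # Imported lazily so the configuration can be inspected without
    # scikit-learn installed.
    from sklearn.ensemble import RandomForestRegressor
    return RandomForestRegressor(random_state=random_state, **RF_PARAMS)
```

One such model would be fitted for SBP and another for DBP, each on its own selected feature set, for example with `build_rf().fit(X_train, y_sbp)`, where `X_train` and `y_sbp` are placeholder names.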
3.3. GCMI
MI is a measure of the mutual dependence between two random variables [42]. It quantifies how much information about one variable is obtained by observing the other [43]. It is a non-negative value, with higher values indicating stronger dependence between the variables. When two random variables follow the Gaussian distribution and X is a multidimensional vector, the calculated MI is the GCMI [29]. It can be obtained by the following formula:
$$I(X;Y) = \frac{1}{2\ln 2}\ln\frac{|\Sigma_X|\,|\Sigma_Y|}{|\Sigma_{XY}|} \qquad (5)$$
where Σ_X and Σ_Y are the covariance matrices of X and Y, respectively, and Σ_XY is the covariance matrix of the joint variables (X, Y). The detailed feature selection process is shown in Algorithm 1. First, the GCMI between each feature and SBP or DBP is calculated. If the value is zero, the corresponding feature is removed. Because of the redundancy between features, a group that simply combines the individually top-ranked features does not necessarily have the highest GCMI. Therefore, we took a backward elimination approach, gradually removing the feature that contributed the least to the group. Meanwhile, we set a threshold. When the minimum MI loss of the group after eliminating a feature exceeded the threshold, the feature selection process was stopped. The remaining features formed the selected feature set.
Algorithm 1. GCMI-based feature selection method.
Input: Feature set F = {f1, f2, …, fn}, the label C
Output: Selected features F′
Steps:
(1) For i = 1 to n, do
(2) If I(fi, C) > 0, then F′ = F′ ∪ {fi}
(3) end
(4) Repeat
(5) Calculate I1 = I(F′, C)
(6) For i = 1 to n, do
(7) F2 = F′ \ {fi}, Calculate I2 = I(F2, C)
(8) end
(9) If I1 − min(I2) ≤ 0.002, then
(10) F′ = F′ \ {fi} | min(I2)
(11) Until I1 − min(I2) > 0.002
(12) Return F′
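The selection procedure can be sketched in plain Python as follows. This is an illustrative reading, not the authors' code. The GCMI estimate maps each variable to standard-normal scores through its ranks and then applies the Gaussian MI expression in bits, without the bias correction that reference implementations may add. "Minimum MI loss" is read as the smallest drop in group MI over all single-feature removals. All function names are hypothetical, and the features are assumed not to be exact copies of each other or of the label.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def copula_normalize(column):
    """Map a sample to standard-normal scores via its empirical ranks."""
    n = len(column)
    order = sorted(range(n), key=column.__getitem__)
    z = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        z[idx] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return z


def log_det_cov(cols):
    """Log-determinant of the sample covariance of the columns (Cholesky)."""
    n, d = len(cols[0]), len(cols)
    mean = [sum(c) / n for c in cols]
    cov = [[sum((cols[a][k] - mean[a]) * (cols[b][k] - mean[b])
                for k in range(n)) / (n - 1) for b in range(d)] for a in range(d)]
    chol = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(d))


def gcmi(feature_cols, label_col):
    """Gaussian-copula MI (bits) between a group of features and a label."""
    if not feature_cols:
        return 0.0
    x = [copula_normalize(c) for c in feature_cols]
    y = [copula_normalize(label_col)]
    return (log_det_cov(x) + log_det_cov(y) - log_det_cov(x + y)) / (2.0 * math.log(2.0))


def select_features(features, label, threshold=0.002):
    """Backward elimination on a dict {name: column}; returns kept names."""
    # Drop features that share no information with the label.
    kept = [name for name in features if gcmi([features[name]], label) > 0.0]
    while len(kept) > 1:
        i1 = gcmi([features[n] for n in kept], label)
        # MI lost by removing each feature from the current group.
        loss = {n: i1 - gcmi([features[m] for m in kept if m != n], label)
                for n in kept}
        weakest = min(loss, key=loss.get)
        if loss[weakest] > threshold:
            break  # every remaining feature matters more than the threshold
        kept.remove(weakest)
    return kept
```

The procedure would be run twice, once with SBP and once with DBP as the label, which is how two different feature subsets can arise from the same 25 candidates.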