A Non-Invasive Continuous Blood Pressure Estimation Approach Based on Machine Learning

Considering the existing issues of traditional blood pressure (BP) measurement methods and non-invasive continuous BP measurement techniques, this study aims to establish the systolic BP and diastolic BP estimation models based on machine learning using pulse transit time and characteristics of pulse waveform. In the process of model construction, the mean impact value method was introduced to investigate the impact of each feature on the models and the genetic algorithm was introduced to implement parameter optimization. The experimental results showed that the proposed models could effectively describe the nonlinear relationship between the features and BP and had higher accuracy than the traditional methods with the error of 3.27 ± 5.52 mmHg for systolic BP and 1.16 ± 1.97 mmHg for diastolic BP. Moreover, the estimation errors met the requirements of the Advancement of Medical Instrumentation and British Hypertension Society criteria. In conclusion, this study was helpful in promoting the practical application of methods for non-invasive continuous BP estimation models.


Introduction
According to the World Health Statistics 2018 published by the World Health Organization, the number of deaths caused by cardiovascular diseases (CVDs) accounted for 44% of deaths due to noncommunicable disease (NCD) in 2016 globally, which is as high as 17.90 million deaths [1]. Hypertension is one of the strongest risk factors for CVDs. About 50%-75% of strokes and 40%-50% of myocardial infarctions are associated with elevated blood pressure (BP) [2]. As an essential parameter monitoring the heart and vascular functions of the human body, BP is highly important for the prevention and diagnosis of CVDs.
However, BP may fluctuate and is closely correlated with target organ damages in hypertension [3]. Emotional fluctuations, strenuous exercise, and irrational use of medicines can cause changes in BP [4]. Traditional clinic-based measurements often fail to evaluate patients' true BP objectively because of "white-coat hypertension" or "masked hypertension" [5]. Besides, the clinically commonly used oscillometric-based measurement technologies, such as mercury sphygmomanometer, electronic sphygmomanometer, and dynamic sphygmomanometer, can only intermittently measure BP and are not able to provide continuous long-term BP data [6]. However, the continuous BP measurement method, such as the direct BP measurement, overcomes the drawbacks of traditional measurement methods. It can complete BP measurement in each cardiac cycle and monitor BP changes more accurately. Therefore, it is recognized as the "gold standard" for BP monitoring internationally [7].
analyzed. Then, the BP estimation models were built by machine learning based on the features of dimensionality reduction, and the model parameters were optimized by optimization algorithms. In the end, whether the present models have higher estimation accuracies than the traditional multiple linear regression models and the PTT-based BP estimation models was explored, and conclusions were reached according to AAMI and the British Hypertension Society (BHS) criteria.

Data Collection
The data in this study were retrieved from the MIMIC-III Waveform Database, a multiparameter critical care database open to the public at the Massachusetts Institute of Technology (MA, USA) [26]. The MIMIC-III Waveform Database Matched Subset contained 22,317 waveform records and 22,247 numeric records from patients; some of the waveform records provided biosignals, including ECG, PPG, and ABP signals, captured simultaneously by monitoring beds in the intensive care unit. The database not only ensured data diversity but also offered ABP signals as a standard comparison, providing a solid data foundation for the construction of BP estimation models. In this study, a total of 772 sets of waveform data were acquired online using the WFDB Toolbox in which the function RDSAMP allowed users to load PhysioNet waveform data into MATLAB's workspace [27]. All experiments in this study were based on the MATLAB platform. The distribution of BP in the experimental dataset is shown in Figure 1. The plots demonstrated that the data were widely distributed, covering from low BP to high BP values, and could be used to effectively evaluate the predictive power of the models. based on the features of dimensionality reduction, and the model parameters were optimized by optimization algorithms. In the end, whether the present models have higher estimation accuracies than the traditional multiple linear regression models and the PTT-based BP estimation models was explored, and conclusions were reached according to AAMI and the British Hypertension Society (BHS) criteria.

Data Collection
The data in this study were retrieved from the MIMIC-III Waveform Database, a multiparameter critical care database open to the public at the Massachusetts Institute of Technology (MA, USA) [26]. The MIMIC-III Waveform Database Matched Subset contained 22,317 waveform records and 22,247 numeric records from patients; some of the waveform records provided biosignals, including ECG, PPG, and ABP signals, captured simultaneously by monitoring beds in the intensive care unit. The database not only ensured data diversity but also offered ABP signals as a standard comparison, providing a solid data foundation for the construction of BP estimation models. In this study, a total of 772 sets of waveform data were acquired online using the WFDB Toolbox in which the function RDSAMP allowed users to load PhysioNet waveform data into MATLAB's workspace [27]. All experiments in this study were based on the MATLAB platform. The distribution of BP in the experimental dataset is shown in Figure 1. The plots demonstrated that the data were widely distributed, covering from low BP to high BP values, and could be used to effectively evaluate the predictive power of the models.

Preprocessing and Feature Extraction
The ECG and PPG signals provided by the MIMIC-III Waveform Database were weak electrical signals, susceptible to myoelectric interference and baseline drift during signal collection, and affected the accuracy of signal feature extraction. Therefore, to maximize the useful information of the original signals, after removing the data segments with irregularities and missing waveforms in the database, the wavelet threshold denoising method and the cubic spline interpolation method were used, respectively, to denoise the ECG and PPG signals [28].
To construct a dataset for the BP estimation models, it was necessary to accurately extract the features of the original signals and select effective features, improving the generalization and reducing overfitting. In this study, 14 BP-related features, including PTT, heart rate (HR), and characteristics of pulse waveform, were selected, as shown in Figure 2. The detailed relationships and definitions are as follows:

Preprocessing and Feature Extraction
The ECG and PPG signals provided by the MIMIC-III Waveform Database were weak electrical signals, susceptible to myoelectric interference and baseline drift during signal collection, and affected the accuracy of signal feature extraction. Therefore, to maximize the useful information of the original signals, after removing the data segments with irregularities and missing waveforms in the database, the wavelet threshold denoising method and the cubic spline interpolation method were used, respectively, to denoise the ECG and PPG signals [28].
To construct a dataset for the BP estimation models, it was necessary to accurately extract the features of the original signals and select effective features, improving the generalization and reducing overfitting. In this study, 14 BP-related features, including PTT, heart rate (HR), and characteristics of pulse waveform, were selected, as shown in Figure 2. The detailed relationships and definitions are as follows:

PTTx Features
In 1979, Hughes proposed a theoretical model describing the relationship between BP and Young's modulus of elasticity of aorta, and defined as follows [29]: where E0 is the Young's modulus for zero pressure, γ is the vessel characteristic parameter, and P is the BP. In arteries, the relationship between the elastic modulus of the arterial wall and the pulse wave velocity (PWV) can be expressed by the Moens-Korteweg equation: where D is the pulse transit distance, K is the Moens constant, t is the arterial wall thickness, d is the inner diameter of arteries in equilibrium, and ρ is the blood density.
Equations (1) and (2) together demonstrated a close correlation between PTT and BP. In this study, the PTT features were obtained by calculating the time from the R-peak of the ECG signal to the corresponding pulse feature point in each cardiac cycle. The following three PTT features were selected for this: the time from the R-peak to the pulse start point b (PTTb), the time from the R-peak to the point with maximum slope a (PTTa), and the time from the R-peak to the pulse peak point c (PTTc).

K Value
The K value is closely related to the peripheral resistance and the hardening degree of the arterial wall, and is one of the important physiological indicators for clinical research of CVDs [30]. The K value is dimensionless, related only to the area of the pulse wave, and defined as follows: where = ( ) .

PTTx Features
In 1979, Hughes proposed a theoretical model describing the relationship between BP and Young's modulus of elasticity of aorta, and defined as follows [29]: where E 0 is the Young's modulus for zero pressure, γ is the vessel characteristic parameter, and P is the BP. In arteries, the relationship between the elastic modulus of the arterial wall and the pulse wave velocity (PWV) can be expressed by the Moens-Korteweg equation: where D is the pulse transit distance, K is the Moens constant, t is the arterial wall thickness, d is the inner diameter of arteries in equilibrium, and ρ is the blood density.
Equations (1) and (2) together demonstrated a close correlation between PTT and BP. In this study, the PTT features were obtained by calculating the time from the R-peak of the ECG signal to the corresponding pulse feature point in each cardiac cycle. The following three PTT features were selected for this: the time from the R-peak to the pulse start point b (PTTb), the time from the R-peak to the point with maximum slope a (PTTa), and the time from the R-peak to the pulse peak point c (PTTc).

K Value
The K value is closely related to the peripheral resistance and the hardening degree of the arterial wall, and is one of the important physiological indicators for clinical research of CVDs [30]. The K value is dimensionless, related only to the area of the pulse wave, and defined as follows: where PPG m = 1 T PPG(t)dt.

HR
If HR is accelerated while holding the cardiac output and peripheral resistance constant, due to the shortened diastole, the blood flows to the peripheral blood vessels is reduced while the blood in the aorta is increased, resulting in increased DBP and SBP, and vice versa. The HR in this study was calculated using the R-R interval obtained by calibrating the R-peak of the ECG signal.

Characteristics of Pulse Waveform
The characteristics of pulse waveforms selected in this study included relative time for rising Tupr = Tup/T, relative time for falling Tdownr = Tdown/T, main wave rising slope Cslope = Hc/Tup, relative height of the maximum slope point Har = Ha/Hc, relative height of the minimum slope point Her = He/Hc, relative height of the dicrotic wave peak point Hgr = Hg/Hc, relative area of systole S1, relative area of diastole S2, and area ratio of systole to diastole S1/S2.

Normalization and Dimensionality Reduction of Features
For the preprocessed data, 80% of the data was randomly selected as the training set, and the remaining 20% was used as the test set. In machine learning, different feature vectors represent different evaluation indicators. These feature vectors usually have different dimensions; therefore, the data need to be normalized to improve the comparability among features. At the same time, normalization can reduce the adverse effects caused by outliers and speed up the gradient descent to find the optimal solution. In this study, the min-max normalization method was used to linearly transform the data to the range of [0, 1]. The mapping function is as follows: In this study, the training set and the test set were normalized altogether to effectively solve the inaccurate prediction problem due to the broad range of the training set and the narrow range of the test set.
To verify the validity of the features and remove redundant features, this study used the mean impact value (MIV) method to evaluate the importance of each feature to the BP estimation models and remove the features with small MIVs to achieve dimensionality reduction [31]. MIV could improve the efficiency of feature extraction while simplifying the model structure and upgrading the model performance. The sign of MIV represented the direction of correlation, and the absolute value represented the degree of the impact. In this study, two different feature combinations were selected as input to the SBP and DBP models, respectively. The detailed calculation process is as follows: (1) After finishing the model training, two new training sets X1 and X2 were generated by transforming each input variable value of a feature in the training set X by ±10%.
(2) The X1 and X2 were simulated to obtain the results P1 and P2, and the values of P1-P2 were calculated to get the impact values (IV) of the feature on the model output. The MIV was calculated by averaging the IVs by the number of observations.
(3) The MIVs of all features were calculated and then sorted according to their absolute values. Then, the relative contribution ratio of each feature to the output was calculated, according to the following equation:

Modeling Methods
Various factors influence the BP, and the relationships between features and BP are complicated and lack clear mechanisms. To adapt to the nonlinearity of the dataset and overcome the shortcomings of traditional fitting methods, the support vector regression (SVR) was used to construct the SBP and DBP estimation models. Further, the proposed models were compared with the traditional models using the Multivariate Linear Regression (MLR) models and the PTTc-based SVR models.

SVR
SVR had many unique advantages in solving small sample and nonlinear regression problems. In the SVR, the input sample x was mapped into a high-dimensional feature space by the nonlinear mapping Φ(x), and then a linear model was built in this feature space to estimate the regression function. The equation used was as follows: where ω is the weight vector and b is the threshold. In this study, for a given training set, the ε-insensitive loss function was used, and the corresponding SVR was called ε-SVR. Thus, the following constrained optimization problems needed to be solved: where c is a penalty factor, and ξ i and ξ * i are different relaxation factors. For easier computation, the Lagrange multipliers were introduced to transform the aforementioned constrained optimization problems into a dual problem. The solution of Equation (6) was as follows: where, α i and α * i are Lagrange multipliers corresponding to support vectors (SVs), l is the number of SVs, and K(x i , x) is a kernel function. However, in SVR, different kernel functions had great impacts on the fitting results. The preferred kernel function in this study was the radial basis function (RBF), which was as follows: where γ is the kernel parameter. Compared with the linear kernel, the RBF kernel projected the dataset into a higher-dimensional space nonlinearly and "nonlinearized" the original linear algorithm, making it possible to effectively deal with the nonlinear relationship between features and BP. Compared with the polynomial kernel, it had fewer tuning parameters and reduced the complexity in model selection. Equations (7) and (9) indicate that selecting an appropriate penalty factor c, kernel parameter γ, and loss function ε could effectively improve the expansion of the SVR model. However, no uniform guidelines were present for parameter selection. How to quickly and effectively choose parameters was the key to the predictive power of the model. In this study, the LIBSVM toolbox was used to perform SVR model construction [32].

MLR
MLR has been widely applied as a method to analyze the correlation, correlation direction, and strength between multiple independent variables and the dependent variable. In this study, an MLR model was built for the comparative purpose, and the model parameters were estimated using the least squares method. The model was expressed as follows: where P is BP, β 1 , β 2 , . . . , β k are model parameters, and x 1 , x 2 , . . . , x k are input features.

Parameter Optimization
For the aforementioned SVR model parameters, the traditional method allowed c, γ, and ε to take values within a certain range. Then, according to a set of selected parameters, the training set was used as the original dataset to calculate the model accuracy, and the set of parameters giving the highest accuracy was taken as the optimal parameters. Besides being time-consuming and laborious, this method tended to fall into a local optimal solution, which was not good for parameter optimization in a wide range. Therefore, the genetic algorithm (GA) was used in this study for parameter optimization.
GA is a computational model that simulates the evolutionary process according to the natural selection of Darwin's theory of evolution and the genetic mechanisms, which includes a method of searching for optimal solutions by simulating the natural process of evolution. Adopting the probabilistic optimization method, GA could automatically acquire and direct the optimized searching space with no definite rule and adjust the searching direction adaptively. It had a superior global optimization capability. The algorithm flowchart is shown in Figure 3. The detailed steps were as follows: (1) Binary coding. Binary coding of c, γ, and ε was performed to make a binary string, which served as a corresponding individual solution in the solution space, that is, the parameter ranges. Individuals made up a population.
(2) Initial population. A population was randomly generated and input into the fitness function to calculate and evaluate the fitness score of each individual.
(3) Selection, crossover, and mutation. For the initial population, the selection was performed according to the fitness scores, crossing over according to the crossover probability, and mutating according to the mutation probability to generate an offspring population.
(4) Evaluation and saving of the fitness of offspring individual. The fitness score of each individual was evaluated in the offspring population, and the local optimal solution was output.
(5) Output optimal solution. Once the max generation of the genetic operation was reached, the decoded optimal parameters were output. was used as the original dataset to calculate the model accuracy, and the set of parameters giving the highest accuracy was taken as the optimal parameters. Besides being time-consuming and laborious, this method tended to fall into a local optimal solution, which was not good for parameter optimization in a wide range. Therefore, the genetic algorithm (GA) was used in this study for parameter optimization.
GA is a computational model that simulates the evolutionary process according to the natural selection of Darwin's theory of evolution and the genetic mechanisms, which includes a method of searching for optimal solutions by simulating the natural process of evolution. Adopting the probabilistic optimization method, GA could automatically acquire and direct the optimized searching space with no definite rule and adjust the searching direction adaptively. It had a superior global optimization capability. The algorithm flowchart is shown in Figure 3. The detailed steps were as follows: (1) Binary coding. Binary coding of c, γ, and ε was performed to make a binary string, which served as a corresponding individual solution in the solution space, that is, the parameter ranges. Individuals made up a population.
(2) Initial population. A population was randomly generated and input into the fitness function to calculate and evaluate the fitness score of each individual.
(3) Selection, crossover, and mutation. For the initial population, the selection was performed according to the fitness scores, crossing over according to the crossover probability, and mutating according to the mutation probability to generate an offspring population.
(4) Evaluation and saving of the fitness of offspring individual. The fitness score of each individual was evaluated in the offspring population, and the local optimal solution was output.
(5) Output optimal solution. Once the max generation of the genetic operation was reached, the decoded optimal parameters were output. The parameters for GA optimization were set as follows: [0, 100] for c, [0, 1000] for γ, and [0.01, 1] for ε; 20 for initial population, 0.7 for crossover probability, 0.01 for mutation probability, and 100 for max generation. The parameters for GA optimization were set as follows: [0, 100] for c, [0, 1000] for γ, and [0.01, 1] for ε; 20 for initial population, 0.7 for crossover probability, 0.01 for mutation probability, and 100 for max generation.

Models Validation and Evaluation
In this study, the average mean squared error (average MSE) obtained from the fivefold cross-validation (5-CV) was used to measure the model accuracy.
This meant that the training set was randomly divided into five groups; each group was used as the test set, and the remaining four groups were trained to get the SVR model. The MSE values from the five validations were averaged to get the average MSE.
The mean absolute deviation (MAD) and the standard deviation (STD) were treated as indicators for evaluating the predictive performance of models.
where y i is the actual value of BP andŷ i is the predicted value of BP.

Preprocessing Results
The ECG signals collected contained two major types of noise: myoelectric interference and baseline drift. In this study, the Symlet 8 (sym8) wavelet was used to decompose the ECG waveform into eight scales. The approximate component of scale 8 was directly zeroed to remove the baseline drift partially, and the detail component of scale 1 was directly zeroed to remove part of the myoelectric interference. The other components were processed by combining the "sqtwolog" fixed threshold and the soft threshold function to effectively eliminate the noise and retain the main components of the waveform. The denoising results are shown in Figure 4.

Models Validation and Evaluation
In this study, the average mean squared error (average MSE) obtained from the fivefold crossvalidation (5-CV) was used to measure the model accuracy.
This meant that the training set was randomly divided into five groups; each group was used as the test set, and the remaining four groups were trained to get the SVR model. The MSE values from the five validations were averaged to get the average MSE.
The mean absolute deviation (MAD) and the standard deviation (STD) were treated as indicators for evaluating the predictive performance of models.
where is the actual value of BP and is the predicted value of BP.

Preprocessing Results
The ECG signals collected contained two major types of noise: myoelectric interference and baseline drift. In this study, the Symlet 8 (sym8) wavelet was used to decompose the ECG waveform into eight scales. The approximate component of scale 8 was directly zeroed to remove the baseline drift partially, and the detail component of scale 1 was directly zeroed to remove part of the myoelectric interference. The other components were processed by combining the "sqtwolog" fixed threshold and the soft threshold function to effectively eliminate the noise and retain the main components of the waveform. The denoising results are shown in Figure 4.

GA-SVR BP Models
In this study, first the training set was used to construct the initial SVR models, with default values for all parameters. However, the experiments showed poor accuracies of the initial SVR models. Taking the SBP model as an example, the average MSE obtained from 5-CV was 337.37, clearly showing underlearning, indicating that the default parameters failed to produce an effective model. Therefore, GA was used for parameter optimization to boost model accuracy and avoid overlearning or underlearning. At the same time, it contributed to the faithful description of the feature impact on the model during the later dimensionality reduction.
During optimization, the average MSE calculated by 5-CV also served as the fitness value of individuals in each generation for fitness evaluation, as shown in Figure 6. The average MSE and best MSE curves for all generations during evolution were plotted. From the best MSE curve, it was seen that the predictive ability of the SBP estimation model stabilized after 10 generations, while that of the DBP model stabilized after 20 generations.
The parameters obtained by GA optimization, the default parameters, and the model accuracies before and after optimization are shown in Table 1. It was seen that the accuracy enhanced by GA optimization was significant. The best MSE of the SBP estimation was improved to 38.33, and the best MSE of the DBP estimation was improved to 5.73. Therefore, the GA optimization was necessary before the features were subjected to MIV dimension reduction.

GA-SVR BP Models
In this study, first the training set was used to construct the initial SVR models, with default values for all parameters. However, the experiments showed poor accuracies of the initial SVR models. Taking the SBP model as an example, the average MSE obtained from 5-CV was 337.37, clearly showing underlearning, indicating that the default parameters failed to produce an effective model. Therefore, GA was used for parameter optimization to boost model accuracy and avoid overlearning or underlearning. At the same time, it contributed to the faithful description of the feature impact on the model during the later dimensionality reduction.
During optimization, the average MSE calculated by 5-CV also served as the fitness value of individuals in each generation for fitness evaluation, as shown in Figure 6. The average MSE and best MSE curves for all generations during evolution were plotted. From the best MSE curve, it was seen that the predictive ability of the SBP estimation model stabilized after 10 generations, while that of the DBP model stabilized after 20 generations.
The parameters obtained by GA optimization, the default parameters, and the model accuracies before and after optimization are shown in Table 1. It was seen that the accuracy enhanced by GA optimization was significant. The best MSE of the SBP estimation was improved to 38.33, and the best MSE of the DBP estimation was improved to 5.73. Therefore, the GA optimization was necessary before the features were subjected to MIV dimension reduction.

GA-SVR BP Models
In this study, first the training set was used to construct the initial SVR models, with default values for all parameters. However, the experiments showed poor accuracies of the initial SVR models. Taking the SBP model as an example, the average MSE obtained from 5-CV was 337.37, clearly showing underlearning, indicating that the default parameters failed to produce an effective model. Therefore, GA was used for parameter optimization to boost model accuracy and avoid overlearning or underlearning. At the same time, it contributed to the faithful description of the feature impact on the model during the later dimensionality reduction.
During optimization, the average MSE calculated by 5-CV also served as the fitness value of individuals in each generation for fitness evaluation, as shown in Figure 6. The average MSE and best MSE curves for all generations during evolution were plotted. From the best MSE curve, it was seen that the predictive ability of the SBP estimation model stabilized after 10 generations, while that of the DBP model stabilized after 20 generations.
The parameters obtained by GA optimization, the default parameters, and the model accuracies before and after optimization are shown in Table 1. It was seen that the accuracy enhanced by GA optimization was significant. The best MSE of the SBP estimation was improved to 38.33, and the best MSE of the DBP estimation was improved to 5.73. Therefore, the GA optimization was necessary before the features were subjected to MIV dimension reduction.

GA-MIV-SVR BP Models
The constructed GA-SVR BP model already achieved a certain accuracy. It described the actual impact of features on the model in the study of the feature MIVs. The calculated MIV, relative contribution ratio, and cumulative contribution ratio of each feature with respect to the SBP and DBP models are listed in Tables 2 and 3. It was seen that the MIVs of those two models ranked differently. This demonstrated that the correlations of each feature with respect to SBP and DBP were different, consistent with the general physiological rules. In this study, the features contributing to 90% of cumulative contribution ratio were retained, and those with small MIVs were removed. It resulted in the first eight features in Table 2 being kept for the SBP estimation model and the first nine in Table 3 for the DBP model. The reduction in features could effectively increase the efficiency of feature extraction. Meanwhile, the feature reduction resulted in a new dataset, and the models based on the previous dataset needed to be modified accordingly. Therefore, to find the optimal parameters adapted to the new dataset and investigate whether the model accuracy was enhanced after dimensionality reduction, another round of GA optimization was needed for establishing the final BP estimation models. Similarly, the GA optimization process after dimensionality reduction is shown in Figure 7. The model accuracy comparison before and after the MIV process is shown in Table 4. It was seen that the best MSEs were further reduced. Although the accuracy improvements in the present round using the GA-MIV-SVR models were smaller than those in the last round using the GA-SVR models, it still proved that the MIV feature dimensionality reduction was effective, providing strong evidence for the correlation analysis between features and models.
Sensors 2019, 19, x FOR PEER REVIEW 11 of 18 In this study, the features contributing to 90% of cumulative contribution ratio were retained, and those with small MIVs were removed. It resulted in the first eight features in Table 2 being kept for the SBP estimation model and the first nine in Table 3 for the DBP model. The reduction in features could effectively increase the efficiency of feature extraction. Meanwhile, the feature reduction resulted in a new dataset, and the models based on the previous dataset needed to be modified accordingly. Therefore, to find the optimal parameters adapted to the new dataset and investigate whether the model accuracy was enhanced after dimensionality reduction, another round of GA optimization was needed for establishing the final BP estimation models. Similarly, the GA optimization process after dimensionality reduction is shown in Figure 7. The model accuracy comparison before and after the MIV process is shown in Table 4. It was seen that the best MSEs were further reduced. Although the accuracy improvements in the present round using the GA-MIV-SVR models were smaller than those in the last round using the GA-SVR models, it still proved that the MIV feature dimensionality reduction was effective, providing strong evidence for the correlation analysis between features and models.

Model Robustness and Comparison
The robustness of the constructed BP estimation models was verified using the test set with 20% data. To evaluate the proposed models in an objective manner, the MLR models and the PTTc-based SVR models were constructed from the same dataset and compared with the corresponding proposed models.
The robustness comparison among the GA-MIV-SVR BP models and the traditional MLR BP models and the PTTc-based SVR BP models is shown in Figure 8 and Table 5. It was seen that the final constructed SBP and DBP estimation models by GA-MIV-SVR in this study generated predicted values closest to the actual values, with the highest prediction accuracies among the three methods. The SBP error was 3.27 ± 5.52 mmHg, and the DBP error was 1.16 ± 1.97 mmHg, fully satisfying the AAMI criteria with mean error ≤5 mmHg and STD ≤8 mmHg. Although the DBP estimation errors by the other two methods met the criteria, the SBP errors failed. In addition, the three prediction accuracies were evaluated according to the BHS grading system, as shown in Table 6. The results

Model Robustness and Comparison
The robustness of the constructed BP estimation models was verified using the test set with 20% data. To evaluate the proposed models in an objective manner, the MLR models and the PTTc-based SVR models were constructed from the same dataset and compared with the corresponding proposed models.
The robustness comparison among the GA-MIV-SVR BP models and the traditional MLR BP models and the PTTc-based SVR BP models is shown in Figure 8 and Table 5. It was seen that the final constructed SBP and DBP estimation models by GA-MIV-SVR in this study generated predicted values closest to the actual values, with the highest prediction accuracies among the three methods. The SBP error was 3.27 ± 5.52 mmHg, and the DBP error was 1.16 ± 1.97 mmHg, fully satisfying the AAMI criteria with mean error ≤5 mmHg and STD ≤8 mmHg. Although the DBP estimation errors by the other two methods met the criteria, the SBP errors failed. In addition, the three prediction accuracies were evaluated according to the BHS grading system, as shown in Table 6. The results demonstrated that the cumulative errors of the SBP model constructed by the GA-MIV-SVR method were 77.8% (≤5 mmHg), 96.7% (≤10 mmHg), and 99.3% (≤15 mmHg), respectively, and of the DBP models were 98.7% (≤5 mmHg), 100% (≤10 mmHg), and 100% (≤15 mmHg), respectively, both met the grade A criteria of the BHS system.       To better analyze the correlations between the predicted and the actual values of SBP and DBP models, the Pearson correlation coefficient (r 2 ) and the Bland-Altman plots were both introduced. It was generally considered that 0.2 < r 2 < 0.4 indicated a weak correlation, 0.4 < r 2 < 0.6 a medium correlation, 0.6 < r 2 < 0.8 a strong correlation, and 0.8 < r 2 < 1.0 an extremely strong correlation. Figure 9 exhibits the Pearson correlation coefficient distributions and the Bland-Altman plots for the predicted and actual values of the SBP and DBP models by all three methods. It was shown that the models by GA-MIV-SVR showed extremely strong correlations between the predicted and the actual values, superior to the other two methods, indicating a highly linear relationship between the predicted values and the actual values. Besides, the Bland-Altman plots exhibited that for the SBP and DBP models by GA-MIV-SVR, the mean difference was 0.03 and 0.10 (close to 0) and 95.4% and 96.7% predicted values fell within the 95% limits of agreement, respectively, the [d − 1.96SD, d + 1.96SD]. This demonstrated that the predicted and the actual values were highly consistent, and the models proposed in this study could be used for non-invasive continuous BP estimation clinically.  To better analyze the correlations between the predicted and the actual values of SBP and DBP models, the Pearson correlation coefficient ( ) and the Bland-Altman plots were both introduced. It was generally considered that 0.2 < < 0.4 indicated a weak correlation, 0.4 < < 0.6 a medium correlation, 0.6 < < 0.8 a strong correlation, and 0.8 < < 1.0 an extremely strong correlation. Figure 9 exhibits the Pearson correlation coefficient distributions and the Bland-Altman plots for the predicted and actual values of the SBP and DBP models by all three methods. It was shown that the models by GA-MIV-SVR showed extremely strong correlations between the predicted and the actual values, superior to the other two methods, indicating a highly linear relationship between the predicted values and the actual values. Besides, the Bland-Altman plots exhibited that for the SBP and DBP models by GA-MIV-SVR, the mean difference was 0.03 and 0.10 (close to 0) and 95.4% and 96.7% predicted values fell within the 95% limits of agreement, respectively, the [ − 1.96 , + 1.96 ]. This demonstrated that the predicted and the actual values were highly consistent, and the models proposed in this study could be used for non-invasive continuous BP estimation clinically.

Discussion
In this study, non-invasive continuous BP estimation models were proposed based on machine learning. Different from the traditional PTT-based BP estimation models, more BP-related features were introduced to model construction via information fusion. The proposed method for model construction effectively boosted the accuracy of BP estimation and enhanced the predictive performance and generalization ability of the BP estimation models.

Basis for Feature Selection and Dimensionality Reduction
The formation of BP in the human body is directly related to factors such as cardiac output, peripheral resistance, and degree of arteriosclerosis. In terms of feature selection, in addition to the features already proved to be related to the factors, such as PTT and K values, the features proposed in this study were more or less associated with these factors too. Although no definitive experiments proved the physiological mechanisms of correlation, some inferences made were based on physiological rules and related studies. For example, Tupr corresponded to the rapid ejection phase of the left ventricle; therefore, presumably it was related to cardiac output. Keeping cardiac output and peripheral resistance constant, changes in HR caused variations in BP. Cslope might be associated with blood viscosity [33]. Her and Hgr could tell something about aortic compliance [29]. S1 and S2 could also describe some characteristics of vascular elasticity and blood viscosity [34,35].
Based on the aforementioned data, this study attempted to describe the nonlinear relationship between features and BP by exploring the impact of each feature on the models. Compared with the use of the traditional Pearson correlation coefficient to describe the linear relationship, the MIV method was more rigorous and consistent with physiological rules. For example, the MIV of the K value was the largest in the SBP estimation model, while clinically the K value served as the key feature of the SBP estimation. The MIV of HR was the biggest in the DBP model, exhibiting the impact of HR on DBP. Of course, speculations proved to be unreasonable. For instance, the MIVs of Tupr ranked low in both SBP and DBP estimation models; hence, it was discarded in the subsequent study. In the absence of definite physiological mechanisms, the MIV method used in this study provided strong evidence and sufficient rationale for feature selection and dimensionality reduction. The MIV method not only described the nonlinear relationship between features and BP but also removed the redundant features, reducing the risk of insufficient generalization capabilities due to the inclusion of unrelated features.
The feature selection in this study still had some limitations. In the data collection stage, bounded by the database, the individual's personal information could not be retrieved as features. It was expected that the addition of some personal characteristics, such as height, weight, age, and gender, might further strengthen the predictive ability of the models. In a previous study [36], the body mass index, waist circumference, hip circumference, and waist-to-hip ratio were introduced to predict the increase in BP. In a previous study [37], the BP neural network was constructed based on the height, weight, age, gender, and other characteristics for BP estimation. Furthermore, the introduction of personal features might possibly eliminate the cumbersome calibration process.

Model Methods and Limitations
Compared with the traditional PTT-based linear models [12], this study introduced some reasonable PPG waveform features for model construction and effectively improved the accuracy of SBP and DBP estimations. Compared with the MLR models [13,38], the models established in this study based on machine learning demonstrated a more complicated nonlinear relationship between the waveform characteristics and BP and greatly enhanced the predictive ability and robustness of the models.
From the perspective of the proposed method, this study adjusted the model parameters by GA optimization to achieve a better performing model. Compared with the traditional manual adjustment of parameters or directly using the default values, the proposed models had minimum underlearning or overlearning, greatly improving the optimization efficiency and facilitating model calibration. From the application scenario, the proposed models were satisfactorily applied in the continuous non-invasive monitoring of BP clinically. Getting rid of the constraints of the traditional cuff sphygmomanometer, these models helped doctors to gather more information about the changes in BP of the patients and get early warnings of diseases, leading to better management of BP. However, the model method of this study had some limitations. First, as mentioned earlier, no personal information was introduced as features, and the study could not eliminate the influence of individual differences on the models. Second, no accuracy verification of long-term monitoring was conducted. It is yet impossible to know whether the accuracy of BP estimations will decrease when the models are monitored for a long time, for example, 1 week, 1 month, or 6 months. Last but not least, the data size in this study was not big enough. The diversity and size of data need to be further expanded. Additionally, the generalization ability of the models was affected to a certain degree.

Conclusions
This study integrated the characteristics of pulse waveforms and proposed a set of more effective BP estimation models using the GA-MIV-SVR method. The feature selection relied heavily on the characteristics describing the BP formation. The MIV measured the impact of features on the models to reduce the feature redundancy. In addition, the optimization method was introduced to improve the efficiency of parameter optimization; it effectively boosted the accuracies of SBP and DBP estimations by reducing underlearning or overlearning. This study was helpful for the wide application of non-invasive continuous BP measurement models. However, a more diverse and bigger dataset and long-term monitoring experiments are needed to further test the generalization ability and robustness of the models.