Estimation of Stroke Volume Variance from Arterial Blood Pressure: Using a 1-D Convolutional Neural Network

Background: We aimed to create a novel deep-learning model to estimate stroke volume variation (SVV), a widely used predictor of fluid responsiveness, from the arterial blood pressure waveform (ABPW). Methods: In total, 8,512,564 SVV data sets were collected from 557 patients and divided into three groups: training, validation, and test. Each data set comprised 10 s of ABPW and the corresponding SVV value recorded every 2 s. We built a convolutional neural network (CNN) model to estimate SVV from the ABPW, with a pre-existing commercialized model (EV1000) as the reference. We applied pre-processing, multichannel inputs, and dimension reduction to improve the CNN model with diversified inputs. Results: Our CNN model showed acceptable performance with sample data (r = 0.91, MSE = 6.92). Diversification of inputs, such as normalization, frequency, and slope of the ABPW, significantly improved the model correlation (r = 0.95), lowered the mean squared error (MSE = 2.13), and yielded a high concordance rate (96.26%) with the SVV from the commercialized model. Conclusions: We developed a new CNN deep-learning model to estimate SVV. Our CNN model appears to be a viable alternative when the dedicated medical device is not available, thereby allowing a wider range of application and resulting in optimal patient management.


Introduction
Given the importance of adequate fluid management to a patient's surgical outcome, stroke volume variation (SVV), an index of fluid responsiveness, is widely used to guide fluid therapy in mechanically ventilated patients [1]. Optimal oxygen delivery is one of the main goals of patient management and is associated with cardiac output maximization. Individualization of hemodynamic therapy and goal-directed therapy are emerging concepts for maximizing cardiac output based on the prediction of fluid responsiveness [2]. Evaluation of fluid responsiveness has evolved from the classical fluid bolus test (infusing a small volume of fluid within a short time, abandoned because it harms fluid non-responders through fluid overload) to the monitoring of dynamic hemodynamic parameters such as SVV, which quantifies the respiratory variation of stroke volume based on theoretical heart-lung interaction principles [3]. Goal-directed fluid therapy guided by SVV, a dynamic hemodynamic variable, has proven beneficial for surgical patients [4][5][6], with numerous studies demonstrating improved patient prognosis with SVV use [5,7,8].

Data acquisition and Pre-processing
Data were collected using a medical-record software application called Vital Recorder [22]. The collected variables included the electrocardiogram (ECG), ABP waveform, central venous pressure, pulmonary arterial pressure, and heart rate from the Bx50, and SVV from the EV1000. Given that breathing cycles occur every 6 to 8 s during mechanical ventilation and the EV1000 calculates SVV every 2 s, data sets were reconstructed as a 10 s ABP waveform, so as to include at least one breathing cycle, paired with the corresponding SVV value recorded at 2-s intervals. The input data consisted of three channels. The first channel was the ABP waveform normalized by removing the direct-current component. The second channel was frequency data from 1 to 12.2 Hz, calculated with the Fast Fourier Transform. The third channel was the slope of the ABP waveform. The output was the estimated SVV value at the corresponding time point. We used the SVV from the EV1000 as the reference.
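The three-channel construction described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name `build_channels` is invented. Note that the channels naturally differ in length (the 1-12.2 Hz band of a 10 s window contains far fewer points than the 1000-sample waveform); how the authors aligned them into a single multichannel input is not specified in the text.

```python
import numpy as np

FS = 100              # sampling rate (Hz); a 10 s window is 1000 samples
WINDOW = 10 * FS

def build_channels(abp, f_lo=1.0, f_hi=12.2):
    """Build the three input channels described in the text from one
    10 s ABP window (illustrative sketch)."""
    abp = np.asarray(abp, dtype=float)
    assert abp.shape == (WINDOW,)

    # Channel 1: remove the DC (0 Hz) component via the FFT.
    spec = np.fft.rfft(abp)
    spec[0] = 0.0
    ch1 = np.fft.irfft(spec, n=WINDOW)

    # Channel 2: magnitude spectrum restricted to 1-12.2 Hz
    # (frequency resolution of a 10 s window is 0.1 Hz).
    freqs = np.fft.rfftfreq(WINDOW, d=1.0 / FS)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    ch2 = np.abs(spec[band])

    # Channel 3: slope (time derivative) of the ABP waveform.
    ch3 = np.gradient(abp) * FS        # units: mmHg per second

    return ch1, ch2, ch3
```

The band 1-12.2 Hz covers the fundamental heart-rate frequency and its first harmonics, which is consistent with the rationale given later for the 'frequency' input.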

CNN and Model Improvements
We customized our model based on the referential CNN architecture VGGNet [23]. The model was built as follows. First, the input sequence was 1000 samples obtained at a 100 Hz sampling rate (10 s). The data set was collected over the entire surgical period and recorded every 2 s. The model consisted of 16 convolutional layers and a single fully connected layer. For the direct current (DC) offset, we converted the signal from the time domain to the frequency domain using the Fast Fourier Transform and removed the 0 Hz component [24]. For dimension reduction, one strided convolutional layer was used for every two convolutional layers [25]. The length of the input variable starts at 1000 and halves with each strided convolutional layer. After dimension reduction, the number of filters started at 64 and increased continuously to 1024. The kernel size started at 12 and decreased continuously to 3 (Figure 1). Hyperparameters of the CNN model were set following the experimental guidelines for temporal convolutional networks [26]. Considering memory and training time, we set the batch size to 64 and the learning rate to 10⁻⁵. The loss function evaluated the mean squared error (MSE) between the predicted and reference SVV values. The model was trained using Adam optimization, a gradient descent method [27].
We used deep learning packages implemented in the Keras library (https://github.com/keras-team/keras (accessed on 14 October 2020)) and Python 3.6. We trained the deep learning model on a GPU server with four GTX-1080Ti GPUs.
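The architecture can be summarized as a layer schedule. The exact filter and kernel progressions between the stated endpoints are not given in the text, so the values below are assumptions consistent with the description (16 convolutional layers in 8 pairs, a stride-2 reduction after each pair, filters doubling from 64 and capped at 1024, kernels shrinking from 12 to 3); the function name is illustrative.

```python
def layer_schedule(length=1000, filters=64, max_filters=1024,
                   kernels=(12, 9, 7, 5, 5, 3, 3, 3)):
    """Illustrative schedule, not the authors' exact configuration:
    each block is two convolutions followed by one stride-2 convolution
    that halves the sequence length; filters double up to a cap."""
    schedule = []
    for kernel in kernels:                       # 8 blocks -> 16 conv layers
        schedule.append(("conv", length, filters, kernel))
        schedule.append(("conv", length, filters, kernel))
        length = length // 2                     # strided dimension reduction
        schedule.append(("conv_stride2", length, filters, kernel))
        filters = min(filters * 2, max_filters)
    return schedule
```

Under these assumptions the sequence length falls from 1000 to 3 over the eight strided reductions, matching the stated halving behaviour.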

Statistical analysis
Variables are expressed as numbers (percentages), mean ± standard deviation, or median (interquartile range), as appropriate. Intergroup analyses of continuous variables were performed using Student's t-test, the Mann-Whitney U test, logistic regression, analysis of variance, or the Kruskal-Wallis test, and categorical variables were analyzed with the χ² test or Fisher's exact test. Linear regression analysis (Pearson correlation) was used to measure the association between the estimated and reference SVV. The Pearson correlation coefficient r ranges from −1 to +1, with a higher absolute value indicating a stronger association, i.e., better performance of the proposed model. The MSE and mean absolute error (MAE) were calculated to quantify the difference between the values. The SVV obtained with our CNN models, with changes in pre-processing and dimension reduction of the input variable, was compared with that of the EV1000 using the Bland-Altman method, calculating the bias (mean difference between measurements) and limits of agreement (bias ± 2 SDs of the bias); this calculation yields a positive number when the SVV from the proposed CNN model is higher than the SVV from the EV1000. A four-quadrant plot with the concordance rate was used for trend analysis. A 5% margin of error was used to calculate the concordance rate. We considered a p value < 0.05 to be statistically significant.
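The two agreement measures above can be sketched as follows. This is an illustrative reconstruction with invented function names; in particular, the text states only a "5% margin of error" for the concordance rate, so the interpretation of the exclusion zone (paired changes smaller than a threshold are ignored) is an assumption.

```python
import numpy as np

def bland_altman(svv_cnn, svv_ref):
    """Bias and limits of agreement (bias ± 2 SD of the differences);
    positive when the CNN SVV exceeds the EV1000 SVV."""
    diff = np.asarray(svv_cnn, float) - np.asarray(svv_ref, float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, (bias - 2 * sd, bias + 2 * sd)

def concordance_rate(svv_cnn, svv_ref, exclusion=0.5):
    """Four-quadrant concordance: percentage of paired consecutive
    changes that share the same sign. The exclusion threshold (here an
    absolute SVV change, an assumption) drops small changes near the
    origin before counting."""
    d_cnn = np.diff(np.asarray(svv_cnn, float))
    d_ref = np.diff(np.asarray(svv_ref, float))
    keep = (np.abs(d_cnn) > exclusion) & (np.abs(d_ref) > exclusion)
    agree = (d_cnn[keep] * d_ref[keep]) > 0
    return 100.0 * agree.mean()
```

For a model tracking the reference well, the bias is near zero, the limits of agreement are narrow, and the concordance rate approaches 100%.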

Experiment: Step-wise procedure
First, we checked the performance according to the model type using sample data. A total of six model types were tested by combining two basic preprocessing methods (min-max normalization and DC offset) and three reduction techniques (max pooling, average pooling, and using only convolution strides).
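The three reduction techniques being compared can be illustrated on a single channel with plain numpy; this is a sketch of the operations, not the trained model, and the function names are invented. Each variant halves a 1000-sample sequence to 500 samples, but only the strided convolution has learnable weights.

```python
import numpy as np

def conv_stride2(x, kernel):
    """1-D convolution ('same' padding) keeping every second output:
    a learnable dimension reduction, as in a strided convolutional layer.
    The kernel is reversed so this acts as cross-correlation, matching
    CNN convention."""
    return np.convolve(x, kernel[::-1], mode="same")[::2]

def max_pool2(x):
    """Max pooling, window 2: keeps only the larger of each pair."""
    return x.reshape(-1, 2).max(axis=1)

def avg_pool2(x):
    """Average pooling, window 2: keeps the mean of each pair."""
    return x.reshape(-1, 2).mean(axis=1)
```

Max pooling discards the smaller value in every pair, which is why (as discussed later) it loses the low- and mid-range information that ABP waveforms carry, while a strided convolution lets the model learn what to keep.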
Next, we tried to increase performance by providing additional inputs. The two additional inputs were the 'frequency' and 'slope' of the ABP waveform. 'Frequency' is information in the frequency domain; since it contains information about heart rate, it can be an important factor in the ejection of blood from the heart. 'Slope' is the time derivative of the ABP waveform. In each ABP waveform pattern, the slope of the epoch in which the pressure rises indicates how strongly the heart ejected blood.

Results
The characteristics of the patients are displayed in Table 1. Table 2 shows the hemodynamic variables, including the SVV measured with the EV1000, in all data sets. Table 3 shows the performance of the model according to the applied deep learning technique.
We added pre-processing and dimension reduction to the basic CNN model in the sample data set, which consisted of a training set (n = 33, data = 418,121) and a validation set (n = 14, data = 204,435). Removing the DC offset increased model performance (correlation = 0.83, MSE = 9.3), and replacing the dimension reduction with convolutional strides further increased performance (correlation: 0.83 and 0.91; MSE: 9.3 and 6.92) (Table 3). Table 3. Correlation between SVV derived from our CNN models with pre-processing and dimension reduction changes from the input variable and SVV from the commercialized model as a reference. Table 4 shows the model improvements (i.e., increases in the Pearson correlation coefficient) according to the diversified inputs of the ABP waveform. Inputs of the pre-processed ABP waveform and the slope of the ABP waveform showed good performance, as assessed by high Pearson correlation coefficients (r = 0.93 and 0.93, respectively). The combined inputs of the pre-processed and frequency signals showed improved performance, with the Pearson correlation coefficient increasing to 0.95. Figure 2 is a representative plot of the SVV derived from our model and the EV1000.
To assess the agreement between the SVV from the proposed CNN model and that of the EV1000, Bland-Altman analysis was performed. The mean difference (bias) was small (bias: −0.85; 95% CI, −2.88 to 0.71), which indicates that the difference between the SVV from the proposed CNN model and the EV1000 was small. In the four-quadrant plot, the proposed CNN model's ability to track the trend of the SVV from the EV1000 was assessed. The concordance rate from the four-quadrant plot was 96.26%, indicating the good trending ability of our proposed CNN model (Figure 3). In the deep learning model, errors (MAE) and MSE are used to measure model performance. Models with the raw ABP waveform and its frequency as input showed high errors (MAE = 1.55 and 1.52, respectively; MSE = 4.74 and 5.08, respectively), whereas inputs with the pre-processed ABP waveform and its slope showed lower errors (MAE = 1.30 and 1.38, respectively; MSE = 4.08 and 4.59, respectively). The combination of the pre-processed ABP waveform and its slope, and the combination of all input vectors, showed the lowest errors (MAE: 1.24 and 1.01, respectively; MSE: 3.18 and 2.13, respectively).

Discussion
In this study, we estimated SVV through a deep-learning approach, using arterial blood pressure waveform solely as the input. Our proposed CNN model estimated SVV with an accuracy comparable to the widely used commercialized model (EV1000). Of note, our proposed method did not require additional devices, nor manual calibration or manual input of patients' data, which may be associated with unreliable estimation of SVV. However, our model is yet to be tested in real-time general anesthesia surgery for clinical application.
The proposed CNN model may be utilized to overcome the limitations of current commercialized models. Although the benefit of SVV is well documented, clinicians currently must decide whether to use additional medical equipment (e.g., the EV1000) to obtain SVV. Therefore, even if a patient has an arterial line, SVV information may be unavailable if the additional equipment is not applied or not available. In general surgery, clinicians decide on the extent of monitoring according to the patient's general condition and the complexity of the surgery. However, unanticipated intraoperative hemodynamic instability due to surgical manipulation or massive bleeding may occur, and the fluid management of such patients would be enhanced by a dynamic hemodynamic variable such as SVV. Our model can provide SVV values when implemented in any device that acquires the ABP waveform, solving this spatial problem. Our model can also be economically beneficial to patients. Medical devices with commercialized models are costly, and patients are required to pay for single-use disposable medical supplies to obtain SVV measurements. One advantage of using our deep learning model to calculate SVV is that no additional expense or medical device is required. Therefore, with such deep-learning-based measurement, patients who would otherwise not receive SVV-based fluid management can receive better anesthesia management with SVV monitoring, without any additional financial burden. A further merit of the proposed algorithm is that it would not require additional monitoring devices: it could be applied as an add-on to any monitor with an arterial waveform, or even implemented in currently widely used bedside monitors, such as those from Philips, GE, and Mindray.
It would provide clinicians with real-time SVV without the additional burden of medical cost or devices, allowing SVV to be accessed much more easily and used more widely, and consequently assisting more rational and informed decisions regarding fluid management. Our deep learning model used a CNN technique. In the area of signal analysis, most deep learning models use recurrent neural network (RNN) models, such as Long Short-Term Memory (LSTM) [19,28]. We applied LSTM to estimate SVV on our sample data set; however, it took a very long time to train and did not perform well [29]. We also tested a repeated convolutional neural network (RCNN) model, but its performance was also low. The best performance was observed for the repeated CNN-strides model with dimension reduction. The strength of the RNN technique is that it can extract patterns with consideration of time continuity, whereas a CNN is more specialized in feature extraction but reflects time continuity less than RNN-series models [29]. It may be speculated that these differences made the CNN technique more suitable for our model than other well-known deep-learning techniques, since analyzing the signal itself was more important than considering time continuity in our study.
In a typical CNN model, normalization is performed by applying the min-max method to the input data [6,[16][17][18][19][20]30]. However, when analyzing the ABP waveform, min-max normalization may unintentionally discard valuable features of the waveform. Therefore, the DC offset of the signal was removed without applying min-max normalization. Moreover, max pooling, typically applied in CNN modelling, caused a problem with the ABP waveform. In medical image analysis with deep learning, high values often carry the essential information. In ABP signals, however, information is often present not only in high values but also in low or medium values; therefore, max pooling was not suitable for our model. Thus, we applied a strided convolutional layer for dimension reduction [25]. This significantly improved performance over the other models we tried. Our deep learning model's performance depended on the input data processing. Input of the ABP waveform or its frequency alone showed low performance, whereas data pre-processing significantly improved the performance of the deep learning model. Specifically, removing the DC offset and adding the slope of the data significantly improved the model. When our model was constructed with diverse inputs using several pre-processing methods, performance gradually improved as the number of inputs to the multichannel model increased. It may be speculated that the cross-verification of existing information, rather than the addition of new information, is what reduces over-fitting.
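The normalization argument above can be illustrated numerically: min-max scaling maps every beat onto [0, 1] and thereby discards absolute pulse-pressure amplitude, while DC-offset removal (here approximated by subtracting the mean) preserves it. A minimal sketch with synthetic beats and illustrative function names:

```python
import numpy as np

def min_max(x):
    """Min-max normalization: rescales to [0, 1], losing amplitude."""
    x = np.asarray(x, float)
    return (x - x.min()) / (x.max() - x.min())

def remove_dc(x):
    """DC-offset removal: subtracts the mean, preserving amplitude."""
    x = np.asarray(x, float)
    return x - x.mean()

# Two synthetic 'beats' with the same shape but different pulse
# pressures (20 vs 80 mmHg peak-to-peak) around the same mean pressure.
phase = 2 * np.pi * np.arange(100) / 100
weak = 80 + 10 * np.sin(phase)
strong = 80 + 40 * np.sin(phase)
```

After `min_max`, the two beats become numerically identical, so a model can no longer tell a weak ejection from a strong one; after `remove_dc`, their peak-to-peak amplitudes remain distinguishable, which is consistent with the choice made in this model.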

Conclusions
We developed a novel model for estimating SVV from the ABP waveform using a deep learning method. In total, 8,512,564 SVV data sets were collected from 557 patients, each consisting of the SVV at 2-s intervals and the ABP waveform for the 10 s preceding each SVV value. We built a CNN model with an existing commercialized model (EV1000) as the reference, and improved it by applying pre-processing of the waveform data, multichannel inputs, and dimension reduction. Compared with the existing commercialized monitoring device (EV1000), our initial model showed acceptable performance, with a Pearson correlation of r = 0.91 and a mean squared error of MSE = 6.92. The initial model's performance was improved by ABP waveform pre-processing and input diversification, such as the frequency and slope of the ABP, which increased the Pearson correlation to r = 0.95 and reduced the mean squared error to MSE = 2.13. Furthermore, our results showed a high concordance rate of 96.26% compared with the commercialized device (EV1000). Therefore, our new model can be applied in any environment where the ABP waveform is measured, without additional equipment, which may enhance the efficiency of patient management and eventually improve patient outcomes.
This study has several limitations. Firstly, since the model was trained with the SVV of the EV1000 as the reference, the SVV might not correctly reflect volume status if there was a problem with the ABP waveform. However, we manually inspected the waveforms for abnormalities, and SVV is reported to be a trusted variable of fluid responsiveness in adults [31]. Secondly, our model should be validated in a future study against SVV calculated from a gold standard method, such as esophageal Doppler echocardiography, as the reference. Lastly, external validation in different social and ethnic groups was not performed. We performed internal validation in an effort to overcome this limitation; however, care should be taken in generalizing our results.