Classification of Photoplethysmographic Signal Quality with Deep Convolution Neural Networks for Accurate Measurement of Cardiac Stroke Volume

Abstract: Because photoplethysmographic (PPG) signals carry a great deal of important physiological information, they have been widely employed to measure many physiological parameters. However, only a high-quality PPG signal can provide a reliable physiological assessment. Unfortunately, PPG signals are easily corrupted by motion artifacts and baseline drift during recording. Although several rule-based algorithms have been developed for evaluating the quality of PPG signals, few artificial intelligence-based algorithms have been presented. Thus, this study aims to classify the quality of PPG signals with two two-dimensional deep convolution neural networks (DCNNs) when the PPG pulse is used to measure cardiac stroke volume (SV) by impedance cardiography. An image derived from a PPG pulse and its differential pulse is used as the input to the two DCNN models. To quantify the quality of individual PPG pulses, the error percentage between the beat-to-beat SVs measured synchronously by our device and by the medis® CS 2000 is used to determine whether the pulse quality is high, middle, or low. Fourteen subjects were recruited, and a total of 3135 PPG pulses (1342 high quality, 73 middle quality, and 1720 low quality) were obtained. We used a traditional DCNN, VGG-19, and a residual DCNN, ResNet-50, to determine the quality levels of the PPG pulses. Both performed better than previous rule-based methods. The accuracies of VGG-19 and ResNet-50 were 0.895 and 0.925, respectively. Thus, the proposed DCNNs may be applied to the classification of PPG quality and may help improve SV measurement in impedance cardiography.


Introduction
The photoplethysmographic (PPG) signal has been widely used to measure many physiological parameters, such as pulse rate [1], blood oxygen saturation [2], blood pressure [3], respiration rate [4], and left ventricular ejection time (LVET) [5]. Noninvasive photoplethysmography uses one of two optical configurations, transmission or reflection [6], as shown in Figure 1. A light-emitting diode (LED) is typically used to shine low-intensity infrared light on the skin, and a portion of the light is absorbed, mainly by arterial and venous blood. In reflection PPG, the nonabsorbed light is reflected and detected by a photodiode, so the LED and photodiode are placed on the same side, as shown in Figure 1a. In transmission PPG, the nonabsorbed light is transmitted through the tissue and detected by a photodiode, so the LED and photodiode are placed on opposite sides, as shown in Figure 1b. With either method, the PPG signal represents changes in blood volume (Figure 1), although it cannot be used to quantify the absolute amount of blood.
Appl. Sci. 2020, 10, x FOR PEER REVIEW
The PPG signal measured by the reflection method is more easily corrupted by motion noise than that measured by the transmission method. Currently, wearable devices with a PPG sensor usually use the reflection method for physiological measurement. When users exercise while wearing these devices, the PPG signals are very vulnerable to motion artifacts. The common solution is to improve the device mechanism to reduce the effect of motion on the recorded PPG signal and thereby raise the measurement accuracy. However, accuracy ultimately depends on signal quality: the higher the quality of the PPG signal, the more accurate the parameters extracted from it. Therefore, classifying the quality of PPG signals is an important issue for the development of wearable devices.
PPG is a noninvasive optical measurement method in which the change of blood volume is linked to the physiological responses to circulatory events in peripheral blood vessels. Thus, its waveform bears regular morphological characteristics [7,8]. As shown in Figure 2, a PPG pulse has many physiological characteristics, including the main peak, dicrotic notch, pulse width, and amplitude. As a result, many researchers have used those characteristics (i.e., rule-based methods) to determine the quality of each PPG pulse. In addition, the signal quality index (SQI) represents the degree to which a PPG pulse is corrupted. Liu et al. employed fuzzy rules to determine the SQI of PPG pulses [9], and Fischer et al. applied the characteristics of the PPG waveform and a decision tree to classify its SQI [10]. Li et al. used the Bayesian hypothesis testing method to analyze the SQI [11]. All of these studies needed to adjust the thresholds of the rule-based method to get the best results. Recently, Liu et al. used a fuzzy neural network to evaluate the SQI [12]. Although they used an artificial intelligence method to gauge the quality of slightly corrupted PPG pulses, a rule-based method was still used to delete the severely corrupted PPG pulses.
The traditional approach to SQI assessment is to extract features from the PPG signal. It is well known that the morphological approach is sensitive to signal noise, which limits the robustness of the classification model [8]. Recently, deep learning techniques have been used to perform feature extraction by convolution computation [13]. As physiological signals, such as electrocardiograms (ECGs), electroencephalograms, and PPGs, are one-dimensional signals, several studies have used a one-dimensional deep convolution neural network (1D DCNN) to classify different arrhythmia types and the signal quality [14][15][16][17]. Some studies have converted the 1D signal to a two-dimensional (2D) signal by short-time frequency transform [18], wavelet transform [19], and power spectral density [20]. The 2D DCNN then used these images as the input to perform the classification. However, in these studies, a signal segment of about 2 to 5 s was converted to an image. Thus, the methods were only suitable for processing consistent, continuous signals.
Figure 2. The characteristics of a PPG pulse chiefly include the main peak, dicrotic notch, pulse width, and pulse amplitude.
When the heart is in the systolic phase, an amount of blood, called the stroke volume (SV), is pumped into the systemic circulation. During this period, the volume of the thoracic cavity changes. On this basis, Kubicek et al. proposed impedance cardiography (ICG), a noninvasive technique for measuring cardiac hemodynamic parameters [21]. In the SV calculation with ICG technology, LVET is one of the most important parameters and is conventionally measured from the ICG signal. As the signal-to-noise ratio of the ICG signal is very low, the accuracy of the LVET is not high enough. Liu et al. used a reflective PPG sensor placed on the neck to measure the LVET [5]. The LVETs measured from the PPG pulses were more accurate than those measured from the ICG pulses. Therefore, the accuracy of the SV measured by their proposed technique was higher than that of the traditional ICG technique.
As mentioned above, the higher the SQI of the PPG signal, the more accurate the physiological parameters measured from it. Unlike previous rule-based methods, our proposed approach does not require any predetermined features of the PPG signal. In this study, we used the raw morphology of PPG pulses to determine their quality levels. Thus, the aim of this study is to develop a novel SQI approach that uses a 2D DCNN to assess the quality of PPG pulses and thereby improve SV measurement. The PPG pulse and its differential pulse were segmented from the continuous PPG signal and then merged into an image. Both a 2D deep residual neural network (DRNN) and a 2D DCNN were used to determine the quality level of each PPG pulse. Fourteen healthy adult males participated in this experiment. The beat-to-beat SVs were measured simultaneously by our ICG device [5] and the medis® CS 2000 for about three minutes. We used the error percentage of the measured SVs, which represents the accuracy of the SV measurements, to define the SQI level of each PPG pulse. The results showed that the 2D DRNN could assess the SQI levels of the PPG pulses more effectively than the rule-based method and the general DCNN.

Impedance Cardiography
In 1966, Kubicek et al. proposed ICG, a noninvasive technique, to assess the continuous SV [21]. Equation (1) governs the ICG method:
$$\mathrm{SV} = r_b \left(\frac{L}{Z_0}\right)^2 \cdot \mathrm{LVET} \cdot \left.\frac{dZ}{dt}\right|_{\max} \quad (1)$$
where r_b is the blood resistivity, assumed to be a constant 150 ohm·cm; L is the distance (cm) between the two recording electrodes on the neck and chest; Z_0 is the base impedance (ohm) between the recording electrodes, representing the initial state of the thoracic cavity; and dZ/dt(max) is the absolute value (ohm/s) of the maximum change of the ICG impedance signal. According to Equation (1), the SV is directly proportional to both the LVET and dZ/dt(max). Figure 3 shows the synchronous ECG, ICG, and PPG signals, together with the differential ICG (DICG) and differential PPG (DPPG) signals. In this figure, the PPG signal appears to be corrupted by fewer motion artifacts than the ICG signal [5].
The LVET is defined in the DPPG signal as the time interval between the first zero-crossing point and the minimum point. LVETs measured from high-quality PPG pulses are therefore expected to be more accurate.
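The Kubicek relation, SV = r_b (L/Z_0)^2 × LVET × dZ/dt(max), and the DPPG-based LVET definition can be illustrated with a minimal sketch. The function names, the upward direction of the zero crossing, and the numerical inputs are assumptions made for illustration only; they are not taken from the device described in [5].

```python
def lvet_from_dppg(dppg, fs):
    """LVET in seconds: first zero crossing of a DPPG pulse to its minimum.

    Assumption: the zero crossing of interest is the first upward one
    (derivative going from <= 0 to > 0), i.e., the foot of the PPG pulse.
    """
    start = next(i for i in range(1, len(dppg)) if dppg[i - 1] <= 0 < dppg[i])
    # Index of the smallest derivative value after the crossing.
    end = min(range(start, len(dppg)), key=lambda i: dppg[i])
    return (end - start) / fs


def stroke_volume(lvet_s, dzdt_max, z0, length_cm, rho_b=150.0):
    """Kubicek estimate of SV in mL (ohm*cm * (cm/ohm)^2 * s * ohm/s = cm^3)."""
    return rho_b * (length_cm / z0) ** 2 * lvet_s * dzdt_max


# Hypothetical inputs: L = 30 cm, Z0 = 25 ohm, LVET = 0.3 s, dZ/dt(max) = 1.2 ohm/s.
sv = stroke_volume(lvet_s=0.3, dzdt_max=1.2, z0=25.0, length_cm=30.0)
print(f"SV = {sv:.2f} mL")
```

Because SV is proportional to LVET, a relative error in LVET carries over one-to-one into the SV estimate, which is why the quality of the pulse used to measure LVET matters.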

Method
The overall procedure of the proposed SQI classification model for PPG pulses is shown in Figure 4. The original ICG and PPG signals are measured by our ICG device [5]. The input PPG and DPPG signals are segmented into pulses at the zero-crossing points of the DPPG signal. The PPG and DPPG signals of one heart cycle are merged and transformed into an image. These images are then used as the input to the 2D DCNN, which classifies them into three SQI levels (high, middle, and low) defined by the error percentage of the measured SV relative to the reference.

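The step of merging a PPG pulse and its DPPG pulse into one image can be sketched as follows. The rendering details are not specified in this section, so this is only a minimal illustration under stated assumptions: each trace is scaled to its own amplitude range, and the two traces are stored as integer labels (1 for PPG, 2 for DPPG) rather than as colored lines. The function name `rasterize` is hypothetical.

```python
def rasterize(ppg, dppg, size=150):
    """Draw two traces into one size x size grid.

    Pixel values: 0 = background, 1 = PPG trace, 2 = DPPG trace.
    Row 0 is the top of the image, so larger sample values appear higher up.
    """
    image = [[0] * size for _ in range(size)]
    for value, trace in ((1, ppg), (2, dppg)):
        lo, hi = min(trace), max(trace)
        span = (hi - lo) or 1.0  # avoid division by zero for a flat trace
        for i, sample in enumerate(trace):
            col = i * (size - 1) // max(len(trace) - 1, 1)
            row = (size - 1) - round((sample - lo) / span * (size - 1))
            image[row][col] = value
    return image


# Tiny example: a rising PPG ramp and a falling DPPG ramp on a 4 x 4 grid.
for line in rasterize([0, 1, 2, 3], [3, 2, 1, 0], size=4):
    print(line)
```

Where the two traces cross, the trace drawn last (DPPG here) overwrites the earlier one; a multi-channel image would avoid this at the cost of a larger input.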

Data Acquisition
Our ICG device was described in a previous study [5], in which the analog ICG and PPG signals were digitized at a sampling frequency of 500 Hz. The PPG sensor was placed on the neck of the subject. The ICG and PPG signals were filtered to remove baseline drift and high-frequency noise using a second-order Butterworth bandpass filter with lower and upper cutoff frequencies of 0.2 and 10 Hz, respectively. Then, the DPPG and DICG signals were obtained from the PPG and ICG signals by the first-order discrete derivative, followed by a zero-phase (forward and reverse) second-order Butterworth lowpass filter with a cutoff frequency of 10 Hz. The first zero-crossing point of the DPPG signal during one heart cycle was used to segment the pulse. As the heart rates of the included subjects were not lower than 60 beats/minute, one heart cycle lasted at most 1 s, i.e., 500 samples at 500 Hz. Each image therefore consisted of two pulses, PPG and DPPG, whose length was set to 500 points. If a pulse was shorter than 500 points, it was zero-padded to 500 points. Figure 5 shows 150 × 150 images obtained from the segmented PPG (blue line) and DPPG (orange line) signals for the three SQI levels. In Figure 5a, the SQIs of the two PPG pulses are high because their morphologies within the systolic phase are perfect (i.e., they have a clear, distinct dicrotic notch and starting ejection point). The two PPG pulses in Figure 5b have a middle SQI: their morphology at the starting ejection point is good, but their dicrotic notches are not distinct in the PPG signals, so the values of their differential signals in the dicrotic notch zone may not exceed zero. As shown in Figure 5c, the two PPG pulses have low SQIs because their amplitudes or baselines have been greatly distorted by severe motion artifacts.
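The segmentation and zero-padding steps can be sketched in plain Python. This is a simplified illustration, not the device firmware: the Butterworth filtering is omitted, the onset is taken as an upward zero crossing of the derivative, and the `min_gap` refractory interval (used so that only the first crossing in each heart cycle is kept) is an assumption introduced here. All function names are hypothetical.

```python
import math


def first_difference(signal):
    """First-order discrete derivative (no smoothing filter applied here)."""
    return [b - a for a, b in zip(signal, signal[1:])]


def pulse_onsets(dppg, min_gap):
    """Upward zero crossings of the derivative, at least min_gap samples apart."""
    onsets = []
    for i in range(1, len(dppg)):
        crossing = dppg[i - 1] <= 0 < dppg[i]
        if crossing and (not onsets or i - onsets[-1] >= min_gap):
            onsets.append(i)
    return onsets


def segment_and_pad(signal, onsets, length=500):
    """Cut one pulse per onset-to-onset interval and zero-pad it to `length`."""
    pulses = []
    for start, stop in zip(onsets, onsets[1:]):
        pulse = list(signal[start:stop])[:length]  # truncation is an assumption
        pulses.append(pulse + [0.0] * (length - len(pulse)))
    return pulses


# Synthetic 75 beats/minute "PPG" sampled at 500 Hz: 400 samples per cycle.
fs = 500
ppg = [math.sin(2 * math.pi * n / 400) for n in range(1200)]
dppg = first_difference(ppg)
onsets = pulse_onsets(dppg, min_gap=fs // 4)
pulses = segment_and_pad(ppg, onsets)
print(onsets, [len(p) for p in pulses])
```

With a sampling frequency of 500 Hz and heart rates of at least 60 beats/minute, no pulse exceeds 500 samples, so the truncation branch is only a safeguard.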

Network Architectures
Since the number of samples was not large and the patterns did not differ greatly in their characteristics, 2D DCNNs were chosen to perform the classification task in this study. We built two 2D DCNNs based on the pretrained DRNN architecture with a 50-layer network (ResNet-50) [22] and the pretrained DCNN architecture with a 19-layer network (VGG-19) [23]. In the output layer, we replaced the 1000-unit fully connected layer with softmax activation by a single-unit fully connected layer with sigmoid activation. VGG-19 and ResNet-50 are the base models in this study and were pretrained for object recognition tasks on the ImageNet dataset [24]. The architectures of the two 2D DCNNs are shown in Figure 6, with detailed descriptions in Tables 1 and 2, respectively. In Table 1, all filters in VGG-19 are of size 3 × 3. Downsampling is performed directly by maximum pooling layers with a stride of 2, and batch normalization is performed right after each convolution and before ReLU activation. The two fully connected layers each have a size of 1024. For ResNet-50, the main idea is to skip blocks of convolutional layers by using shortcut connections, as shown in Figure 6. The dotted lines indicate that the dimensions of the input and output are different; in this case, a 1 × 1 convolution with a stride of 2 is used to perform the projection shortcut. The solid lines indicate that the dimensions of the input and output are the same; in this case, the identity shortcut is used. In Table 2, the filters in ResNet-50 follow two design rules. First, when the feature map sizes of the input and output are the same, the layers have the same number of filters. Second, when the feature map size is halved, the number of filters is doubled. Downsampling is performed directly by convolutional layers with a stride of 2, and batch normalization is performed right after each convolution and before ReLU activation. The network ends with a global average pooling layer with a 7 × 7 filter.
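The two shortcut types and the modified output layer can be written compactly. The residual notation (F, W_i, W_s) follows the usual formulation of ResNet [22]; the symbols h (the feature vector after global pooling), w, b, and the output \hat{d} are introduced here only for illustration and are assumptions rather than notation from this study.

```latex
\begin{align*}
  % Identity shortcut: input and output dimensions are the same (solid lines)
  \mathbf{y} &= \mathcal{F}(\mathbf{x}, \{W_i\}) + \mathbf{x} \\
  % Projection shortcut: dimensions differ, W_s is the 1 x 1 convolution (dotted lines)
  \mathbf{y} &= \mathcal{F}(\mathbf{x}, \{W_i\}) + W_s\,\mathbf{x} \\
  % Single sigmoid output unit replacing the 1000-way softmax layer
  \hat{d} &= \sigma\!\left(\mathbf{w}^{\top}\mathbf{h} + b\right)
           = \frac{1}{1 + e^{-(\mathbf{w}^{\top}\mathbf{h} + b)}}, \qquad \hat{d} \in (0, 1)
\end{align*}
```

A single sigmoid unit yields one continuous score per pulse, which is what allows the output to be thresholded into more than two quality levels even though training uses only two target values.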

Experimental Protocol
This study recruited fourteen healthy male subjects without cardiovascular disease or injured limbs. Their ages were between 22 and 29 years (22.7 ± 2.1 years, mean ± standard deviation), weights between 46 and 78 kg (61.8 ± 8.8 kg), heights between 165 and 188 cm (173.1 ± 6.1 cm), and heart rates between 65 and 78 beats/minute (70.5 ± 3.4 beats/minute). A commercial medical device based on ICG technology (medis® CS 2000, medis, Germany) was used to measure the beat-to-beat SV, which was taken as the reference value in this study. This experiment was approved by the Research Ethics Committee of China Medical University and Hospital (No. CMUH107-REC3-061), Taichung, Taiwan.
The measurement for each subject lasted three minutes. During the measurement, the four electrodes of the medis® CS 2000 were placed on the left side of the body. The four electrodes of our ICG device were placed on the right side of the body, and the PPG sensor was placed on the neck. The details of the placement of the ICG electrodes were described in our previous study [12]. We recorded the beats of the medis® CS 2000 and our ICG device synchronously. The data statistics are reported as mean ± standard deviation (SD).

Statistical Analysis
In this study, a PPG pulse was considered high quality when the error percentage between the SV measured by our ICG device and that measured by the medis® CS 2000 was less than 18%; there were 1342 high-quality pulses. A PPG pulse was considered middle quality when its error percentage was between 18% and 20%; there were 73 middle-quality pulses. A PPG pulse was considered low quality when its error percentage was larger than 20%; there were 1720 low-quality pulses. Table 3 shows the three levels of SQI for all subjects. For our proposed method, a PPG pulse is counted as true-positive (TP) when its quality level is correctly identified, false-positive (FP) when its quality level is incorrectly identified, true-negative (TN) when its quality level is correctly rejected, and false-negative (FN) when its quality level is incorrectly rejected. The performance of the proposed method was evaluated using accuracy, (TP + TN)/(TP + FP + FN + TN); precision, TP/(TP + FP); sensitivity, TP/(TP + FN); and specificity, TN/(TN + FP).
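The labeling rule and the four performance measures can be sketched as follows. The error percentage is assumed here to be the absolute difference between the two SV values divided by the reference SV; the exact definition and the handling of values that fall exactly on 18% or 20% are assumptions. Specificity is computed with the standard definition TN/(TN + FP). Function names are hypothetical.

```python
def sqi_label(sv_measured, sv_reference):
    """Label a pulse from its SV error percentage: <18 high, 18-20 middle, >20 low."""
    error_pct = abs(sv_measured - sv_reference) / sv_reference * 100.0
    if error_pct < 18.0:
        return "high"
    if error_pct <= 20.0:
        return "middle"
    return "low"


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical SV pairs (measured, reference) in mL.
for measured, reference in [(90.0, 100.0), (81.0, 100.0), (70.0, 100.0)]:
    print(measured, reference, sqi_label(measured, reference))

# Hypothetical confusion counts, not results from this study.
print(binary_metrics(tp=90, fp=10, tn=80, fn=20))
```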

Training Outcomes of Deep Convolution Neural Networks
The proposed VGG-19 and ResNet-50 models were trained on 1200 PPG pulses divided into two categories, high quality (d = 1) and low quality (d = 0). The high-quality samples were 400 pulses randomly chosen from the 1342 high-quality pulses, and the low-quality samples were 800 pulses randomly chosen from the 1720 low-quality pulses. The middle-quality pulses were not used to train the networks in this study because there were too few of them, only 73. To balance the sample numbers of the two levels, the high-quality samples were extended from 400 to 800. Figure 7a

Testing Outcomes of Deep Convolution Neural Networks
The testing samples included 1935 PPG pulses that did not overlap with the training samples. The high-quality, middle-quality, and low-quality samples comprised 942, 73, and 920 PPG pulses, respectively. When the output value of the 2D DCNN was between 0.8 and 1.0, between 0.5 and 0.8, or between 0 and 0.5, the PPG pulse was classified as high quality, middle quality, or low quality, respectively. Table 4 shows the performance of the VGG-19 and ResNet-50 models in the classification of the high- and low-quality levels. The average accuracy of the VGG-19 model (0.895) is lower than that of the ResNet-50 model (0.925). However, the sensitivity (0.970) and specificity (0.970) of the VGG-19 model are higher than those of the ResNet-50 model (0.915 and 0.920, respectively). Over all the testing data, the SV error is quite high, 33.5 ± 76.8 mL. Table 5 shows the SV errors for the three groups (high quality, middle quality, and low quality), as classified by the VGG-19 and ResNet-50 models. With either model, the high-quality group clearly had the smallest SV errors. Additionally, the SV errors obtained with the ResNet-50 model were lower than those obtained with the VGG-19 model for all three quality levels.
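The mapping from the network output to a quality level can be sketched as a simple thresholding function. The thresholds 0.8 and 0.5 come from the text; assigning an output of exactly 0.8 to the high-quality level and exactly 0.5 to the middle-quality level is an assumption, and the function name is hypothetical.

```python
def quality_from_output(p):
    """Map the sigmoid output of the network (0 to 1) to a quality level."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("expected a sigmoid output in [0, 1]")
    if p >= 0.8:
        return "high"
    if p >= 0.5:
        return "middle"
    return "low"


# Example output values of the kind a sigmoid unit can produce.
for output in (1.0, 0.6, 0.0):
    print(output, quality_from_output(output))
```

Because the networks are trained on only two target values (d = 1 and d = 0), the middle-quality level arises purely from this thresholding of the continuous output.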

Table 5 (recovered fragment). Low-quality group (N = 920): SV error 64.6 ± 102.1 (VGG-19) and 57.67 ± 95.4 (ResNet-50).
Figure 8 shows the results of SQI classification with the ResNet-50 model for PPG (blue line) and DPPG (orange line) signals moderately corrupted by baseline drift. The SQI level of each pulse was determined from the error percentage between the reference SV measured by the medis® CS 2000 and the SV measured by our ICG device. An error percentage below 18%, between 18% and 20%, or above 20% represents a high-quality, middle-quality, or low-quality PPG pulse, respectively. The first and third rows of the data are the SVs measured by the medis® CS 2000 and our ICG device, respectively, and the second and fourth rows are the corresponding LVETs. The fifth row is the error percentage of the SV. The red line represents the output value of the ResNet-50 model. If the output value is larger than 0.8, between 0.5 and 0.8, or less than 0.5, the PPG pulse is classified as high, middle, or low quality, respectively. The cross and circle symbols denote the first zero-crossing point and the minimum-value point of the DPPG pulse, respectively. The seventh PPG pulse in the figure is of high quality because it has a sharp valley in the starting ejection zone and a clear dicrotic notch. Its SV error percentage is correspondingly low, 0.02, and the output value of the ResNet-50 model for this pulse is 1.0. The second and third pulses both have low SQIs: although they have clear dicrotic notches, their shapes are flat in the starting ejection zones. Since their LVET errors are 80 and 97 ms, their SV error percentages are 0.49 and 0.42, respectively, and the output values of the ResNet-50 model for these two pulses are both 0. The fifth pulse has a middle SQI because it does not have a sharp valley in the starting ejection zone; its SV error percentage is 0.2, and the output value of the ResNet-50 model for this pulse is 0.6.
The first and third rows, and the second and fourth rows of the data correspond to the two SVs, and the two LVETs measured by medis ® CS 2000 and our ICG device, respectively. The fifth row of the data denotes the error percentage of the SV. The red line represents the output value of the ResNet-50 model. If the output value is larger than 0.8, between 0.5 and 0.8, and less than 0.5, then the PPG pulse will be classified as a high-, middle-and low-quality one, respectively. The cross and circle symbols denote the first zero-crossing point and minimum-value point of the DPPG pulse, respectively. For the seventh PPG pulse in the figure, it belongs to one of the PPG pulses with high quality because it has a sharp valley in the starting ejection zone and a clear dicrotic notch. Thus, its corresponding SV error percentage is found to be relatively low, 0.02, and the output value of the ResNet-50 model for this pulse is 1.0. In addition, the second and third pulses both belong to low SQI ones, although they have clear dicrotic notches and flat shape in the starting ejection zones. Since their LVET errors are 80 and 97 ms, their corresponding SV error percentages are found to be 0.49 and 0.42, respectively. Thus, two output values of the ResNet-50 model for these two pulses are both 0. For the fifth pulse, it belongs to a middle SQI one because it does not have a sharp valley in the starting ejection zone. Thus, its SV error percentage is 0.2, and the output value of the ResNet-50 model for this pulse is 0.6.  Figure 9 shows the results of SQI classification with the ResNet-50 model for the PPG (blue line) and DPPG (orange line) signals in the presence of serious baseline drift. When the baseline of the PPG pulses is heavily wandered, the proposed ResNet-50 can still successfully identify these pulses as low SQI ones. Thus, the output values of the ResNet-50 model for these pulses are all 0.
(orange line) signals moderately corrupted by the baseline drift. The first and third rows of the data are the two SVs with medis ® CS 2000 and our ICG device, respectively, while the second and fourth rows are the two LVETs with medis ® CS 2000 and our ICG device, respectively. The fifth row denotes the error percentages of SV. The red line is the output value of the ResNet-50 model. The cross and circle symbols represent the first zero-crossing point and minimum-value point of the DPPG pulse, respectively. Figure 9 shows the results of SQI classification with the ResNet-50 model for the PPG (blue line) and DPPG (orange line) signals in the presence of serious baseline drift. When the baseline of the PPG pulses is heavily wandered, the proposed ResNet-50 can still successfully identify these pulses as low SQI ones. Thus, the output values of the ResNet-50 model for these pulses are all 0.
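The output-value thresholds described in this section can be written as a small decision rule. The following is a minimal Python sketch, not the authors' implementation: the function name `classify_output` is hypothetical, and assigning the boundary values 0.5 and 0.8 to the middle class is an assumption, since the text does not state which class the boundaries belong to.

```python
def classify_output(output_value: float) -> str:
    """Map a sigmoid output in [0, 1] to a quality class.

    Thresholds (0.5 and 0.8) are taken from the text; the treatment of the
    exact boundary values is an assumption made for this sketch.
    """
    if not 0.0 <= output_value <= 1.0:
        raise ValueError("a sigmoid output must lie in [0, 1]")
    if output_value > 0.8:
        return "high"
    if output_value >= 0.5:
        return "middle"
    return "low"


# Output values reported for the Figure 8 example pulses.
for pulse, value in (("seventh", 1.0), ("fifth", 0.6), ("second", 0.0)):
    print(pulse, classify_output(value))
```

With the outputs reported for Figure 8, the seventh pulse (1.0) maps to the high class, the fifth pulse (0.6) to the middle class, and the second and third pulses (0) to the low class.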

Discussion
In a rule-based classification approach, only a finite set of characteristics in the time or frequency domain is extracted from a PPG pulse. Therefore, the performance of such an approach depends on the kind and number of characteristics. Since the number of selected characteristics is always limited, the rule-based approaches do not fully utilize the information that exists in the PPG pulse [3,10,12]. Essentially, the main characteristics of a high-quality PPG pulse directly affect the physiological parameter being measured. In this study, LVET is defined as the time interval from the opening of the aortic valve to its closing. Thus, in terms of the morphology of a PPG pulse, the starting ejection point is the first zero-crossing point of the DPPG pulse during systole, and the ending ejection point is the time of the first minimum valley of the DPPG pulse during systole, which occurs before the dicrotic notch. A clear foot and a clear dicrotic notch are therefore the main characteristics of a high-quality PPG pulse. In the 2D DCNN, the convolution layers can automatically extract the different feature patterns from the raw image. Thus, the performance of the 2D DCNN in this study is found to be better than that of our previous study using the rule-based method [12].
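The two fiducial points described above can be located on a sampled DPPG pulse with a few lines of code. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: the function name `ejection_points` and the sampling-rate parameter `fs_hz` are hypothetical, the zero crossing is taken to be a negative-to-positive crossing, and the deepest DPPG value after the start point stands in for the "first minimum valley".

```python
from typing import Sequence, Tuple


def ejection_points(dppg: Sequence[float], fs_hz: float) -> Tuple[int, int, float]:
    """Return (start_index, end_index, lvet_seconds) for one DPPG pulse."""
    if fs_hz <= 0:
        raise ValueError("sampling rate must be positive")

    # Start of ejection: first negative-to-positive zero crossing (assumed direction).
    start = None
    for i in range(1, len(dppg)):
        if dppg[i - 1] <= 0.0 < dppg[i]:
            start = i
            break
    if start is None:
        raise ValueError("no upward zero crossing found in this pulse")

    # End of ejection: deepest DPPG value after the start point (assumption
    # standing in for the first minimum valley before the dicrotic notch).
    end = min(range(start, len(dppg)), key=lambda i: dppg[i])

    return start, end, (end - start) / fs_hz


# Synthetic DPPG samples at a hypothetical 10 Hz rate, for illustration only.
pulse = [-1.0, -0.5, 0.5, 2.0, 1.0, -1.0, -3.0, -1.0, 0.2, -0.2]
print(ejection_points(pulse, fs_hz=10.0))
```

A pulse with a flat starting ejection zone or an unclear valley makes either point ambiguous, which is consistent with the observation that such pulses produce large LVET errors.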

In the previous study [5], we found that a substantial error is usually present in the LVET measured by the PPG or ICG, as compared with the standard reference measured by phonocardiography. Although the SV has a linear relation with the LVET according to Equation (1), the SV measured by medis® CS 2000 is calibrated through some parameters. In this study, both the SV and LVET measured by medis® CS 2000 are used as the references for comparison with those measured by our ICG device. One of our findings is that using high-quality PPG pulses leads to relatively lower errors in the SV and LVET measurements, as shown in Figures 8 and 9. Thus, only a PPG pulse with high quality can be used to obtain a reliable LVET and, subsequently, an accurate SV. As shown in Table 5, when the SV is measured with high-quality PPG pulses, the statistical errors of SV for the VGG-19 and ResNet-50 models are relatively low (4.5 ± 14.7 and 2.6 ± 14.2 mL, respectively).
In previous studies [10,25,26], the quality level of a PPG pulse was defined manually by experts. However, a direct comparison of performance between two published algorithms validated in this way is limited by differences in the experts' judgment. In this study, we use three ranges of error percentage, below 18%, between 18% and 20%, and above 20%, to assign the SQI of each PPG pulse (high, middle, or low, respectively). Based on these quantitative ranges of error percentage, the proposed algorithm can effectively differentiate the quality level of each PPG pulse. Additionally, the accuracy of the SV measurement with a high-SQI PPG pulse classified by the algorithm is found to be higher than that with a low-SQI PPG pulse.
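The labelling rule based on the SV error percentage can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the error percentage is assumed to be the absolute difference relative to the reference SV, and placing the boundary values 18% and 20% in the middle class is an assumption.

```python
def sv_error_percent(reference_sv_ml: float, measured_sv_ml: float) -> float:
    """Absolute SV error relative to the reference device, in percent (assumed definition)."""
    if reference_sv_ml <= 0:
        raise ValueError("reference SV must be positive")
    return 100.0 * abs(measured_sv_ml - reference_sv_ml) / reference_sv_ml


def sqi_label(error_percent: float) -> str:
    """Below 18% is high, 18% to 20% is middle, above 20% is low."""
    if error_percent < 18.0:
        return "high"
    if error_percent <= 20.0:
        return "middle"
    return "low"


# Hypothetical SV pairs (reference, measured) in mL, for illustration only.
for reference, measured in ((100.0, 102.0), (100.0, 119.0), (100.0, 149.0)):
    error = sv_error_percent(reference, measured)
    print(f"{error:.1f}% -> {sqi_label(error)}")
```

Because the label is derived from a measured quantity rather than from visual inspection, the same rule can be reapplied to new recordings without expert annotation.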
A classification approach using the DCNN does not need predetermined characteristics or features and makes full use of the information embedded in the PPG pulse by taking advantage of a deep learning process [27,28]. In our previous study, we proposed a rule-based method combined with a fuzzy neural network to determine the SQIs of PPG pulses [12]. In order to increase the tolerance of the rule-based method, a PPG pulse with an SV error percentage of less than 40% was considered to be of high quality. In the test data, the statistical error of the PPG pulses classified as high quality was 6.4 ± 12.8 mL. However, the accuracies for determining high- and low-quality pulses were only 0.83 and 0.86, respectively. In the present work, by contrast, we label a PPG pulse as high quality when its SV error percentage is less than 20%. In the test data, the statistical error of the pulses classified as high quality with the proposed ResNet-50 model is 2.6 ± 14.2 mL. The accuracies for classifying high- and low-quality PPG pulses are 0.91 and 0.94, respectively. Since the performance of the proposed 2D DCNN approach for the SQI classification appears to be better than that of the rule-based method, the DCNN method may be applied to increase the measurement accuracy of SV.
Moreover, when the PPG signals are corrupted by serious baseline drift, the affected PPG pulses must be removed by a separate algorithm before their SQIs are classified using the rule-based method. In this study, the proposed 2D DCNN approaches (VGG-19 or ResNet-50) make use of the morphologies of the PPG and DPPG waveforms to determine their SQIs. The PPG and DPPG signals are first merged and transformed into an image, as shown in Figure 5, before the classification is performed. In the image shown in Figure 5c, the PPG pulse almost completely lacks the fundamental morphology of a traditional PPG waveform, but it is still correctly classified as a low-quality one by the proposed ResNet-50 model (Figure 9). This suggests that the proposed 2D DCNN approaches may be useful for quality classification of PPG pulses, even for those seriously corrupted by motion artifacts and power line interference.
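One way to picture the merging of a PPG pulse and its DPPG pulse into a single image is to draw both waveforms on a common pixel grid. The sketch below is purely hypothetical: the actual image layout is the one shown in Figure 5, and the function name `waveforms_to_image`, the grid height, the per-waveform min-max scaling, and the pixel codes (1 for PPG, 2 for DPPG) are all assumptions made for illustration.

```python
from typing import List, Sequence


def waveforms_to_image(ppg: Sequence[float], dppg: Sequence[float],
                       height: int = 64) -> List[List[int]]:
    """Rasterize two equal-length waveforms onto one height x width grid.

    Pixel codes (assumed for this sketch): 0 = background, 1 = PPG, 2 = DPPG.
    Row 0 is the top of the image.
    """
    if len(ppg) != len(dppg) or len(ppg) < 2:
        raise ValueError("waveforms must have the same length (at least 2 samples)")
    if height < 2:
        raise ValueError("height must be at least 2")

    width = len(ppg)
    image = [[0] * width for _ in range(height)]
    for code, wave in ((1, ppg), (2, dppg)):
        low, high = min(wave), max(wave)
        span = (high - low) or 1.0  # avoid division by zero for a flat waveform
        for x, sample in enumerate(wave):
            # Scale each waveform to the full image height independently.
            row = (height - 1) - round((sample - low) / span * (height - 1))
            image[row][x] = code
    return image


# Tiny synthetic example on a 3-row grid.
for row in waveforms_to_image([0, 1, 2, 1, 0], [1, 1, -1, -1, 0], height=3):
    print(row)
```

An image built this way preserves the shape of both waveforms, so a pulse distorted by baseline drift yields a visibly different picture without any hand-crafted feature being computed.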
It is assumed that in a continuous PPG signal, the morphology of a high-quality PPG pulse may change gradually into that of a low-quality one. Hence, a middle-quality pulse can be considered a transitional one between the high- and low-quality pulses. In the present work, we define the error percentage of the measured SV with middle-quality pulses to be between 18% and 20%. Therefore, both the ResNet-50 and VGG-19 models are trained using only high- and low-quality PPG pulses, excluding the middle-quality ones. The output layers of the VGG-19 and ResNet-50 models use the sigmoid function as the activation function. Thus, the current VGG-19 or ResNet-50 model can be considered a regression model for determining the morphologic change of the PPG pulse. We define the output ranges of high- and low-quality pulses as between 0.8 and 1.0 and between 0.0 and 0.5, respectively. Some testing pulses may therefore be classified as middle-quality ones when their outputs are between 0.5 and 0.8. Accordingly, in Table 4, we test the performance of the VGG-19 and ResNet-50 models on two binary splits: the high-quality class versus the not-high-quality class, and the not-low-quality class versus the low-quality class.
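The two binary evaluations can be illustrated with a short sketch that computes accuracy, sensitivity, and specificity for each split. It is not the authors' evaluation code: the function name `binary_metrics` is hypothetical, the labels and outputs are made-up values, and treating the high-quality (or not-low-quality) class as the positive class is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(truth: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for one binary split."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Made-up reference labels and sigmoid outputs, for illustration only.
labels = ["high", "high", "middle", "low", "low"]
outputs = [0.95, 0.70, 0.60, 0.10, 0.85]

# Split 1: high quality versus not high quality (output above 0.8).
high_vs_rest = binary_metrics([l == "high" for l in labels],
                              [y > 0.8 for y in outputs])
# Split 2: not low quality versus low quality (output of at least 0.5).
not_low_vs_low = binary_metrics([l != "low" for l in labels],
                                [y >= 0.5 for y in outputs])
print(high_vs_rest)
print(not_low_vs_low)
```

Evaluating both splits shows how the middle-quality pulses, which were excluded from training, shift the results depending on which side of the boundary they are counted.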
The ResNet-50 model is built on a deep residual neural network (DRNN) architecture with a 50-layer network, and its average accuracy (0.940) for classifying the high- plus middle-quality and the low-quality PPG pulses is higher than that (0.92) of the VGG-19 model with a 19-layer network. However, both the sensitivity (0.915) and specificity (0.92) of the ResNet-50 model are lower than those (0.97 and 0.97) of the VGG-19 model for classifying these two quality groups. Because few samples were included in the study, there appears to be no significant difference between the performances of the ResNet-50 and VGG-19 models.
There are some limitations to the present study. First, because the subjects recruited in this study are all healthy males, the middle- and low-quality pulses are corrupted mostly by motion artifacts, and we did not acquire PPG pulses belonging to arrhythmic beats. Thus, gender and cardiovascular disease may affect the current results. Second, PPG pulse morphology varies with vascular compliance, which is closely associated with age and hypertension [29,30]. The ages of the included subjects are between 22 and 29 years, and their systolic and diastolic blood pressures are all in the normal range. Thus, subjects of different ages or with hypertension may have different pulse morphologies, which may consequently influence the present outcome. Third, only 1-second episodes of PPG signals are employed in the current study. To make sure that each 1-second PPG signal contains the data of at least one cardiac cycle, the heart rates of the recruited participants must be higher than 60 beats/minute.
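The heart-rate constraint in the third limitation follows from simple arithmetic: one cardiac cycle lasts 60 divided by the heart rate (in beats/minute) seconds, so a 1-second episode is long enough for a full cycle only when this duration does not exceed 1 second. A minimal sketch, with hypothetical function names and ignoring where the episode starts within the cycle:

```python
def cycle_duration_s(heart_rate_bpm: float) -> float:
    """Duration of one cardiac cycle in seconds."""
    if heart_rate_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return 60.0 / heart_rate_bpm


def window_spans_one_cycle(heart_rate_bpm: float, window_s: float = 1.0) -> bool:
    """True if the window is at least as long as one cardiac cycle."""
    return cycle_duration_s(heart_rate_bpm) <= window_s


# Illustrative heart rates only.
for rate in (50.0, 75.0):
    print(rate, cycle_duration_s(rate), window_spans_one_cycle(rate))
```

At 75 beats/minute a cycle lasts 0.8 s and fits within the episode, whereas at 50 beats/minute it lasts 1.2 s and does not.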

Conclusions
To quantify the SQI level of each PPG pulse, we used the error percentage of the measured SV for each beat. The morphologies of the PPG and DPPG pulses were combined into an image, which was used to determine the quality level of each PPG pulse. The proposed VGG-19 and ResNet-50 models can successfully determine the SQI of each PPG pulse. Thus, we did not need to explore the characteristics of the PPG pulse to determine the pulse SQI when using the 2D DCNN. Moreover, compared with the results of our previous study, the performance of the 2D DCNN was better than that of a traditional rule-based method. Notably, the main limitation of the study is the small number of PPG pulses. If more PPG pulses are used in the training process of the 2D DCNN, better results can be expected.