Article
Peer-Review Record

Photoplethysmographic Prediction of the Ankle-Brachial Pressure Index through a Machine Learning Approach

Appl. Sci. 2020, 10(6), 2137; https://doi.org/10.3390/app10062137
by David Perpetuini 1,*, Antonio Maria Chiarelli 1, Daniela Cardone 1, Sergio Rinella 2, Simona Massimino 2, Francesco Bianco 3, Valentina Bucciarelli 3, Vincenzo Vinciguerra 4, Giorgio Fallica 4, Vincenzo Perciavalle 2,5, Sabina Gallina 1,3 and Arcangelo Merla 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 13 February 2020 / Revised: 14 March 2020 / Accepted: 19 March 2020 / Published: 21 March 2020

Round 1

Reviewer 1 Report

This paper is well written and interesting. It seems that using PPG to estimate a pressure index works better than estimating the pressure itself, which makes intuitive sense.

Two minor comments upon methodology:
- the sampling frequency is 1 kHz, and the cut-off frequencies for the bandpass filtering of PPG are 0.2 and 10 Hz. Is this filtering effectively implemented, as these frequencies correspond to two very small normalized frequencies of 0.2/1000 and 10/1000? I suspect this would lead either to a filter that does not filter much, or to a filter with a very long impulse response.
- The PPG signal amplitude depends on a lot of factors, so is it really safe to use MaxArm and MaxAnkle as features? (but indeed only MaxAnkle is significant).

minor:

l. 23, "systolic blood pressures"
l. 29, "was investigated here"
l. 34, "and can identify"
l. 46, "systolic pressures"
l. 132, "employing the Enverdis"
l. 171, "The duration of the time-window"
l. 174, "for the evaluation of the average signal"
l. 220, "sample size" - ("numerosity" is descriptive)
l. 246, "an overall number of 24"
l. 279, "status. However,"
l. 302, "sample size"

Author Response

Comments and Suggestions for Authors

This paper is well written and interesting. It seems that using PPG to estimate a pressure index works better than estimating the pressure itself, which makes intuitive sense.

We thank the reviewer for the positive feedback.

Two minor comments upon methodology:
- the sampling frequency is 1 kHz, and the cut-off frequencies for the bandpass filtering of PPG are 0.2 and 10 Hz. Is this filtering effectively implemented, as these frequencies correspond to two very small normalized frequencies of 0.2/1000 and 10/1000? I suspect this would lead either to a filter that does not filter much, or to a filter with a very long impulse response.

We thank the reviewer for spotting this issue. We had omitted to mention that the PPG signals were downsampled by a factor of 10, to 100 Hz, before filtering, which avoids the problems mentioned by the reviewer. This has now been added to the manuscript.

Refer to lines 166-167:

“Raw OD and ECG were downsampled by a factor of 10, down to 100 Hz, and filtered employing zero-lag, 4th order, band-pass Butterworth digital filters”.
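The effect of the downsampling step on the reviewer's concern can be illustrated with a short calculation. The sketch below is not the authors' code; the function names are hypothetical, and the "samples per slowest cycle" figure (sampling rate divided by the lower cut-off) is used only as a rough order-of-magnitude proxy for how long the filter's impulse response must be.

```python
def normalized_cutoffs(fs_hz, low_hz, high_hz):
    """Express band-pass cut-offs as fractions of the Nyquist frequency."""
    nyquist = fs_hz / 2.0
    return low_hz / nyquist, high_hz / nyquist


def samples_per_slowest_cycle(fs_hz, low_hz):
    """Samples spanned by one period of the lower cut-off (rough proxy only)."""
    return fs_hz / low_hz


if __name__ == "__main__":
    # 1 kHz is the acquisition rate; 100 Hz is the rate after downsampling by 10.
    for fs in (1000.0, 100.0):
        low, high = normalized_cutoffs(fs, 0.2, 10.0)
        span = samples_per_slowest_cycle(fs, 0.2)
        print(f"fs={fs:6.0f} Hz  low={low:.4f}  high={high:.4f}  span={span:.0f} samples")
```

Downsampling by 10 raises both normalized cut-offs tenfold and shortens the time scale of the lower cut-off from thousands of samples to hundreds, which is the sense in which it eases the filter design.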

 


- The PPG signal amplitude depends on a lot of factors, so is it really safe to use MaxArm and MaxAnkle as features? (but indeed only MaxAnkle is significant).

We agree with the reviewer that pulse amplitude tends to be unstable in PPG signal acquisitions. However, we did find a significant contribution of MaxAnkle in explaining the ABI variance. The significance of MaxAnkle could reflect the fact that the PPG signal acquired at the ankle might be more stable than that acquired at the arm. We now discuss this aspect in the manuscript.

Refer to lines 332-341:

‘Importantly, we obtained a significant contribution to the VE-ABI variance of a PPG temporal (TDAnkle) and amplitude feature (MaxAnkle) measured at the ankle. These results suggest that the ankle could be an ideal location for PPG measurements. PPG temporal features are associated to PWV, which can be accurately inferred from particularly distal measurement such as those performed at the ankle. For such a reason, temporal PPG features evaluated at the ankle might be expected to significantly contribute to ABI prediction. However, amplitude PPG features generally tends to be poorly reliable and are not useful for quantitative physiological parameters estimation. Nonetheless, significant ABI prediction capabilities were obtained for the MaxAnkle parameter, depicting the stability and reliability of measuring PPG at the ankle. The null results obtained for the MaxArm parameter, instead suggests less reliable estimation of PPG amplitude at the arm.’

 

minor:

l. 23, "systolic blood pressures"
l. 29, "was investigated here"
l. 34, "and can identify"
l. 46, "systolic pressures"
l. 132, "employing the Enverdis"
l. 171, "The duration of the time-window"
l. 174, "for the evaluation of the average signal"
l. 220, "sample size" - ("numerosity" is descriptive)
l. 246, "an overall number of 24"
l. 279, "status. However,"
l. 302, "sample size"

We corrected the manuscript following all of the reviewer's minor suggestions.

Reviewer 2 Report

In this paper, the authors evaluated the feasibility of predicting ABI from PPG recordings by using a General Linear Model and the ROC curve. The descriptions are comprehensive, and the conclusion seems reasonable. My comments are as follows.

  1. Besides the PPG features, did the authors consider other confounders or covariates in the GLM, e.g., age, BMI, and smoking status? Different PPG features may have different predictive ability in patients of different age groups, BMI groups, and smoking status.
  2. Did the authors consider separating the analysis by gender? Does the PPG have the same predictive ability in males and females?
  3. Some descriptions in the Introduction should be moved to the Methods section, e.g., lines 52-60 and lines 109-114.

Author Response

Comments and Suggestions for Authors

In this paper, the authors evaluated the feasibility of predicting ABI from PPG recordings by using a General Linear Model and the ROC curve. The descriptions are comprehensive, and the conclusion seems reasonable. My comments are as follows.

  1. Besides the PPG features, did the authors consider other confounders or covariates in the GLM, e.g., age, BMI, and smoking status? Different PPG features may have different predictive ability in patients of different age groups, BMI groups, and smoking status.

 

The more robust a method is to confounding covariates, the better it can be considered. Thus, evaluating the influence of such covariates is not strictly fundamental to the study rationale. Nonetheless, the reviewer's question is scientifically interesting. We therefore first tested the statistical association between the variables we had available (e.g., age, gender and smoking status) and VE-ABI and, for gender, as suggested by the reviewer, we investigated whether the prediction ability of the algorithm on VE-ABI differed between groups. This analysis is now reported in the manuscript.

 

Refer to sentences in line 126-127

‘Demographic and general clinical information were acquired. Specifically, information about age, body mass index (BMI), gender and smoking habit were gathered.’

 

Line 238-240:

‘Since there were significant gender differences in the VE-ABI, the same procedure was performed again by separating the total sample by gender and an assessment of the difference in performance was evaluated.’

 

Line 263-267:

‘Demographic and clinical variables were correlated with VE-ABI. For numerical variables, a small but significant correlation between age and VE-ABI (r=0.17, p=0.04) was found, whereas no significant correlation was found between BMI and VE-ABI (r=-0.02, p=0.83). For binary variables we found a significant difference in VE-ABI as a function of gender (male vs. female, t=2.25, df=144, p=0.03) and smoking habit (smokers vs. non-smokers, t=2.32, df=144, p=0.02).’
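As a rough cross-check of the correlation significance quoted above, the p-value of a Pearson correlation can be approximated from r and the sample size. The sketch below assumes N = 146 subjects, which is inferred here from the reported df = 144 and is not stated in this record, and it replaces the t distribution with a normal approximation because only the Python standard library is used. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using a normal approximation.

    The exact test uses a t distribution with n - 2 degrees of freedom;
    for n around 146 the normal approximation is close but not identical.
    """
    t = r * sqrt((n - 2) / (1.0 - r * r))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


if __name__ == "__main__":
    n_subjects = 146  # assumption: inferred from df = 144
    t_age, p_age = correlation_p_value(0.17, n_subjects)
    print(f"age vs. VE-ABI: t={t_age:.2f}, approx. p={p_age:.3f}")
```

With these inputs the approximate p-value falls just below 0.05, in line with the reported p=0.04; small discrepancies are expected because the published r values are rounded.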

 

Line 277-281:

‘Importantly, the correlation of the multivariate estimation of PPG-ABI with VE-ABI (r=0.79) was statistically superior to the univariate correlations (vs. r=0.08, z=8.38, p=~0 for MaxArm; vs. r=0.18, z=7.52, p=~0 for MaxAnkle; vs. r=0.11, z=8.13, p=~0 for SlopeArm; vs. r=0.26, z=6.81, p=~0 for SlopeAnkle; vs. r=-0.19, z=7.43, p=~0 for TDArm; vs. r=-0.35, z=5.97, p=~0 for TDAnkle; vs. r=0.62, z=2.93, p=2·10⁻³ for Ankle-Arm).’
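The record does not name the test behind these z values. One candidate that gives values close to those quoted is Fisher's r-to-z comparison for independent correlations, sketched below under two assumptions: N = 146 for every correlation (inferred from df = 144 elsewhere in the response) and a one-sided p-value. Negative correlations are entered by magnitude, which is also an assumption. The function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test of H0: rho1 == rho2; returns (z, one-sided p)."""
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (atanh(r1) - atanh(r2)) / se
    p = 1.0 - NormalDist().cdf(z)
    return z, p


if __name__ == "__main__":
    n = 146          # assumption: inferred from df = 144
    r_multi = 0.79   # multivariate PPG-ABI vs. VE-ABI, as quoted above
    univariate = {"MaxArm": 0.08, "MaxAnkle": 0.18, "TDAnkle": -0.35, "Ankle-Arm": 0.62}
    for name, r in univariate.items():
        # Assumption: correlations are compared by magnitude.
        z, p = compare_independent_correlations(r_multi, n, abs(r), n)
        print(f"{name:10s} z={z:.2f}  one-sided p={p:.2g}")
```

Strictly, correlations computed on the same subjects are not independent, so this sketch should be read as a plausibility check of the quoted numbers rather than as a recommended procedure.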

 

Line 285-290

‘Since statistical differences in VE-ABI as a function of gender were found and the sample numerosity in the male and female group was comparable, the cross-validated GLM approach was performed again by separating the two groups. This analysis was performed to assess if the prediction capabilities of the multivariate model was different with gender. The PPG-ABI vs. VE-ABI correlations obtained for the different groups were r=0.69 for males and r=0.61 for females; this difference in correlation was not statistically significant (z=0.82, p=0.21).’

 

Line 351-363

‘Importantly, small but significant differences in VE-ABI as a function of gender and smoking status were found. Since a sufficient similarity in male and female sample size was available, possible differences in the algorithm performance as a function of gender was performed. However, no statistical differences in the prediction performance was found. It should be noted that this finding might be driven by the decreased sample numerosity when separating the subjects in two groups. Since the smoking population was too small (only 17 smokers), no multivariate analysis as a function of smoking habit was performed. Further studies could be performed on this topic.

Anyhow, further studies should also be performed to increase the overall sample size of the population. In fact, machine learning approaches rely on data-driven analysis that might drastically increase their performances with large sample sizes. Increasing the sample size could decrease a possible in-sample overfitting effect of the regressor/classifier. In particular, it is fundamental to enroll more pathological participants in order to balance the numerosity of the classes and to perform a more reliable classification procedure.’
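The remark that the null gender result "might be driven by the decreased sample numerosity" can be made concrete with a small calculation. The sketch below asks how many subjects per group would be needed for the observed correlations (r=0.69 and r=0.61) to just reach one-sided significance at alpha = 0.05 under a Fisher r-to-z comparison with equal group sizes. The test, the equal-size assumption, and the function name are all assumptions for illustration; this is a threshold calculation, not a formal power analysis.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_per_group_for_significance(r1, r2, alpha=0.05):
    """Smallest equal group size at which r1 vs. r2 reaches one-sided significance.

    Solves (atanh(r1) - atanh(r2)) / sqrt(2 / (n - 3)) >= z_crit for n.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    delta = abs(atanh(r1) - atanh(r2))
    return ceil(3 + 2 * (z_crit / delta) ** 2)


if __name__ == "__main__":
    n_needed = n_per_group_for_significance(0.69, 0.61)
    print(f"subjects needed per group: {n_needed}")
```

Under these assumptions the required group size is several times larger than half of a cohort of about 146 subjects, which is consistent with the authors' caution about interpreting the non-significant difference.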

 

 

 

  2. Did the authors consider separating the analysis by gender? Does the PPG have the same predictive ability in males and females?

 

We now report an analysis assessing the effect of gender:

 

Line 285-290

‘Since statistical differences in VE-ABI as a function of gender were found and the sample numerosity in the male and female group was comparable, the cross-validated GLM approach was performed again by separating the two groups. This analysis was performed to assess if the prediction capabilities of the multivariate model was different with gender. The PPG-ABI vs. VE-ABI correlations obtained for the different groups were r=0.69 for males and r=0.61 for females; this difference in correlation was not statistically significant (z=0.82, p=0.21). ’

 

Line 351-363

‘Importantly, small but significant differences in VE-ABI as a function of gender and smoking status were found. Since a sufficient similarity in male and female sample size was available, possible differences in the algorithm performance as a function of gender was performed. However, no statistical differences in the prediction performance was found. It should be noted that this finding might be driven by the decreased sample numerosity when separating the subjects in two groups. Since the smoking population was too small (only 17 smokers), no multivariate analysis as a function of smoking habit was performed. Further studies could be performed on this topic.

Anyhow, further studies should also be performed to increase the overall sample size of the population. In fact, machine learning approaches rely on data-driven analysis that might drastically increase their performances with large sample sizes. Increasing the sample size could decrease a possible in-sample overfitting effect of the regressor/classifier. In particular, it is fundamental to enroll more pathological participants in order to balance the numerosity of the classes and to perform a more reliable classification procedure.’

 

 

  3. Some descriptions in the Introduction should be moved to the Methods section, e.g., lines 52-60 and lines 109-114.

Lines 52-60 do not describe the procedure used in this manuscript but rather the gold-standard approach to estimating ABI with Doppler. Thus, we prefer to leave that passage where it is. We moved the sentences in lines 109-114 to the start of paragraph 2.5 within the Methods section.

Reviewer 3 Report

In this manuscript, the feasibility of employing the GLM machine learning approach to predict ABI from 323 brachial and tibial PPG recordings was investigated. However, as stated in lines 300-310 on page 10, "further studies should be performed to increase the sample size of the population examined", and, in lines 306-310 on page 10, it would be worthwhile to increase the number of regressors or to investigate more complex non-linear machine learning approaches (such as Deep Learning, [47]) in ABI prediction from PPG features. Therefore, it is necessary to extend the experimental results as follows.

  First, it is necessary to extend the experiments by applying conventional statistical analysis methods and to compare their results with the GLM result. But it is optional to employ additional machine learning methods such as DNN, SVM, and others. Depending on the extension of the experiments, the manuscript should be extended and revised in the Introduction, Figure 5 in Methods, and the Discussion.

  Second, the comparison of the ROC curves in Figure 5 on page 9 in particular should include the experimental results of the previous statistical analyses. In addition, the authors need to compare additional experimental results in a confusion matrix, including accuracy, precision, recall (sensitivity), specificity, and F-score, for the previous statistical analyses and the GLM. Finally, other parts of the manuscript should be revised depending on these revisions.

The End.

Author Response

Comments and Suggestions for Authors

In this manuscript, the feasibility of employing the GLM machine learning approach to predict ABI from 323 brachial and tibial PPG recordings was investigated. However, as stated in lines 300-310 on page 10, "further studies should be performed to increase the sample size of the population examined", and, in lines 306-310 on page 10, it would be worthwhile to increase the number of regressors or to investigate more complex non-linear machine learning approaches (such as Deep Learning, [47]) in ABI prediction from PPG features. Therefore, it is necessary to extend the experimental results as follows.

  First, it is necessary to extend the experiments by applying conventional statistical analysis methods and to compare their results with the GLM result.

We had already reported a conventional statistical analysis for the best-performing PPG feature. Following the reviewer's suggestion, we have now added the statistical analysis for all the other PPG parameters employed, together with other parameters of interest such as the subjects' age, BMI, smoking status, and gender.

Refer to sentences in line 126-127

‘Demographic and general clinical information were acquired. Specifically, information about age, body mass index (BMI), gender and smoking habit were gathered.’

 

Line 238-240:

‘Since there were significant gender differences in the VE-ABI, the same procedure was performed again by separating the total sample by gender and an assessment of the difference in performance was evaluated.’

 

Line 263-267:

‘Demographic and clinical variables were correlated with VE-ABI. For numerical variables, a small but significant correlation between age and VE-ABI (r=0.17, p=0.04) was found, whereas no significant correlation was found between BMI and VE-ABI (r=-0.02, p=0.83). For binary variables we found a significant difference in VE-ABI as a function of gender (male vs. female, t=2.25, df=144, p=0.03) and smoking habit (smokers vs. non-smokers, t=2.32, df=144, p=0.02).’

 

Line 277-281:

‘Importantly, the correlation of the multivariate estimation of PPG-ABI with VE-ABI (r=0.79) was statistically superior to the univariate correlations (vs. r=0.08, z=8.38, p=~0 for MaxArm; vs. r=0.18, z=7.52, p=~0 for MaxAnkle; vs. r=0.11, z=8.13, p=~0 for SlopeArm; vs. r=0.26, z=6.81, p=~0 for SlopeAnkle; vs. r=-0.19, z=7.43, p=~0 for TDArm; vs. r=-0.35, z=5.97, p=~0 for TDAnkle; vs. r=0.62, z=2.93, p=2·10⁻³ for Ankle-Arm).’

 

Line 285-290

‘Since statistical differences in VE-ABI as a function of gender were found and the sample numerosity in the male and female group was comparable, the cross-validated GLM approach was performed again by separating the two groups. This analysis was performed to assess if the prediction capabilities of the multivariate model was different with gender. The PPG-ABI vs. VE-ABI correlations obtained for the different groups were r=0.69 for males and r=0.61 for females; this difference in correlation was not statistically significant (z=0.82, p=0.21).’

 

Line 351-363

‘Importantly, small but significant differences in VE-ABI as a function of gender and smoking status were found. Since a sufficient similarity in male and female sample size was available, possible differences in the algorithm performance as a function of gender was performed. However, no statistical differences in the prediction performance was found. It should be noted that this finding might be driven by the decreased sample numerosity when separating the subjects in two groups. Since the smoking population was too small (only 17 smokers), no multivariate analysis as a function of smoking habit was performed. Further studies could be performed on this topic.

Anyhow, further studies should also be performed to increase the overall sample size of the population. In fact, machine learning approaches rely on data-driven analysis that might drastically increase their performances with large sample sizes. Increasing the sample size could decrease a possible in-sample overfitting effect of the regressor/classifier. In particular, it is fundamental to enroll more pathological participants in order to balance the numerosity of the classes and to perform a more reliable classification procedure.’

 

But it is optional to employ additional machine learning methods such as DNN, SVM, and others.

As noted in the Discussion section, we preferred not to apply additional, possibly non-linear approaches (e.g., DNN). Such approaches might worsen the overfitting problem given the small sample size of the study.

Refer to the sentences at lines 364-368.

‘Moreover, it would be worth to increase the number of regressors or to investigate more complex non-linear machine learning approaches (such as Deep Learning, [47]) in ABI prediction from PPG features. However, both these solutions could introduce an overfitting effect in the learning procedure from the data which could be avoided by employing large sample sizes. Thus, further studies with substantial sample population are necessary to investigate this aspect.’

Depending on the extension of the experiments, the manuscript should be extended and revised in the Introduction, Figure 5 in Methods, and the Discussion.

We extended the manuscript according to the reviewers' requests; please refer to the responses to each question.

  Second, the comparison of the ROC curves in Figure 5 on page 9 in particular should include the experimental results of the previous statistical analyses. In addition, the authors need to compare additional experimental results in a confusion matrix, including accuracy, precision, recall (sensitivity), specificity, and F-score, for the previous statistical analyses and the GLM. Finally, other parts of the manuscript should be revised depending on these revisions.

As requested by the reviewer, together with the regression analysis, we now report the AUC of the ROC for each univariate classification based on a standalone PPG parameter. Statistical significance is also reported. We also report the confusion matrix for the multivariate classification at a specific threshold of the output variable. Since the AUC is a more general measure of classification performance, we prefer not to report the confusion matrices for each univariate analysis.

Line 296-306:

‘At a specific threshold, an accuracy of 88%, a sensitivity of 75% and a specificity of 91% were obtained. The confusion matrix is reported in Table 2. The AUC retrieved with a multivariate approach (AUC=0.85) was significantly different [49] from the AUCs obtained employing univariate analysis from standalone PPG features (vs. AUC=0.46, z=4.80, p=~0 for MaxArm; vs. AUC=0.51, z=4.12, p=3.8·10⁻⁵ for MaxAnkle; vs. AUC=0.32, z=7.12, p=~0 for SlopeArm; vs. AUC=0.37, z=6.21, p=~0 for SlopeAnkle; vs. AUC=0.36, z=6.38, p=~0 for TDArm; vs. AUC=0.35, z=6.56, p=~0 for TDAnkle; vs. AUC=0.52, z=3.99, p=6.7·10⁻⁵ for Ankle-Arm).’
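For readers who want to see how the metrics requested by Reviewer 3 follow from a confusion matrix, a minimal sketch is given below. The counts are hypothetical: they were chosen only so that the rounded accuracy, sensitivity, and specificity resemble the percentages quoted above, and they are not the contents of Table 2. The function name is likewise hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Derive standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f_score = 2 * tp / (2 * tp + fp + fn)  # harmonic mean of precision and recall
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the manuscript).
    metrics = classification_metrics(tp=21, fn=7, tn=107, fp=11)
    for name, value in metrics.items():
        print(f"{name:12s} {value:.2f}")
```

With imbalanced classes such as these, precision and F-score can be noticeably lower than accuracy, which is one reason the reviewer's request for the full set of metrics is informative.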

Reference

Hanley, J.A.; McNeil, B.J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982, 143, 29–36.
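The reference above provides a closed-form standard error for an AUC, which is the basis of z-type comparisons such as those quoted in the response. The sketch below implements that standard-error formula and a simple z statistic that treats the two AUCs as independent. The class sizes (30 pathological, 120 healthy) are hypothetical placeholders, since this record does not state them, and the independence assumption may differ from what the authors actually did, so the output is not expected to reproduce the reported z values.

```python
from math import sqrt


def auc_standard_error(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) standard error of an AUC."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    return sqrt(variance)


def compare_aucs_independent(auc1, auc2, n_pos, n_neg):
    """z statistic for two AUCs, ignoring any correlation between them."""
    se1 = auc_standard_error(auc1, n_pos, n_neg)
    se2 = auc_standard_error(auc2, n_pos, n_neg)
    return (auc1 - auc2) / sqrt(se1 ** 2 + se2 ** 2)


if __name__ == "__main__":
    # Hypothetical class sizes; AUC values are those quoted in the response.
    z = compare_aucs_independent(0.85, 0.46, n_pos=30, n_neg=120)
    print(f"z = {z:.2f}")
```

When both AUCs come from the same subjects, a correlation term should be subtracted from the pooled variance, which generally changes the resulting z.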

The End.

Reviewer 4 Report

This paper proposes a method for estimating ABI from PPG, which is practical and interesting. It is well organized and easy to read.
However, I think it needs some improvement.

Major
The authors use the maximum amplitude of the PPG ODs, but the maximum amplitude of the PPGs varies depending on the equipment and measurement conditions. That is, the maximum amplitude is not suitable for absolute evaluation. In order to use the maximum amplitude, some contrivance such as calibration (and/or calculation techniques) is required, but there is no description about it, so it is necessary to add it.

Minor
I think it would be better to cite references from the journal to which you are submitting (Applied Sciences).

Reference 35 in the References section, italic characters are not reflected (this may be just my environment).

 

 

 

Author Response

Comments and Suggestions for Authors

This paper proposes a method for estimating ABI from PPG, which is practical and interesting. It is well organized and easy to read. 
However, I think it needs some improvement.

Major
The authors use the maximum amplitude of the PPG ODs, but the maximum amplitude of the PPGs varies depending on the equipment and measurement conditions. That is, the maximum amplitude is not suitable for absolute evaluation. In order to use the maximum amplitude, some contrivance such as calibration (and/or calculation techniques) is required, but there is no description about it, so it is necessary to add it.

The amplitude of the PPG was expressed in ODs, a dimensionless scale that is roughly proportional to the relative signal change when that change is small (ln(1+x)≈x for small x). The PPG value, expressed in ODs, can indeed be suitable for further estimation, although it is not strictly associated with physiological quantities. We now report the equation defining the OD from a raw PPG signal (refer to Equation 1 at line 164). We do agree with the reviewer that, in general, PPG amplitude estimation is not particularly stable; however, we obtained significant prediction capabilities of PPG amplitude for VE-ABI at the ankle. This result suggests high stability and reliability of PPG measurements performed at the ankle. Please refer to the manuscript for further discussion of the topic.

Refer to lines 332-341:

‘Importantly, a significant contribution to the VE-ABI variance of a PPG temporal (TDAnkle) and amplitude feature (MaxAnkle) measured at the ankle were obtained. These results suggest that the ankle could be an ideal location for PPG measurements. PPG temporal features are associated to PWV, which can be accurately inferred from particularly distal measurement such as those performed at the ankle. For such a reason, temporal PPG features evaluated at the ankle might be expected to significantly contribute to ABI prediction. However, amplitude PPG features generally tends to be poorly reliable and are not useful for quantitative physiological parameters estimation. Nonetheless, significant ABI prediction capabilities were obtained for the MaxAnkle parameter, depicting the stability and reliability of measuring PPG at the ankle. The null results obtained for the MaxArm parameter, instead suggests less reliable estimation of PPG amplitude at the arm.’
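The small-signal argument in the response (ln(1+x)≈x) can be checked numerically. The sketch below assumes the common definition OD = -ln(I / I0), with I0 a baseline intensity; the manuscript's Equation 1 is not reproduced in this record, so this definition and the function name are assumptions.

```python
from math import log


def optical_density(intensity, baseline):
    """Optical density change relative to a baseline intensity (assumed definition)."""
    return -log(intensity / baseline)


if __name__ == "__main__":
    baseline = 1.0
    for relative_change in (0.01, 0.05, 0.20, 0.50):
        od = optical_density(baseline * (1.0 + relative_change), baseline)
        # For small changes, OD is close to minus the relative change.
        mismatch = abs(od + relative_change) / relative_change
        print(f"x={relative_change:.2f}  OD={od:+.4f}  relative mismatch={mismatch:.1%}")
```

The mismatch between OD and the relative change grows with the size of the change, so the proportionality holds only for the small pulsatile fluctuations typical of PPG.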

Minor
I think it would be better to cite references from the journal to which you are submitting (Applied Sciences).

Reference 35 in the References section, italic characters are not reflected (this may be just my environment).

We addressed the reviewer's minor suggestions and fixed reference 35. We added the following references:

Georgieva-Tsaneva, G.; Gospodinova, E.; Gospodinov, M.; Cheshmedzhiev, K. Portable Sensor System for Registration, Processing and Mathematical Analysis of PPG Signals. Applied Sciences 2020, 10, 1051.

Liu, S.-H.; Wang, J.-J.; Chen, W.; Pan, K.-L.; Su, C.-H. Classification of Photoplethysmographic Signal Quality with Fuzzy Neural Network for Improvement of Stroke Volume Measurement. Applied Sciences 2020, 10, 1476.

Round 2

Reviewer 2 Report

Please review and correct the language again.

Author Response

We thank the reviewer for this advice. We corrected the English of the manuscript, which was also checked by an English speaker.
