A Comparative Analysis of Various Machine Learning Algorithms to Improve the Accuracy of HbA1c Estimation Using Wrist PPG Data
Abstract
1. Introduction
2. Methodology
2.1. Hardware Device
2.2. Regression Models
2.2.1. XGBoost
2.2.2. Random Forest (RF)
2.2.3. CatBoost
2.2.4. LightGBM
2.3. Dataset Description
2.4. PPG Signal Processing
2.5. Correct Peak and Valley Detection for Determining AC and DC Value from PPG Signal
Algorithm 1: Pseudocode for determining AC and DC values from the PPG signal.
2.6. Feature Extraction
2.7. AC/DC Value as a Feature for Various Wavelengths
2.8. Importance-Based Feature Selection
3. Results and Discussion
3.1. Performance When We Use 47 Features (without AC/DC Values)
3.2. Performance When We Use 50 Features (Including AC/DC Features)
3.3. Performance after Applying Feature-Importance-Based Selection
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Measurement | BMI (kg/m²) | SpO2 (%) | HbA1c (%) |
---|---|---|---|
Min | 19.50 | 96 | 5.2 |
Max | 30.52 | 99 | 7.7 |
Mean ± SD | 25.53 ± 2.77 | 97.75 ± 0.79 | 6.10 ± 0.73 |
Combination of Wavelengths | RF | XGBoost | LightGBM | CatBoost |
---|---|---|---|---|
R | 0.273 | 0.559 | 0.593 | 0.514 |
G | 0.735 | 0.786 | 0.808 | 0.796 |
B | 0.752 | 0.767 | 0.770 | 0.710 |
RG | 0.744 | 0.784 | 0.815 | 0.760 |
RB | 0.731 | 0.711 | 0.761 | 0.713 |
GB | 0.794 | 0.793 | 0.826 | 0.780 |
RGB | 0.803 | 0.796 | 0.822 | 0.766 |
Combination of Wavelengths | RF | XGBoost | LightGBM | CatBoost |
---|---|---|---|---|
R | 0.748 | 0.789 | 0.856 | 0.827 |
G | 0.744 | 0.871 | 0.886 | 0.857 |
B | 0.895 | 0.884 | 0.899 | 0.878 |
RG | 0.747 | 0.867 | 0.833 | 0.858 |
RB | 0.896 | 0.899 | 0.910 | 0.908 |
GB | 0.876 | 0.879 | 0.914 | 0.905 |
RGB | 0.914 | 0.904 | 0.925 | 0.917 |
Algorithm | Red | Green | Blue | Demographic Features |
---|---|---|---|---|
RF | Red AC/DC (4), PSD variance (12), autocorrelation (18), sum of absolute difference (19), mean KTE (20) | Sum of absolute difference (3), green AC/DC (6), PSD variance (7), autocorrelation (8), PSD kurtosis (9) | Blue AC/DC (1), PSD variance (10), sum of absolute difference (11), zero-crossing rate (13), autocorrelation (14) | BMI (2), SpO2 (5) |
XGBoost | Red AC/DC (8), mean PSD (23), mean KTE (24), PSD kurtosis (25), PSD variance (26) | PSD variance (1), mean PSD (2), autocorrelation (3), sum of absolute difference (9), green AC/DC (11) | PSD variance (4), blue AC/DC (5), autocorrelation (6), sum of absolute difference (7), zero-crossing rate (10) | BMI (12), SpO2 (13) |
LightGBM | Red AC/DC (3), mean KTE (22), autocorrelation (23), mean PSD (24), mean absolute wavelet (28) | Green AC/DC (4), zero-crossing rate (5), PSD kurtosis (8), PSD variance (10), sum of absolute difference (11) | Blue AC/DC (1), sum of absolute difference (7), autocorrelation (9), mean PSD (12), KTE variance (17) | BMI (2), SpO2 (6) |
CatBoost | Red AC/DC (4), sum of absolute difference (15), PSD kurtosis (17), mean KTE (18), mean PSD (23) | Green AC/DC (3), mean PSD (6), PSD variance (7), autocorrelation (8), sum of absolute difference (10) | Blue AC/DC (1), sum of absolute difference (9), PSD kurtosis (12), autocorrelation (14), mean PSD (16) | BMI (2), SpO2 (5) |
Combination of Wavelengths | RF | XGBoost | LightGBM | CatBoost |
---|---|---|---|---|
R | 0.736 | 0.766 | 0.859 | 0.831 |
G | 0.725 | 0.891 | 0.887 | 0.832 |
B | 0.918 | 0.881 | 0.906 | 0.857 |
RG | 0.829 | 0.901 | 0.927 | 0.890 |
RB | 0.924 | 0.901 | 0.890 | 0.888 |
GB | 0.914 | 0.896 | 0.920 | 0.910 |
RGB | 0.925 | 0.906 | 0.941 | 0.921 |
Combination of Wavelengths | RF | XGBoost | LightGBM | CatBoost |
---|---|---|---|---|
R | 0.644 | 0.758 | 0.788 | 0.678 |
G | 0.744 | 0.853 | 0.644 | 0.825 |
B | 0.873 | 0.826 | 0.864 | 0.814 |
RG | 0.786 | 0.861 | 0.858 | 0.769 |
RB | 0.876 | 0.854 | 0.889 | 0.801 |
GB | 0.909 | 0.870 | 0.896 | 0.882 |
RGB | 0.914 | 0.890 | 0.929 | 0.890 |
Metrics | RF (w/ External Features) | RF (w/o External Features) | XGBoost (w/ External Features) | XGBoost (w/o External Features) | LightGBM (w/ External Features) | LightGBM (w/o External Features) | CatBoost (w/ External Features) | CatBoost (w/o External Features) |
---|---|---|---|---|---|---|---|---|
MSE | 0.074 | 0.085 | 0.093 | 0.109 | 0.061 | 0.074 | 0.076 | 0.141 |
ME | −0.012 | −0.073 | 0.024 | 0.035 | −0.004 | 0.009 | 0.014 | 0.048 |
RMSE | 0.271 | 0.292 | 0.305 | 0.329 | 0.246 | 0.272 | 0.277 | 0.375 |
Score | 0.856 | 0.834 | 0.818 | 0.787 | 0.881 | 0.861 | 0.850 | 0.725 |
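The error metrics in the table above can be illustrated with a minimal sketch. It assumes that ME is the mean of (predicted − reference) and that the "Score" row is the coefficient of determination (R²); neither convention is stated in the table itself. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study data.

```python
import math


def regression_metrics(reference, predicted):
    """Return MSE, ME, RMSE and R^2 for paired reference/predicted values.

    Assumptions (not confirmed by the source): ME = mean(predicted - reference)
    and the reported "Score" is R^2.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    errors = [p - r for r, p in zip(reference, predicted)]
    mse = sum(e * e for e in errors) / n      # mean squared error
    me = sum(errors) / n                      # mean (signed) error
    rmse = math.sqrt(mse)                     # root mean squared error
    mean_ref = sum(reference) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    ss_res = mse * n
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "ME": me, "RMSE": rmse, "R2": r2}


# Hypothetical HbA1c values (%), for illustration only.
reference = [5.2, 5.8, 6.1, 6.9, 7.7]
predicted = [5.4, 5.7, 6.0, 7.1, 7.4]
metrics = regression_metrics(reference, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the RMSE row of the table can be checked against the MSE row up to rounding.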
Type | Zone A | Zone B | Zone C |
---|---|---|---|
RF | 86% (19) | 14% (3) | 0% (0) |
XGBoost | 91% (20) | 9% (2) | 0% (0) |
LightGBM | 100% (22) | 0% (0) | 0% (0) |
CatBoost | 95% (21) | 5% (1) | 0% (0) |
Algorithm | Bias (Mean ± SD) | 95% Limits of Agreement (±1.96 SD) |
---|---|---|
RF | −0.075 ± 0.296 | −0.66 to 0.51 |
XGBoost | 0.001 ± 0.278 | −0.62 to 0.62 |
LightGBM | 0.001 ± 0.252 | −0.49 to 0.49 |
CatBoost | 0.001 ± 0.297 | −0.58 to 0.58 |
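The relationship between the bias column and the limits of agreement can be sketched as follows. This is a minimal illustration of a standard Bland-Altman computation, assuming differences are taken as (estimated − reference) and that the sample standard deviation (n − 1 denominator) is used; the source does not state either choice. The function names and the sample data are hypothetical.

```python
import statistics


def limits_from_summary(bias, sd, z=1.96):
    """Return (lower, upper) 95% limits of agreement from a bias and an SD."""
    half_width = z * sd
    return bias - half_width, bias + half_width


def bland_altman(reference, estimated):
    """Return bias, SD of differences, and limits of agreement.

    Assumption: difference = estimated - reference, sample SD (n - 1).
    """
    diffs = [e - r for r, e in zip(reference, estimated)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, sd, limits_from_summary(bias, sd)


# Summary values copied from the LightGBM and RF rows of the table above.
lgbm_limits = limits_from_summary(0.001, 0.252)   # rounds to (-0.49, 0.49)
rf_limits = limits_from_summary(-0.075, 0.296)    # rounds to (-0.66, 0.51)

# Hypothetical paired HbA1c values (%), for illustration only.
bias, sd, limits = bland_altman([5.2, 5.8, 6.1, 6.9, 7.7],
                                [5.4, 5.7, 6.0, 7.1, 7.4])
print(lgbm_limits, rf_limits, bias, sd, limits)
```

Narrower limits indicate closer agreement between the estimated and reference HbA1c values.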
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Satter, S.; Kwon, T.-H.; Kim, K.-D. A Comparative Analysis of Various Machine Learning Algorithms to Improve the Accuracy of HbA1c Estimation Using Wrist PPG Data. Sensors 2023, 23, 7231. https://doi.org/10.3390/s23167231