Individualized Prediction of Blood Glucose Outcomes Using Compositional Data Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset
Data Preprocessing
2.2. CoDa
2.3. Regression Model with CoDa
2.3.1. LRM with Compositional Predictor and Scalar Response
Steps for the Creation of the Model Based on CoDa
- The predictor is represented in log-ratio coordinates (Equation (6)). Compositions are multivariate by definition, so they must be mapped, linearly or non-linearly, to a single number. To compute such a regression model, the principle of working in coordinates is used to transform it into a multiple regression problem. Each coordinate of a composition x, with respect to a basis linked to an SBP, is calculated as in Equation (7).
- The ordinary regression model is then solved to obtain the coefficients using Equation (8).
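The two steps above can be illustrated with a minimal sketch. It assumes a three-part composition, a sign matrix coding a sequential binary partition with +1, −1 and 0, and the standard balance construction for coordinates. The function names (`sbp_basis`, `ilr`, `fit_linear_model`) and the synthetic data are hypothetical and are not taken from the article.

```python
import numpy as np

def sbp_basis(signs):
    """Build orthonormal balance contrasts from an SBP sign matrix.

    One row per balance, one column per part; entries are +1 (numerator
    group), -1 (denominator group) or 0 (part not involved).
    """
    signs = np.asarray(signs, dtype=float)
    psi = np.zeros_like(signs)
    for i, row in enumerate(signs):
        r = np.sum(row > 0)  # number of parts coded +1
        s = np.sum(row < 0)  # number of parts coded -1
        psi[i, row > 0] = np.sqrt(s / (r * (r + s)))
        psi[i, row < 0] = -np.sqrt(r / (s * (r + s)))
    return psi

def ilr(x, psi):
    """Map strictly positive compositions (rows of x) to balance coordinates."""
    return np.log(np.asarray(x, dtype=float)) @ psi.T

def fit_linear_model(z, y):
    """Ordinary least squares of a scalar response on the coordinates."""
    design = np.column_stack([np.ones(len(z)), z])  # intercept + coordinates
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Assumed SBP for three parts (illustrative only).
psi = sbp_basis([[+1, +1, -1],
                 [+1, -1, 0]])

# Synthetic compositions and a synthetic, noise-free scalar response.
rng = np.random.default_rng(0)
x = rng.dirichlet([2.0, 3.0, 4.0], size=50)
z = ilr(x, psi)
y = 1.0 + 2.0 * z[:, 0] - 0.5 * z[:, 1]

beta = fit_linear_model(z, y)  # intercept followed by one slope per coordinate
```

Because each contrast row sums to zero, the coordinates do not depend on the scale of the composition, so the rows of `x` need not be closed to 1 beforehand.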
2.3.2. Data Preprocessing
2.4. Prediction of Minimum and Maximum Glucose
2.5. Confusion Matrix—Metrics for Multi-Class Classification
3. Results
3.1. Overall LRM Test Results
3.2. Validation of the Multivariable LRM of Mean and CV Prediction
3.3. Application, Example of the “Traffic Light” Proposed for a Specific Patient
Patient 1, Day 3 Characterized by High Variability
3.4. Results of the Metrics for Multi-Class Classification
3.4.1. Accuracy Results
3.4.2. Balanced Accuracy and Balanced Accuracy Weighted Results
3.4.3. Sensitivity Results
3.4.4. F1-Score
3.4.5. Matthews Correlation Coefficient for Multi-Class Classification
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
BA | Balanced accuracy |
BG | Blood glucose |
BGV | Blood glucose variation |
CSII | Continuous subcutaneous insulin infusion |
CGM | Continuous glucose monitoring |
CoDa | Compositional data |
clr | Centered log-ratio |
ilr | Isometric log-ratio |
MDI | Multiple daily injections |
SBP | Sequential binary partition |
T1D | Type 1 diabetes |
TIR | Time in range |
i | $x_1$ | $x_2$ | $x_3$ | r | s |
---|---|---|---|---|---|
1 | +1 | +1 | −1 | 2 | 1 |
2 | +1 | −1 | 0 | 1 | 1 |
Consecutive Zeros | Position 1 | Position 2 |
---|---|---|
For 2 h, 5/120 = 0.04166 | ||
1 | dL = 0.04166 | |
2 | dL/3 = 0.01388 | 2 dL/3 = 0.0277 |
For 4 h, 5/240 = 0.02083 | ||
1 | dL = 0.02083 | |
2 | dL/3 = 0.00694 | 2 dL/3 = 0.01388 |
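The arithmetic in the table above can be reproduced with a short sketch. It assumes that the numerator 5 is the sampling period in minutes and that the denominator is the window length in minutes; only the one-zero and two-zero cases shown in the table are covered. The function names are hypothetical.

```python
def detection_limit(window_minutes, sample_minutes=5):
    """Proportion of the window covered by one sample (assumed meaning of dL)."""
    return sample_minutes / window_minutes

def replacement_values(dl, n_consecutive):
    """Replacement values for 1 or 2 consecutive zeros, as tabulated."""
    if n_consecutive == 1:
        return [dl]
    if n_consecutive == 2:
        return [dl / 3, 2 * dl / 3]
    raise ValueError("only 1 or 2 consecutive zeros are tabulated")

# Rebuild the table for the 2 h and 4 h windows.
table = {}
for hours in (2, 4):
    dl = detection_limit(hours * 60)
    table[hours] = {n: replacement_values(dl, n) for n in (1, 2)}
```

The values in the table appear to be truncated rather than rounded, so the computed numbers agree with it only to about four decimal places.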
Metrics | Equation |
---|---|
Accuracy is the ratio of correct classifications (TP and TN) to all classifications, correct and incorrect, in the confusion matrix. | |
Balanced Accuracy (BA) calculates the average recall for each true class, considering class imbalances to provide a fair assessment of model performance across all classes. | |
BA Weighted (BAW) leverages the BA formula by incorporating class weights, determined by class frequencies in the dataset. This enables the monitoring of algorithm performance for individual classes and highlights the impact of each class based on its frequency. | |
Precision | |
Recall | |
Macro Average Precision (MaAP) | |
Macro Average Recall (MaAR) | |
Micro F1-Score | |
Macro F1-Score | |
Micro Average Precision (MiAP) | |
Micro Average Recall (MiAR) | |
Matthews Correlation Coefficient (MCC), computed from the number of classes K, the number of samples correctly classified in each class i, and the number of samples classified as i that belong to class j. The numerator of the formula is the covariance between the predictions and the true labels, and the denominator normalizes the result to the range [−1, 1]. |
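As a hedged illustration of the metrics listed above, the sketch below computes several of them from a multi-class confusion matrix using widely used textbook definitions, which may differ in detail from the equations of the article. The macro F1-score is taken here as the harmonic mean of macro precision and macro recall; averaging per-class F1-scores is another common convention. The weighted balanced accuracy is omitted because its class weights are not specified here. The function name and the example matrix are hypothetical, and every class is assumed to occur at least once in both the real and the predicted labels.

```python
import numpy as np

def multiclass_metrics(cm):
    """Metrics from a confusion matrix with cm[i, j] = real i, predicted j."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)            # correctly classified samples per class
    real = cm.sum(axis=1)       # samples per real class (row totals)
    predicted = cm.sum(axis=0)  # samples per predicted class (column totals)

    recall = tp / real
    precision = tp / predicted
    macro_p, macro_r = precision.mean(), recall.mean()

    # In single-label multi-class problems, micro precision, micro recall
    # and micro F1 all coincide with accuracy.
    accuracy = tp.sum() / total

    # Multi-class MCC: covariance of predictions and labels, normalized.
    mcc_num = tp.sum() * total - predicted @ real
    mcc_den = np.sqrt((total**2 - predicted @ predicted)
                      * (total**2 - real @ real))

    return {
        "accuracy": accuracy,
        "balanced_accuracy": macro_r,  # average recall over real classes
        "macro_precision": macro_p,
        "macro_recall": macro_r,
        "macro_f1": 2 * macro_p * macro_r / (macro_p + macro_r),
        "micro_f1": accuracy,
        "mcc": mcc_num / mcc_den,
    }

# Made-up three-class example (rows: real A, B, C; columns: predicted A, B, C).
metrics = multiclass_metrics([[5, 1, 0],
                              [1, 3, 1],
                              [0, 1, 4]])
perfect = multiclass_metrics(np.diag([3, 4, 5]))
```

Working directly from the confusion matrix keeps every metric tied to the same counts, which makes it easy to check one metric against another.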
Real Data \ Predicted Data | A | B | C |
---|---|---|---|
A | AA | AB | AC |
B | BA | BB | BC |
C | CA | CB | CC |
[Table: hourly “traffic light” from 00:00 to 23:00 showing, for the 3-class and 5-class models, the Predicted State and Real State of each hour, plus a column with the Characteristics of the States. The color-coded cells are not reproduced.]
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cabrera, A.; Estremera, E.; Beneyto, A.; Biagi, L.; Contreras, I.; Martín-Fernández, J.A.; Vehí, J. Individualized Prediction of Blood Glucose Outcomes Using Compositional Data Analysis. Mathematics 2023, 11, 4517. https://doi.org/10.3390/math11214517