Setting Ranges in Potential Biomarkers for Type 2 Diabetes Mellitus Patients Early Detection By Sex—An Approach with Machine Learning Algorithms
Abstract
1. Introduction
- Early Detection: ML models can analyze various parameters, such as biomarkers, clinical data, and medical history, to predict the onset of conditions in patients. By identifying patients at high risk, healthcare providers can take proactive measures to prevent the development of diseases or intervene early to manage conditions.
- Personalized Medicine: ML models can help in the development of personalized treatment plans for patients. By analyzing data such as genetic information, lifestyle, and medical history, ML models can identify the most effective treatment plan for individual patients, reducing the risk of adverse reactions and improving treatment outcomes.
- Predictive Analytics: ML models can be used to predict the likelihood of complications associated with various conditions. By analyzing patient data and identifying the factors that contribute to the development of complications, healthcare providers can take proactive measures for prevention or management, reducing the burden on the healthcare system.
- Remote Monitoring: ML models can be used to remotely monitor patients, reducing the need for frequent hospital visits. By analyzing data such as vital signs and biomarkers, ML models can alert healthcare providers to potential health issues and prompt them to take necessary action.
2. Materials and Methods
2.1. Sample
2.2. Data Treatment
2.3. Data Imputation
2.4. Feature Selection
2.4.1. Genetic Algorithm—GALGO
2.4.2. Forward Selection—GALGO
2.4.3. LASSO
2.4.4. Akaike Information Criterion
2.5. Model Development
2.6. Logistic Regression
2.7. Artificial Neural Networks
- Input Layer: This layer receives the input signals.
- Hidden Layers: These layers perform intermediate computations and feature extraction; there can be one or more hidden layers in an ANN.
- Output Layer: This layer produces the final output of the network.
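- Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$.
- Hyperbolic Tangent (Tanh): $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$.
- Rectified Linear Unit (ReLU): $\mathrm{ReLU}(x) = \max(0, x)$.
- Softmax: Used in the output layer for classification tasks, defined as $\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$.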
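The activation functions listed above can be sketched in a few lines of Python. This is an illustrative sketch using the standard textbook definitions, not the code used in the study (which relied on R packages); the function names are chosen here for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def softmax(scores: list[float]) -> list[float]:
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Sigmoid, Tanh, and ReLU act on one value at a time, whereas softmax acts on the whole vector of output scores, which is why it is typically reserved for the output layer of a classifier.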
2.8. K-Nearest Neighbors
2.9. Nearest Centroid
2.10. Support Vector Machines
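- Linear Kernel: $K(x, y) = x^{\top} y$
- Polynomial Kernel: $K(x, y) = (\gamma\, x^{\top} y + r)^{d}$
- Radial Basis Function (RBF) or Gaussian Kernel: $K(x, y) = \exp(-\gamma \lVert x - y \rVert^{2})$
- Sigmoid Kernel: $K(x, y) = \tanh(\gamma\, x^{\top} y + r)$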
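As a rough illustration of the four kernels named above, the following Python sketch implements their common textbook forms. The parameter names `gamma`, `coef0`, and `degree`, and their default values, are assumptions made for this example; the study's own SVM settings are not reproduced here.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """Plain inner product."""
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """(gamma * <x, y> + coef0) ** degree; parameter names are illustrative."""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2); equals 1 when x == y."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * dot(x, y) + coef0)
```

Each function returns a similarity score for a pair of feature vectors; an SVM uses such scores in place of explicit inner products in a transformed feature space.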
2.11. Implementation
- GA was implemented using “galgo 1.4” [41].
- The Generalized Linear Model was implemented with “caret” [52].
- For SVM, “caret” [52] was used.
- The ANN was implemented in R with “caret” [52].
- For KNN, “caret” [52] was used.
- Nearcent was implemented in R with “caret” [52].
- LASSO was implemented using “glmnet” [59].
- The ensemble was implemented in R with nested conditional decisions (`if` statements).
- RFE was implemented in Python version 3.9.7 (Python was created by Guido van Rossum at Stichting Mathematisch Centrum, the Netherlands).
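To picture what an ensemble built from nested decisions might look like, here is a purely hypothetical sketch. The source does not state which models are combined or which rules are applied, and the original was written in R; the three inputs and the tie-breaking rule below are assumptions made only for illustration.

```python
def ensemble_predict(pred_a: int, pred_b: int, pred_c: int) -> int:
    """Combine three binary predictions (1 = T2DM, 0 = control).

    Hypothetical rule: if the first two models agree, their answer is
    returned; otherwise the third model decides. With three binary votes
    this is equivalent to a majority vote.
    """
    if pred_a == pred_b:
        return pred_a
    else:
        if pred_c == 1:
            return 1
        else:
            return 0
```

For example, `ensemble_predict(1, 0, 1)` returns 1 because the first two models disagree and the third votes for the positive class.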
3. Results
- True positive: number of subjects with T2DM correctly classified as having T2DM.
- False positive: number of healthy subjects incorrectly classified as having T2DM.
- True negative: number of healthy subjects correctly classified as healthy.
- False negative: number of subjects with T2DM incorrectly classified as healthy.
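The four counts defined above are the inputs of the evaluation metrics reported later in the results tables. The sketch below uses the standard definitions of those metrics; the function name is chosen for this example, and it assumes no denominator is zero.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive common metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one subject in each
    actual class and in each predicted class).
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "negative_predictive_value": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_discovery_rate": fp / (fp + tp),
        "false_negative_rate": fn / (fn + tp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1_score": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Illustrative counts only; these are not results from the study.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

Note that the false positive rate is the complement of specificity and the false negative rate is the complement of sensitivity, so each pair always sums to 1.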
3.1. Feature Selection Results
3.1.1. Features Obtained by GALGO KNN Method
3.1.2. Features Obtained by GALGO Nearcent Method
3.1.3. Features Obtained by GALGO SVM Method
3.1.4. Features Obtained by GALGO LR Method
3.1.5. Features Obtained by GALGO NNET Method
3.1.6. LASSO
3.1.7. RFE
3.2. Ensemble Model Metrics Results
- Log-Likelihood is the maximized value of the likelihood function,
- k is the number of parameters in the model,
- n is the number of observations.
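The three quantities listed above are the inputs of the standard information criteria AIC and BIC, both of which appear in the abbreviation list. As a minimal sketch under that assumption, the usual formulas $\mathrm{AIC} = 2k - 2\ln L$ and $\mathrm{BIC} = k\ln n - 2\ln L$ can be computed as follows; the function names and example values are illustrative and are not taken from the study.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 * ln(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return k * math.log(n) - 2 * log_likelihood


# Illustrative values only: a model with 5 parameters fitted to 100 observations.
example_aic = aic(log_likelihood=-100.0, k=5)
example_bic = bic(log_likelihood=-100.0, k=5, n=100)
```

Lower values indicate a better trade-off between fit and complexity. BIC penalizes each extra parameter more heavily than AIC once n exceeds about 7, since ln(n) is then greater than 2.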
3.3. Ranges
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
ACARBO_MG_DIA | Acarbose prescribed in milligrams |
ACARBOSA | If patient has Acarbose treatment |
Age | Age of the patient in years |
Age DX (cases) | Years with diabetes disease diagnosed |
AIC | Akaike Information Criterion |
ANN | Artificial Neural Networks |
AUC | Area Under the Curve |
BA | Biological Age |
BIC | Bayesian Information Criterion |
BMI | Body Mass Index |
CA | Chronological Age |
CHOL | Fat-like substance that is found in all cells of the patient's body |
CIC | Corrected Information Criterion |
CK-MB | Creatine Kinase-MB |
Creatinine | Waste product produced by muscles as part of regular daily activity |
DBP | Diastolic Blood Pressure |
DBPU | Diastolic Blood Pressure (uncorrected by medication) |
DIC | Deviance Information Criterion |
DKD | Diabetic Kidney Disease |
DR | Diabetic Retinopathy |
DT | Decision Tree |
Edu | Education Level |
FFANN | Feed-Forward Artificial Neural Network |
FIC | Focused Information Criterion |
FNN | Feed-Forward Neural Network |
FPG | Fasting Plasma Glucose |
FS | Forward Selection |
GA | Genetic Algorithms |
GFR | Glomerular Filtration Rate |
GLIBEN_MG_DIA | Glibenclamide prescribed in milligrams |
GLIBENCLAMIDA | If patient has Glibenclamide treatment |
GLU (mg/dL) | Glucose levels |
GNs | Glomerulopathies |
HA-TX | Subject under Hypertension Treatment |
HbA1c | Glycosylated Hemoglobin |
HDL | High Density Lipoprotein |
HDLU | High Density Lipoprotein (uncorrected by medication) |
id | Consecutive identification number in dataset |
IgAN | Immunoglobulin A Nephropathy |
INSUL_UI_DIA | Insulin prescribed in milligrams |
INSULINA | If patient has Insulin treatment |
IR | Insulin Resistance |
KAI | Kidney Age Index |
KNN | K-Nearest Neighbors |
LASSO | Least Absolute Shrinkage and Selection Operator |
LDL | Low Density Lipoprotein |
LDLU | Low Density Lipoprotein (uncorrected by medication) |
Lipids treatment | Lipid levels in treatment |
LIPIDS-TX | Subject under Lipids Treatment |
LR | Logistic Regression |
METFOR_MG_DIA | Metformin prescribed in milligrams |
METFORMINA | If patient has Metformin treatment |
ML | Machine Learning |
MS | Mass Spectrometry |
NA | Null or not available data |
Nearcent | Nearest Centroid |
NLP | Natural Language Processing |
OPN | Osteopontin |
PIOGLI_MG_DIA | Pioglitazone prescribed in milligrams |
PIOGLITAZONA | If patient has Pioglitazone treatment |
plate_info | Hospital patient id |
RF | Random Forest |
RFE | Recursive Feature Elimination |
ROSIGLI_MG_DIA | Rosiglitazone prescribed in milligrams |
ROSIGLITAZONA | If patient has Rosiglitazone treatment |
Sal | Salary |
SBP | Systolic Blood Pressure |
SBPU | Systolic Blood Pressure (uncorrected by medication) |
Sex | Patient's sex |
SVM | Support Vector Machines |
T2DM | Type 2 Diabetes Mellitus |
TCHOLU | Total Cholesterol (uncorrected by medication) |
TGU | Triglycerides (uncorrected by medication) |
TIPO_COMPLICACION DE DT2 | T2DM complications |
Urea | Waste product resulting from the breakdown of protein in the patient's body |
WHR | Waist-to-Hip Ratio |
References
- IDF. Diabetes Now Affects One in 10 Adults Worldwide. Online 2024. Available online: https://idf.org/news/diabetes-now-affects-one-in-10-adults-worldwide/ (accessed on 18 June 2024).
- Julia, H.C.; Carol, C. Diabetes treatments and risk of amputation, blindness, severe kidney failure, hyperglycaemia, and hypoglycaemia: Open cohort study in primary care. BMJ 2016, 352, i1450.
- INEGI. Defunciones por Diabetes Mellitus por Entidad Federativa de Residencia Habitual de la Persona Fallecida y Grupo Quinquenal de edad Según Sexo. Serie Anual de 2010 a 2021. Online 2024. Available online: https://www.gob.mx/ (accessed on 18 June 2024).
- Unai, G.G.; Asier, B.V.; Shifa, J.; Asier, L.S.; Haziq, S.; Kepa, B.U.; Ostolaza, H.; Casquet, C.M. Pathophysiology of Type 2 Diabetes Mellitus. Int. J. Mol. Sci. 2020, 21, 6275.
- William, H.H.; Wen, Y.; Simon, J.G.; Rebecca, A.S.; Melanie, J.D.; Kamlesh, K.; Guy, E.H.M.R.; Annelli, S.; Torsten, L.; Knut, B.J.; et al. Early Detection and Treatment of Type 2 Diabetes Reduce Cardiovascular Morbidity and Mortality: A Simulation of the Results of the Anglo-Danish-Dutch Study of Intensive Treatment in People with Screen-Detected Diabetes in Primary Care (ADDITION-Europe). Diabetes Care 2015, 38, 1449–1455.
- Alieva, A.; Alimov, A.; Khaidarova, F.; Ismailov, S.; Rakhimova, G.; Nazhmutdinova, D.; Shagazatova, B.; Tsareva, V. Assessing the Effectiveness of Type 2 Diabetes Mellitus Screening in the Republic of Uzbekistan. Int. J. Endocrinol. Metab. 2022, 20, e124036.
- Moosaie, F.; Fatemi Abhari, S.M.; Deravi, N.; Karimi Behnagh, A.; Esteghamati, S.; Dehghani Firouzabadi, F.; Rabizadeh, S.; Nakhjavani, M.; Esteghamati, A. Waist-to-height ratio is a more accurate tool for predicting hypertension than waist-to-hip circumference and BMI in patients with type 2 diabetes: A prospective study. Front. Public Health 2021, 9, 726288.
- Spurr, S.; Bally, J.; Bullin, C.; Allan, D.; McNair, E. The prevalence of undiagnosed Prediabetes/type 2 diabetes, prehypertension/hypertension and obesity among ethnic groups of adolescents in Western Canada. BMC Pediatr. 2020, 20, 31.
- Saigusa, D.; Matsukawa, N.; Hishinuma, E.; Koshiba, S. Identification of biomarkers to diagnose diseases and find adverse drug reactions by metabolomics. Drug Metab. Pharmacokinet. 2021, 37, 100373.
- Kopitar, L.; Kocbek, P.; Cilar, L.; Sheikh, A.; Štiglic, G. Early detection of Type 2 diabetes mellitus using machine learning-based prediction models. Sci. Rep. 2020, 10, 11981.
- Aruna, D.P. Sex Differences in the Metabolic Syndrome: Implications for cardiovascular health in women. Clin. Chem. 2014, 60, 44–52.
- Rocio Diaz, E.; Pedram, H.; Delevati, G.C.; Hilda, A.; Lucy, C.; Shivanki, J.; Glenda, T.; Guadalupe, J.O.; James, S.; Natalie, T.; et al. Sex differences in global metabolomic profiles of COVID-19 patients. Cell Death Dis. 2022, 13, 461.
- Allen, A.; Barnes, G.L.; Green-Saxena, A.; Hurtado, M.; Hoffman, J.; Mao, Q.; Das, R. Prediction of diabetic kidney disease with machine learning algorithms, upon the initial diagnosis of type 2 diabetes mellitus. BMJ Open Diabetes Res. Care 2022, 10, e002560.
- Chan, L.; Nadkarni, G.N.; Fleming, F.; McCullough, J.R.; Connolly, P.; Mosoyan, G.; Salem, F.; Kattan, M.W.; Vassalotti, J.A.; Murphy, B.; et al. Derivation and validation of a machine learning risk score using biomarker and electronic patient data to predict progression of diabetic kidney disease. Diabetologia 2021, 64, 1504–1515.
- Nagaraj, S.B.; Kieneker, L.M. Kidney Age Index (KAI): A novel age-related biomarker to estimate kidney function in patients with diabetic kidney disease using machine learning. Comput. Methods Programs Biomed. 2021, 211, 106434.
- Moszczuk, B.; Krata, N.; Rudnicki, W.R.; Foroncewicz, B.; Cysewski, D.; Paczek, L.; Kaleta, B.; Mucha, K. Osteopontin—A potential biomarker for IGA nephropathy: Machine learning application. Biomedicines 2022, 10, 734.
- Ou, S.; Tsai, M.J.; Lee, K.; Tseng, W.; Yang, C.; Chen, T.H.; Bin, P.J.; Chen, T.J.; Lin, Y.; Sheu, W.H.; et al. Prediction of the risk of developing end-stage renal diseases in newly diagnosed type 2 diabetes mellitus using artificial intelligence algorithms. BioData Min. 2023, 16, 8.
- Rodríguez-Romero, V.; Bergstrom, R.F.; Decker, B.S.; Lahu, G.; Vakilynejad, M.; Bies, R.R. Prediction of nephropathy in Type 2 diabetes: An analysis of the ACCORD Trial applying machine learning techniques. Clin. Transl. Sci. 2019, 12, 519–528.
- Lin, C.C.; Li, C.I.; Liu, C.S.; Lin, W.Y.; Yang, S.Y.; Li, T. Development and validation of a risk prediction model for end-stage renal disease in patients with Type 2 diabetes. Sci. Rep. 2017, 7, 10177.
- Slieker, R.C.; Van Der Heijden, A.A.W.A.; Siddiqui, M.K.; Langendoen-Gort, M.; Nijpels, G.; Herings, R.M.C.; Feenstra, T.; Moons, K.G.; Bell, S.; Elders, P.; et al. Performance of prediction models for nephropathy in people with Type 2 Diabetes: Systematic review and external validation study. BMJ 2021, 374, n2134.
- Hu, Y.; Shi, R.; Mo, R.; Hu, F. Nomogram for the prediction of diabetic nephropathy risk among patients with Type 2 diabetes mellitus based on a questionnaire and biochemical indicators: A retrospective study. Aging 2020, 12, 10317–10336.
- Jangili, S.; Vavilala, H.; Boddeda, G.S.B.; Upadhyayula, S.M.; Adela, R.; Mutheneni, S.R. Machine learning-driven early biomarker prediction for Type 2 diabetes mellitus associated coronary artery diseases. Clin. Epidemiol. Glob. Health 2023, 24, 101433.
- Castañé, H.; Iftimie, S.; Baiges-Gayà, G.; Rodríguez-Tomàs, E.; Jiménez-Franco, A.; López-Azcona, A.F.; Garrido, P.; Castro, A.; Camps, J.; Joven, J. Machine learning and semi-targeted lipidomics identify distinct serum lipid signatures in hospitalized COVID-19-positive and COVID-19-negative patients. Metabolism 2022, 131, 155197.
- Rojas-García, M.; Vázquez, B.; Torres-Poveda, K.; Madrid-Marina, V. Lethality Risk markers by sex and Age-group for COVID-19 in Mexico: A cross-sectional study based on machine learning approach. BMC Infect. Dis. 2023, 23, 18.
- Agliata, A.; Giordano, D.; Bardozzo, F.; Bottiglieri, S.; Facchiano, A.; Tagliaferri, R. Machine learning as a support for the diagnosis of Type 2 diabetes. Int. J. Mol. Sci. 2023, 24, 6775.
- Frimpong, E.; Oluwasanmi, A.; Baagyere, E.Y.; Qin, Z. A feedforward artificial neural network model for classification and detection of Type 2 diabetes. J. Phys. 2021, 1734, 012026.
- Kumarage, P.M.; Yogarajah, B.; Ratnarajah, N. Efficient Feature Selection for Prediction of Diabetic Using LASSO. In Proceedings of the IEEE-International Conference on Advances in ICT for Emerging Regions, Colombo, Sri Lanka, 2–5 September 2019.
- Oh, E.; Yoo, T.K.; Park, S. Diabetic Retinopathy Risk Prediction for FUNDUS Examination Using Sparse Learning: A Cross-sectional study. BMC Med. Inform. Decis. Mak. 2013, 13, 106.
- Ou, Q.; Jin, W.; Lin, L.; Lin, D.; Chen, K.; Quan, H. LASSO-based machine learning algorithm to predict the incidence of diabetes in different stages. Aging Male 2023, 26, 2205510.
- Singh, Y.; Tiwari, M. A novel hybrid approach for detection of type-2 diabetes in women using lasso regression and artificial neural network. Int. J. Intell. Syst. Appl. 2022, 14, 11–20.
- García-Domínguez, A.; Galván-Tejada, C.E.; Magallanes-Quintanar, R.; Gamboa-Rosales, H.; González-Curiel, I.; Peralta-Romero, J.; Cruz-López, M. Diabetes detection models in Mexican patients by combining machine learning algorithms and feature selection techniques for clinical and paraclinical attributes: A Comparative evaluation. J. Diabetes Res. 2023, 2023, 9713905.
- Lin, X.; Wang, Q.; Yin, P.; Tang, L.; Tan, Y.X.; Li, H.; Yan, K.K.; Xu, G. A method for handling metabonomics data from liquid chromatography/mass spectrometry: Combinational use of support vector machine recursive feature elimination, genetic algorithm and random forest for feature selection. Metabolomics 2011, 7, 549–558.
- Park, A.; Nam, S. MIRDM-RFGA: Genetic algorithm-based identification of a MIRNA set for detecting Type 2 diabetes. BMC Med. Genom. 2023, 16, 195.
- Misra, P.; Yadav, A.S. Improving the Classification Accuracy using Recursive Feature Elimination with Cross-Validation. Int. J. Emerg. Technol. 2020, 11, 659–665.
- Sabitha, E.; Durgadevi, M. Improving the diabetes diagnosis prediction rate using data preprocessing, data augmentation and recursive feature elimination method. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 1–11.
- Tiwari, P.; Singh, V.B. Diabetes Disease Prediction using significant Attribute selection and Classification approach. J. Phys. 2021, 1714, 012013.
- Sadhasivam, J.; Muthukumaran, V.; Raja, J.T.; Joseph, R.B.; Munirathanam, M.; Balajee, J. Diabetes Disease Prediction using Decision Tree for feature selection. J. Phys. 2021, 1964, 062116.
- Zhang, Z.; Treviño, V.; Hoseini, S.S.; Belciug, S.; Boopathi, A.M.; Zhang, P.; Gorunescu, F.; Velappan, S.; Dai, S. Variable selection in logistic regression model with genetic algorithm. Ann. Transl. Med. 2018, 6, 45.
- Buyrukoğlu, S.; Akbaş, A. Machine Learning based early prediction of Type 2 diabetes: A new hybrid feature selection approach using correlation matrix with HeatMap and SFS. Balk. J. Electr. Comput. Eng. 2022, 10, 110–117.
- Kautzky-Willer, A.; Harreiter, J.; Pacini, G. Sex and Gender Differences in Risk, Pathophysiology and Complications of Type 2 Diabetes Mellitus. Endocr. Rev. 2016, 37, 278–316.
- Trevino, V.; Falciani, F. GALGO: An R package for multivariate variable selection using genetic algorithms. Bioinformatics 2006, 22, 1154–1156.
- Cavanaugh, J.E.; Neath, A.A. The Akaike Information Criterion: Background, derivation, properties, application, interpretation, and refinements. WIREs Comput. Stat. 2019, 11, e1460.
- Vrieze, S.I. Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychol. Methods 2012, 17, 228–243.
- Cavanaugh, J.E. Unifying the derivations for the Akaike and corrected Akaike information criteria. Stat. Probab. Lett. 1997, 33, 201–208.
- Claeskens, G.; Hjort, N.L. The focused information criterion. J. Am. Stat. Assoc. 2003, 98, 900–916.
- Emiliano, P.C.; Vivanco, M.J.; De Menezes, F.S. Information criteria: How do they behave in different models? Comput. Stat. Data Anal. 2014, 69, 141–153.
- Rajendra, P.; Latifi, S. Prediction of diabetes using logistic regression and ensemble techniques. Comput. Methods Programs Biomed. Update 2021, 1, 100032.
- Nusinovici, S.; Tham, Y.C.; Yan, M.Y.C.; Ting, D.S.W.; Li, J.; Sabanayagam, C.; Wong, T.Y.; Cheng, C.Y. Logistic regression was as good as machine learning for predicting major chronic diseases. J. Clin. Epidemiol. 2020, 122, 56–69.
- Fakih, A.H.; Venkatesh, A.N.; Vani, N.; Naved, M.; Kshirsagar, P.R.; Vijayakumar, P. An efficient prediction of diabetes using artificial neural networks. AIP Conf. Proc. 2022, 2393, 020071.
- Khanam, J.J.; Foo, S.Y. A comparison of machine learning algorithms for diabetes prediction. ICT Express 2021, 7, 432–439.
- Bukhari, M.M.; Alkhamees, B.F.; Hussain, S.; Gumaei, A.; Assiri, A.; Ullah, S.S. An Improved Artificial Neural Network Model for Effective Diabetes Prediction. Complexity 2021, 2021, 5525271.
- Kuhn, M. Building Predictive Models in R Using the caret Package. J. Stat. Softw. 2008, 28, 1–26.
- Sarker, I.; Faruque, M.; Alqahtani, H.; Kalim, A. K-Nearest Neighbor Learning based Diabetes Mellitus Prediction and Analysis for eHealth Services. ICST Trans. Scalable Inf. Syst. 2018, 7, 26.
- Mucherino, A.; Papajorgji, P.J.; Pardalos, P.M. k-Nearest Neighbor Classification. Data Min. Agric. 2009, 34, 83–106.
- Suyanto, S.; Meliana, S.; Wahyuningrum, T.; Khomsah, S. A new Nearest Neighbor-based framework for diabetes detection. Expert Syst. Appl. 2022, 199, 116857.
- Arora, N.; Singh, A.; Al-Dabagh, M.Z.N.; Maitra, S.K. A Novel Architecture for Diabetes Patients’ Prediction Using K-Means Clustering and SVM. Math. Probl. Eng. 2022, 2022, 4815521.
- Mujumdar, A.; Vaidehi, V. Diabetes Prediction using Machine Learning Algorithms. Procedia Comput. Sci. 2019, 165, 292–299.
- Amari, S.; Wu, S. Improving support vector machine classifiers by modifying kernel functions. Neural Netw. 1999, 12, 783–789.
- Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22.
- Kanaya, A.M.; Grady, D.; Barrett-Connor, E. Explaining the sex difference in coronary heart disease mortality among patients with type 2 diabetes mellitus. Arch. Intern. Med. 2002, 162, 1737.
- Bolen, S.; Feldman, L.; Vassy, J.L.; Wilson, L.M.; Yeh, H.C.; Marinopoulos, S.S.; Wiley, C.; Selvin, E.; Wilson, R.F.; Bass, E.B.; et al. Systematic Review: Comparative Effectiveness and Safety of oral medications for Type 2 Diabetes Mellitus. Ann. Intern. Med. 2007, 147, 386.
- Siransy-Balayssac, E.; Ouattara, S.; Yéo, T.A.; Kondo, A.L.; Touré, M.; Dah, C.; Bogui, P. Physiological variations of blood pressure according to gender and age among healthy young Black Africans aged between 18 and 30 years in Côte d’Ivoire, West Africa. Physiol. Rep. 2020, 8, e14579.
- Geer, E.B.; Shen, W. Gender differences in insulin resistance, body composition, and energy balance. Gend. Med. 2009, 6, 60–75.
- Lam, B.C.C.; Koh, G.C.H.; Chen, C.; Wong, M.; Fallows, S. Comparison of body mass Index (BMI), Body adiposity Index (BAI), waist circumference (WC), Waist-To-Hip Ratio (WHR) and Waist-To-Height Ratio (WHTR) as predictors of cardiovascular disease risk factors in an adult population in Singapore. PLoS ONE 2015, 10, e0122985.
- Halbesma, N.; Brantsma, A.H.; Bakker, S.J.L.; Jansen, D.; Stolk, R.P.; De Zeeuw, D.; De Jong, P.E.; Gansevoort, R. Gender differences in predictors of the decline of renal function in the general population. Kidney Int. 2008, 74, 505–512.
- Farran, B.; AlWotayan, R.; Alkandari, H.; Al-Abdulrazzaq, D.; Channanath, A.; Thanaraj, T.A. Use of Non-invasive Parameters and Machine-Learning Algorithms for Predicting Future Risk of Type 2 Diabetes: A Retrospective Cohort Study of Health Data From Kuwait. Front. Endocrinol. 2019, 10, 624.
- Wannamethee, S.G.; Papacosta, O.; Whincup, P.H.; Thomas, M.C.; Carson, C.; Lawlor, D.A.; Ebrahim, S.; Sattar, N. The potential for a two-stage diabetes risk algorithm combining non-laboratory-based scores with subsequent routine non-fasting blood tests: Results from prospective studies in older men and women. Diabet. Med. 2010, 28, 23–30.
- Meerson, A.; Najjar, A.; Saad, E.; Sbeit, W.; Barhoum, M.; Assy, N. Sex Differences in Plasma MicroRNA Biomarkers of Early and Complicated Diabetes Mellitus in Israeli Arab and Jewish Patients. Non-Coding RNA 2019, 5, 32.
- Akash, M.S.H.; Rehman, K.; Liaqat, A.; Numan, M.; Mahmood, Q.; Kamal, S. Biochemical investigation of gender-specific association between insulin resistance and inflammatory biomarkers in types 2 diabetic patients. Biomed. Pharmacother. 2018, 106, 285–291.
Feature | Description |
---|---|
plate_info | Hospital patient id |
id | Consecutive identification number in dataset |
Edu | Education Level |
Sal | Salary |
Age DX (cases) | Years with diabetes disease diagnosed |
GLU (mg/dL) | Glucose levels |
HbA1c | Glycosylated hemoglobin |
GLIBENCLAMIDA | If patient has Glibenclamide treatment |
GLIBEN_MG_DIA | Glibenclamide prescribed in milligrams |
METFORMINA | If patient has Metformin treatment |
METFOR_MG_DIA | Metformin prescribed in milligrams |
PIOGLITAZONA | If patient has Pioglitazone treatment |
PIOGLI_MG_DIA | Pioglitazone prescribed in milligrams |
ROSIGLITAZONA | If patient has Rosiglitazone treatment |
ROSIGLI_MG_DIA | Rosiglitazone prescribed in milligrams |
ACARBOSA | If patient has Acarbose treatment |
ACARBO_MG_DIA | Acarbose prescribed in milligrams |
INSULINA | If patient has Insulin treatment |
INSUL_UI_DIA | Insulin prescribed in milligrams |
TIPO_COMPLICACION DE DT2 | T2DM complications |
1. Patients must be at least 18 years old.
2. There must be no differentiation on the data obtained based on sex, education, ethnicity, race, or marital status.
3. The datasets should exclusively comprise anthropometric and clinical data for each individual.
4. The dataset should be capable of distinguishing control subjects from those with T2DM.
5. Every subject’s dataset must include complete information for all features.
6. The data must not contain negative values.
7. The data must not contain glucose-related features.
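Some of the criteria above (minimum age, complete information, no negative values, no glucose-related features) lend themselves to an automated check. The sketch below is one possible reading applied to a single subject's record. The function name, the dictionary record format, and the choice of which columns count as glucose-related are assumptions made for illustration, not the study's actual preprocessing.

```python
from typing import Optional

# Assumption: these two columns are treated as the glucose-related features.
GLUCOSE_FEATURES = {"GLU (mg/dL)", "HbA1c"}


def clean_record(record: dict) -> Optional[dict]:
    """Return the record without glucose-related features, or None if it
    fails the age, completeness, or non-negativity checks."""
    kept = {k: v for k, v in record.items() if k not in GLUCOSE_FEATURES}
    # Criterion 5: every remaining feature must be present.
    if any(value is None for value in kept.values()):
        return None
    # Criterion 1: subjects must be at least 18 years old
    # (a record without an "Age" entry is rejected as well).
    if kept.get("Age", 0) < 18:
        return None
    # Criterion 6: numeric values must not be negative.
    if any(isinstance(v, (int, float)) and v < 0 for v in kept.values()):
        return None
    return kept


# Illustrative records only; these values are not from the study dataset.
adult = clean_record({"Age": 45, "BMI": 27.3, "HbA1c": 6.1})
minor = clean_record({"Age": 16, "BMI": 21.0})
```

Here `adult` keeps the "Age" and "BMI" entries with "HbA1c" removed, while `minor` is `None` because the subject is under 18.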
Feature | Description |
---|---|
Sex | Patient's sex |
Age | Age of the patient in years |
WHR | Waist-to-Hip Ratio |
BMI | Body Mass Index |
Urea | Waste product resulting from the breakdown of protein in the patient's body |
Creatinine | Waste product produced by muscles as part of regular daily activity |
Lipids treatment | Lipid levels in treatment |
Cholesterol | Fat-like substance that is found in all cells of the patient's body |
HDL | High Density Lipoprotein (corrected by medication) |
LDL | Low Density Lipoprotein (corrected by medication) |
Triglycerides | Type of fat found in the patient's body |
TCHOLU | Total Cholesterol (uncorrected by medication) |
HDLU | High Density Lipoprotein (uncorrected by medication) |
LDLU | Low Density Lipoprotein (uncorrected by medication) |
TGU | Triglycerides (uncorrected by medication) |
SBP | Systolic Blood Pressure (corrected by medication) |
DBP | Diastolic Blood Pressure (corrected by medication) |
SBPU | Systolic Blood Pressure (uncorrected by medication) |
DBPU | Diastolic Blood Pressure (uncorrected by medication) |
HA-TX | Subject under Hypertension Treatment |
LIPIDS-TX | Subject under Lipids Treatment |
Model | Parameter | Value |
---|---|---|
KNN | classification.method | ‘knn’ |
KNN | chromosomeSize | 5 |
KNN | maxSolutions | 2000 |
KNN | maxGenerations | 60 |
KNN | goalFitness | 0.9 |
Nearcent | classification.method | ‘nearcent’ |
Nearcent | chromosomeSize | 5 |
Nearcent | maxSolutions | 2000 |
Nearcent | maxGenerations | 60 |
Nearcent | goalFitness | 0.9 |
Artificial Neural Network | classification.method | ‘nnet’ |
Artificial Neural Network | chromosomeSize | 5 |
Artificial Neural Network | maxSolutions | 2000 |
Artificial Neural Network | maxGenerations | 60 |
Artificial Neural Network | goalFitness | 0.9 |
Logistic Regression | classification.method | ‘user’ |
Logistic Regression | classification.userFitnessFunc | logreg.R.predict |
Logistic Regression | chromosomeSize | 5 |
Logistic Regression | maxSolutions | 2000 |
Logistic Regression | maxGenerations | 60 |
Logistic Regression | goalFitness | 0.9 |
Support Vector Machines | classification.method | ‘svm’ |
Support Vector Machines | svm.kernel | ‘radial’ |
Support Vector Machines | chromosomeSize | 5 |
Support Vector Machines | maxSolutions | 2000 |
Support Vector Machines | maxGenerations | 60 |
Support Vector Machines | goalFitness | 0.9 |
Model | Parameter | Value |
---|---|---|
KNN | classification.method | ‘knn’ |
KNN | chromosomeSize | 5 |
KNN | maxSolutions | 1600 |
KNN | maxGenerations | 60 |
KNN | goalFitness | 0.9 |
Nearcent | classification.method | ‘nearcent’ |
Nearcent | chromosomeSize | 5 |
Nearcent | maxSolutions | 1600 |
Nearcent | maxGenerations | 60 |
Nearcent | goalFitness | 0.9 |
Artificial Neural Network | classification.method | ‘nnet’ |
Artificial Neural Network | chromosomeSize | 5 |
Artificial Neural Network | maxSolutions | 1600 |
Artificial Neural Network | maxGenerations | 60 |
Artificial Neural Network | goalFitness | 0.9 |
Logistic Regression | classification.method | ‘user’ |
Logistic Regression | classification.userFitnessFunc | logreg.R.predict |
Logistic Regression | chromosomeSize | 5 |
Logistic Regression | maxSolutions | 1600 |
Logistic Regression | maxGenerations | 60 |
Logistic Regression | goalFitness | 0.9 |
Support Vector Machines | classification.method | ‘svm’ |
Support Vector Machines | svm.kernel | ‘radial’ |
Support Vector Machines | chromosomeSize | 5 |
Support Vector Machines | maxSolutions | 1600 |
Support Vector Machines | maxGenerations | 60 |
Support Vector Machines | goalFitness | 0.9 |
Metric | Description |
---|---|
Sensitivity | Proportion of patients with T2DM correctly identified (true positive rate). |
Specificity | Proportion of patients without T2DM correctly identified (true negative rate). |
Precision | Proportion of predicted T2DM cases that actually have T2DM. |
Negative Predictive Value | Proportion of predicted non-T2DM cases that actually do not have T2DM. |
False Positive Rate | Proportion of patients without T2DM incorrectly classified as having T2DM. |
False Negative Rate | Proportion of patients with T2DM incorrectly classified as not having T2DM. |
Accuracy | The percentage of cases that the model has classified correctly. |
F1 Score | Harmonic mean of precision and sensitivity. |
True Values | Predicted (True) | Predicted (False) |
---|---|---|
True | True Positive | False Negative |
False | False Positive | True Negative |
Technique | Kernel | Dataset | Table |
---|---|---|---|
GALGO | KNN | Overall | Table 9 |
GALGO | KNN | Male/Female | Table 10 |
GALGO | Nearcent | Overall | Table 11 |
GALGO | Nearcent | Male/Female | Table 12 |
GALGO | SVM | Overall | Table 13 |
GALGO | SVM | Male/Female | Table 14 |
GALGO | LR | Overall | Table 15 |
GALGO | LR | Male/Female | Table 16 |
GALGO | NNET | Overall | Table 17 |
GALGO | NNET | Male/Female | Table 18 |
LASSO | | Overall | Table 19 |
LASSO | | Male/Female | Table 20 |
RFE | LR, SVM and RF | Overall | Table 21 |
RFE | LR | Male/Female | Table 22 |
RFE | SVM | Male/Female | Table 23 |
RFE | RF | Male/Female | Table 24 |
Siglo XXI Overall |
---|
“Creatinine”, “TGU”, “TCHOLU”, “Sex”, “WHR”, “SBP”, “SBPU”, “Cholesterol” and “Urea” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Creatinine” | “Creatinine” |
“SBP” | “TGU” |
“DBP” | “SBP” |
“SBPU” | “WHR” |
“Age” | “Cholesterol” |
“TGU” | “SBPU” |
“BMI” | “BMI” |
“Hypertension Treatment” | “Urea” |
“HDLU” | “TCHOLU” |
“Urea” | “Age” |
“TCHOLU” | “LDLU” |
“Hypertension Treatment” | |
“LDL” | |
“Triglycerides” |
Siglo XXI Overall |
---|
“Creatinine”, “TGU”, “Sex”, “SBPU”, “LDL”, “SBP”, “BMI”, |
“WHR”, “Age”, “Triglycerides”, “Cholesterol”, “LDLU”, “Urea”, |
“TCHOLU”, “HDLU” and “Lipids Treatment” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Creatinine” | “Creatinine” |
“TGU” | “TGU” |
“Cholesterol” | “LDL” |
“SBP” | “SBP” |
“BMI” | |
“SBPU” | |
“Cholesterol” |
Siglo XXI Overall |
---|
“Sex”, “Creatinine”, “TGU”, “SBP”, “LDLU”, “SBPU”, “LDL”, “TCHOLU”, |
“Cholesterol”, “BMI”, “Age”, “WHR”, “Urea”, |
“Lipids Treatment”, “HDL”, “Triglycerides”, “HDLU” and “Hypertension Treatment” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Creatinine” | “Creatinine” |
“TGU” | “TGU” |
“SBPU” | “SBP” |
“SBP” | “LDL” |
“Age” | “LDLU” |
“Cholesterol” | “SBPU” |
“TCHOLU” | “BMI” |
“BMI” | “TCHOLU” |
“Cholesterol” |
Siglo XXI Overall |
---|
“Sex”, “Creatinine”, “SBP”, “TGU”, “TCHOLU”, “Cholesterol”, “SBPU”, |
“LDL”, “LDLU”, “Hypertension Treatment”, “DBP”, “WHR”, |
“Lipids Treatment”, “HDL”, “Triglycerides”, “HDLU”, “Age” and “BMI” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“SBP” | “TGU” |
“Creatinine” | “Creatinine” |
“TGU” | “WHR” |
“Cholesterol” | “LDL” |
“LDLU” | |
“SBP” | |
“SBPU” | |
“TCHOLU” | |
“Cholesterol” | |
“BMI” | |
“Lipids Treatment” | |
“Triglycerides” | |
“Age” |
Siglo XXI Overall |
---|
“Creatinine” and “TGU” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Creatinine” | “TGU” |
“SBP” | “Creatinine” |
“TGU” | “SBP” |
“SBPU” | “WHR” |
“TCHOLU” | “LDLU” |
“Cholesterol” | “LDL” |
“BMI” | “SBPU” |
“Hypertension Treatment” | “Cholesterol” |
“Urea” | “BMI” |
“DBP” | “TCHOLU” |
“Age” | “Age” |
“LDL” | “Lipids Treatment” |
“HDL” | “Triglycerides” |
“HDLU” | |
“LDLU” |
Siglo XXI Overall |
---|
“Sex”, “Age”, “WHR”, “BMI”, “Urea”, “Lipids Treatment”, |
“HDL”, “Triglycerides”, “Hypertension Treatment”, “DBP” and “SBPU” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Age” | “Age” |
“WHR” | “BMI” |
“Urea” | “Urea” |
“Lipids Treatment” | “Lipids Treatment” |
“HDL” | “HDL” |
“Triglycerides” | “Triglycerides” |
“Hypertension Treatment” | “Hypertension Treatment” |
“DBP” | “DBP” |
“SBPU” |
Siglo XXI Overall | ||
---|---|---|
LR | SVM | RF |
“Sex” | “Sex” | “Age” |
“WHR” | “WHR” | “Lipids Treatment” |
“Cholesterol” | “Creatinine” | “Triglycerides” |
“TCHOLU” | “Cholesterol” | “DBP” |
“SBP” | “SBP” | “DBP” |
“Hypertension Treatment” | “TCHOLU” | |
“DBP” | “DBP” | |
“SBPU” | “SBPU” | |
“DBPU” | “DBPU” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Cholesterol” | “WHR” |
“TCHOLU” | “Cholesterol” |
“SBP” | “Triglycerides” |
“DBP” | “TCHOLU” |
“SBPU” | “TGU” |
“SBP” | |
“DBP” | |
“SBPU” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Age” | “Age” |
“WHR” | “WHR” |
“Creatinine” | “BMI” |
“SBP” | “Creatinine” |
“DBP” | “Cholesterol” |
“SBPU” | “TCHOLU” |
“DBPU” | “SBP” |
“DBP” | |
“SBPU” | |
“DBPU” |
Siglo XXI Male Dataset | Siglo XXI Female Dataset |
---|---|
“Age” | “Age” |
“BMI” | “WHR” |
“Lipids Treatment” | “BMI” |
“DBP” | “Triglycerides” |
“DBPU” | “DBP” |
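
The male and female feature lists in the table directly above can be compared mechanically to see which features are shared and which are sex-specific. The following is a minimal sketch using plain Python sets; the variable names are illustrative and are not taken from the original study.

```python
# Feature lists copied from the male/female table directly above.
male_features = {"Age", "BMI", "Lipids Treatment", "DBP", "DBPU"}
female_features = {"Age", "WHR", "BMI", "Triglycerides", "DBP"}

# Features selected for both sexes.
shared = sorted(male_features & female_features)

# Features selected for only one sex.
male_only = sorted(male_features - female_features)
female_only = sorted(female_features - male_features)

print("Shared:", shared)
print("Male only:", male_only)
print("Female only:", female_only)
```

The same comparison can be repeated for any of the other male/female feature tables by swapping in their lists.
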
Model | Dataset | Table |
---|---|---|
RFE | Overall | Table 26 |
RFE | Male | Table 27 |
RFE | Female | Table 28 |
GALGO | Overall | Table 29 |
GALGO | Male | Table 30 |
GALGO | Female | Table 31 |
LASSO | Overall/Male/Female | Table 32 |
Measure | LR | SVM | RF |
---|---|---|---|
Sensitivity | 0.8127 | 0.8137 | 0.8682 |
Specificity | 0.8640 | 0.8543 | 0.8958 |
Precision | 0.8645 | 0.8526 | 0.8924 |
Negative Predictive Value | 0.8120 | 0.8158 | 0.8722 |
False Positive Rate | 0.1360 | 0.1457 | 0.1042 |
False Discovery Rate | 0.1355 | 0.1474 | 0.1076 |
False Negative Rate | 0.1873 | 0.1863 | 0.1318 |
Accuracy | 0.8375 | 0.8337 | 0.8820 |
F1 Score | 0.8378 | 0.8327 | 0.8802 |
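
The measures reported in these performance tables are all functions of the four confusion-matrix counts, and several are complements of each other (for example, False Positive Rate = 1 − Specificity). The sketch below shows the standard definitions. The function name and the example counts are hypothetical, since the source does not report the underlying counts. It then recomputes the F1 score of the LR column in the table above from its reported precision and sensitivity as a consistency check.

```python
def classification_measures(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "fpr": 1 - specificity,           # false positive rate
        "fdr": 1 - precision,             # false discovery rate
        "fnr": 1 - sensitivity,           # false negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def f1_from(precision, sensitivity):
    """F1 score as the harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Hypothetical counts, for illustration only (not from the study).
example = classification_measures(tp=80, fp=10, tn=90, fn=20)

# Consistency check against the LR column of the table above:
# precision 0.8645 and sensitivity 0.8127 should give F1 = 0.8378.
lr_f1 = round(f1_from(0.8645, 0.8127), 4)
print(lr_f1)
```

Because the rates come in complementary pairs, only sensitivity, specificity, precision, negative predictive value, and accuracy carry independent information in each column.
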
Measure | LR | SVM | RF |
---|---|---|---|
Sensitivity | 0.8314 | 0.8452 | 0.8314 |
Specificity | 0.9167 | 0.8119 | 0.9167 |
Precision | 0.9533 | 0.8733 | 0.9533 |
Negative Predictive Value | 0.7264 | 0.7736 | 0.7264 |
False Positive Rate | 0.0833 | 0.1881 | 0.0833 |
False Discovery Rate | 0.0467 | 0.1267 | 0.0467 |
False Negative Rate | 0.1686 | 0.1548 | 0.1686 |
Accuracy | 0.8594 | 0.8320 | 0.8594 |
F1 Score | 0.8882 | 0.8590 | 0.8882 |
Measure | LR | SVM | RF |
---|---|---|---|
Sensitivity | 0.7545 | 0.7455 | 0.7333 |
Specificity | 0.8800 | 0.8733 | 0.7941 |
Precision | 0.8218 | 0.8119 | 0.6535 |
Negative Predictive Value | 0.8302 | 0.8239 | 0.8491 |
False Positive Rate | 0.1200 | 0.1267 | 0.2059 |
False Discovery Rate | 0.1782 | 0.1881 | 0.3465 |
False Negative Rate | 0.2455 | 0.2545 | 0.2667 |
Accuracy | 0.8269 | 0.8192 | 0.7731 |
F1 Score | 0.7867 | 0.7773 | 0.6911 |
Measure | KNN | Nearcent | LR | SVM | NN |
---|---|---|---|---|---|
Sensitivity | 0.8028 | 0.8377 | 0.8690 | 0.8511 | 0.5781 |
Specificity | 0.9167 | 0.8849 | 0.8792 | 0.8902 | 0.6054 |
Precision | 0.9243 | 0.8845 | 0.8725 | 0.8884 | 0.5896 |
Negative Predictive Value | 0.7857 | 0.8383 | 0.8759 | 0.8534 | 0.5940 |
False Positive Rate | 0.0833 | 0.1151 | 0.1208 | 0.1098 | 0.3946 |
False Discovery Rate | 0.0757 | 0.1155 | 0.1275 | 0.1116 | 0.4104 |
False Negative Rate | 0.1972 | 0.1623 | 0.1310 | 0.1489 | 0.4219 |
Accuracy | 0.8530 | 0.8607 | 0.8743 | 0.8704 | 0.5919 |
F1 Score | 0.8593 | 0.8605 | 0.8708 | 0.8694 | 0.5838 |
Measure | KNN | Nearcent | LR | SVM | NN |
---|---|---|---|---|---|
Sensitivity | 0.8373 | 0.6856 | 0.6859 | 0.8235 | 0.8443 |
Specificity | 0.8778 | 0.7258 | 0.7077 | 0.8837 | 0.8989 |
Precision | 0.9267 | 0.8867 | 0.8733 | 0.9333 | 0.9400 |
Negative Predictive Value | 0.7453 | 0.4245 | 0.4340 | 0.7170 | 0.7547 |
False Positive Rate | 0.1222 | 0.2742 | 0.2923 | 0.1163 | 0.1011 |
False Discovery Rate | 0.0733 | 0.1133 | 0.1267 | 0.0667 | 0.0600 |
False Negative Rate | 0.1627 | 0.3144 | 0.3141 | 0.1765 | 0.1557 |
Accuracy | 0.8516 | 0.6953 | 0.6914 | 0.8438 | 0.8633 |
F1 Score | 0.8797 | 0.7733 | 0.7683 | 0.8750 | 0.8896 |
Measure | KNN | Nearcent | LR | SVM | NN |
---|---|---|---|---|---|
Sensitivity | 0.7658 | 0.6804 | 0.7719 | 0.7000 | 0.7748 |
Specificity | 0.8926 | 0.7853 | 0.9110 | 0.8786 | 0.8993 |
Precision | 0.8416 | 0.6535 | 0.8713 | 0.8317 | 0.8515 |
Negative Predictive Value | 0.8365 | 0.8050 | 0.8365 | 0.7736 | 0.8428 |
False Positive Rate | 0.1074 | 0.2147 | 0.0890 | 0.1214 | 0.1007 |
False Discovery Rate | 0.1584 | 0.3465 | 0.1287 | 0.1683 | 0.1485 |
False Negative Rate | 0.2342 | 0.3196 | 0.2281 | 0.3000 | 0.2252 |
Accuracy | 0.8385 | 0.7462 | 0.8500 | 0.7962 | 0.8462 |
F1 Score | 0.8019 | 0.6667 | 0.8186 | 0.7602 | 0.8113 |
Measure | Overall | Male | Female |
---|---|---|---|
Sensitivity | 0.8760 | 0.8443 | 0.7500 |
Specificity | 0.8801 | 0.8989 | 0.8851 |
Precision | 0.8725 | 0.9400 | 0.8317 |
Negative Predictive Value | 0.8835 | 0.7547 | 0.8239 |
False Positive Rate | 0.1199 | 0.1011 | 0.1149 |
False Discovery Rate | 0.1275 | 0.0600 | 0.1683 |
False Negative Rate | 0.1240 | 0.1557 | 0.2500 |
Accuracy | 0.8781 | 0.8633 | 0.8269 |
F1 Score | 0.8743 | 0.8896 | 0.7887 |
Dataset | Controls—Cases | Table |
---|---|---|
Overall | Controls and Cases | Table 34 |
Male | Controls and Cases | Table 35 |
Female | Controls and Cases | Table 36 |
Overall | Controls | Table 37 |
Male | Controls | Table 38 |
Female | Controls | Table 39 |
Overall | Cases | Table 40 |
Male | Cases | Table 41 |
Female | Cases | Table 42 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 30 | 45 | 52 | 60 | 82 |
WHR | 0.72 | 0.87 | 0.92 | 0.97 | 1.11 |
BMI | 17.00 | 25.26 | 27.90 | 31.24 | 40.20 |
Urea | 6 | 24 | 28 | 36 | 54 |
Creatinine | 0.36 | 0.68 | 0.79 | 0.93 | 1.30 |
Cholesterol | 75.00 | 167.00 | 195.05 | 230.00 | 324.10 |
HDL | 10 | 35 | 43 | 52 | 77 |
LDL | 45.0 | 112.0 | 136.0 | 162.9 | 239.0 |
Triglycerides | 32 | 116 | 156 | 219 | 371 |
TCHOLU | 82 | 161 | 187 | 215 | 296 |
HDLU | 12 | 36 | 44 | 52 | 76 |
LDLU | 45 | 107 | 127 | 149 | 212 |
TGU | 32 | 110 | 148 | 207 | 350 |
SBP | 80 | 110 | 120 | 130 | 160 |
SBPU | 80 | 110 | 120 | 130 | 160 |
DBP | 50 | 70 | 80 | 85 | 105 |
DBPU | 55 | 70 | 80 | 80 | 95 |
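
Each row of these range tables is a five-number summary (minimum, lower quartile, median, upper quartile, maximum) of one feature. The sketch below shows one way to compute such a summary with Python's standard library. It rests on several assumptions: the source does not state which quantile convention was used or whether the reported extremes exclude outliers, the `inclusive` method is chosen here only for illustration, and the sample values are made up.

```python
import statistics


def five_number_summary(values):
    """Return min, lower quartile, median, upper quartile, and max of a sample."""
    # method="inclusive" treats the data as the whole population of interest;
    # this is an assumption, not the convention confirmed by the source.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "max": max(values),
    }


# Hypothetical age values, for illustration only (not study data).
ages = [30, 41, 45, 52, 58, 60, 82]
summary = five_number_summary(ages)
print(summary)
```

Applying the same function to the control and case subsets of each sex would produce one row per feature in the layout used by the tables here.
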
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 30 | 44 | 50 | 59 | 81 |
WHR | 0.82 | 0.92 | 0.95 | 0.99 | 1.09 |
BMI | 18.400 | 25.245 | 27.400 | 29.950 | 36.810 |
Urea | 6 | 24 | 30 | 36 | 54 |
Creatinine | 0.44 | 0.77 | 0.88 | 1.01 | 1.37 |
Cholesterol | 75.00 | 160.50 | 186.00 | 219.05 | 303.10 |
HDL | 10.1 | 32.0 | 40.0 | 47.0 | 69.1 |
LDL | 45.0 | 109.0 | 131.0 | 157.0 | 225.9 |
Triglycerides | 43.00 | 115.20 | 155.00 | 221.55 | 380.00 |
TCHOLU | 82 | 155 | 180 | 207 | 285 |
HDLU | 15 | 33 | 40 | 48 | 70 |
LDLU | 45 | 102 | 125 | 145 | 203 |
TGU | 43 | 112 | 150 | 210 | 357 |
SBP | 90.0 | 110.5 | 120.0 | 130.0 | 158.0 |
SBPU | 80 | 110 | 120 | 130 | 160 |
DBP | 50 | 69 | 76 | 85 | 105 |
DBPU | 55 | 69 | 76 | 80 | 95 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 30 | 47 | 54 | 60 | 79 |
WHR | 0.71 | 0.84 | 0.89 | 0.93 | 1.06 |
BMI | 17.20 | 25.28 | 28.60 | 32.46 | 43.21 |
Urea | 9 | 24 | 28 | 34 | 49 |
Creatinine | 0.360 | 0.625 | 0.720 | 0.810 | 1.080 |
Cholesterol | 100.0 | 177.0 | 205.0 | 237.0 | 325.1 |
HDL | 12.00 | 38.05 | 46.70 | 56.00 | 82.00 |
LDL | 60.0 | 116.0 | 140.0 | 167.9 | 243.9 |
Triglycerides | 32.00 | 116.00 | 156.00 | 216.05 | 364.00 |
TCHOLU | 96 | 168 | 193 | 220 | 296 |
HDLU | 16 | 40 | 48 | 57 | 82 |
LDLU | 49 | 109 | 130 | 153 | 219 |
TGU | 32 | 110 | 144 | 204 | 341 |
SBP | 80 | 110 | 120 | 130 | 160 |
SBPU | 80 | 110 | 120 | 130 | 160 |
DBP | 50 | 70 | 80 | 85 | 105 |
DBPU | 50 | 70 | 80 | 85 | 100 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 34 | 43 | 47 | 53 | 68 |
WHR | 0.74 | 0.87 | 0.92 | 0.96 | 1.09 |
BMI | 18.00 | 24.90 | 27.10 | 29.65 | 36.40 |
Urea | 12 | 24 | 28 | 32 | 43 |
Creatinine | 0.42 | 0.70 | 0.79 | 0.91 | 1.22 |
Cholesterol | 87.0 | 162.0 | 186.0 | 214.5 | 291.0 |
HDL | 17 | 38 | 45 | 53 | 75 |
LDL | 45 | 108 | 128 | 151 | 214 |
Triglycerides | 32 | 101 | 134 | 181 | 299 |
TCHOLU | 87.0 | 162.0 | 186.0 | 214.5 | 291.0 |
HDLU | 17 | 38 | 45 | 53 | 75 |
LDLU | 45 | 108 | 128 | 151 | 214 |
TGU | 32 | 101 | 134 | 181 | 299 |
SBP | 90 | 110 | 119 | 127 | 152 |
SBPU | 90 | 110 | 119 | 127 | 152 |
DBP | 46 | 66 | 70 | 80 | 100 |
DBPU | 46 | 66 | 70 | 80 | 100 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 34 | 42 | 46 | 51 | 64 |
WHR | 0.83 | 0.91 | 0.94 | 0.97 | 1.05 |
BMI | 19.10 | 25.25 | 27.30 | 29.45 | 35.60 |
Urea | 13 | 24 | 28 | 32 | 43 |
Creatinine | 0.440 | 0.750 | 0.855 | 0.965 | 1.270 |
Cholesterol | 87 | 159 | 181 | 208 | 281 |
HDL | 15 | 34 | 41 | 49 | 71 |
LDL | 45.0 | 105.0 | 126.0 | 146.5 | 203.0 |
Triglycerides | 44 | 107 | 140 | 192 | 318 |
TCHOLU | 87 | 159 | 181 | 208 | 281 |
HDLU | 15 | 34 | 41 | 49 | 71 |
LDLU | 45.0 | 105.0 | 126.0 | 146.5 | 203.0 |
TGU | 44 | 107 | 140 | 192 | 318 |
SBP | 90 | 111 | 120 | 128 | 152 |
SBPU | 90 | 111 | 120 | 128 | 152 |
DBP | 55 | 65 | 70 | 78 | 95 |
DBPU | 55 | 65 | 70 | 78 | 95 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 34.0 | 43.0 | 49.0 | 54.5 | 71.0 |
WHR | 0.65 | 0.81 | 0.87 | 0.93 | 1.11 |
BMI | 17.20 | 24.50 | 26.80 | 29.95 | 38.10 |
Urea | 11 | 21 | 26 | 32 | 47 |
Creatinine | 0.42 | 0.62 | 0.72 | 0.80 | 1.06 |
Cholesterol | 100 | 167 | 195 | 220 | 291 |
HDL | 19.0 | 43.0 | 51.0 | 59.5 | 84.0 |
LDL | 60 | 111 | 132 | 155 | 221 |
Triglycerides | 32 | 96 | 125 | 158 | 244 |
TCHOLU | 100 | 167 | 195 | 220 | 291 |
HDLU | 19.0 | 43.0 | 51.0 | 59.5 | 84.0 |
LDLU | 60 | 111 | 132 | 155 | 221 |
TGU | 32 | 96 | 125 | 158 | 244 |
SBP | 90 | 109 | 113 | 124 | 145 |
SBPU | 90 | 109 | 113 | 124 | 145 |
DBP | 50 | 68 | 71 | 80 | 95 |
DBPU | 50 | 68 | 71 | 80 | 95 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 31 | 50 | 57 | 63 | 82 |
WHR | 0.75 | 0.88 | 0.92 | 0.97 | 1.10 |
BMI | 17.48 | 25.81 | 28.95 | 32.68 | 42.96 |
Urea | 9 | 26 | 30 | 39 | 58 |
Creatinine | 0.26 | 0.67 | 0.79 | 0.95 | 1.35 |
Cholesterol | 91.00 | 176.00 | 207.10 | 242.05 | 340.00 |
HDL | 10 | 33 | 41 | 50 | 75 |
LDL | 51.0 | 118.0 | 143.0 | 175.0 | 256.9 |
Triglycerides | 41.00 | 139.00 | 185.00 | 253.05 | 417.10 |
TCHOLU | 91 | 161 | 187 | 215 | 296 |
HDLU | 12.0 | 35.5 | 42.0 | 52.0 | 76.0 |
LDLU | 49 | 106 | 127 | 148 | 209 |
TGU | 41 | 123 | 164 | 230 | 387 |
SBP | 70 | 110 | 130 | 140 | 180 |
SBPU | 80 | 110 | 120 | 130 | 160 |
DBP | 65 | 80 | 85 | 90 | 105 |
DBPU | 70 | 80 | 80 | 90 | 100 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 30 | 50 | 58 | 65 | 86 |
WHR | 0.84 | 0.94 | 0.97 | 1.01 | 1.11 |
BMI | 17.480 | 25.245 | 27.810 | 31.015 | 39.340 |
Urea | 6 | 26 | 32 | 43 | 66 |
Creatinine | 0.54 | 0.79 | 0.93 | 1.08 | 1.51 |
Cholesterol | 91.00 | 165.05 | 194.10 | 237.10 | 340.00 |
HDL | 10.10 | 30.10 | 36.70 | 44.05 | 63.00 |
LDL | 51.0 | 113.0 | 138.0 | 168.5 | 250.9 |
Triglycerides | 43.0 | 134.2 | 182.0 | 257.7 | 433.0 |
TCHOLU | 91.0 | 151.0 | 180.0 | 204.5 | 284.0 |
HDLU | 16 | 31 | 39 | 45 | 66 |
LDLU | 51.0 | 99.0 | 122.0 | 143.5 | 201.0 |
TGU | 43.0 | 122.0 | 166.0 | 230.5 | 377.0 |
SBP | 80 | 110 | 130 | 140 | 180 |
SBPU | 80 | 110 | 120 | 130 | 160 |
DBP | 65 | 80 | 85 | 90 | 105 |
DBPU | 70 | 80 | 80 | 90 | 100 |
Feature | Min | Lower Quartile | Median | Upper Quartile | Max |
---|---|---|---|---|---|
Age | 34.0 | 50.5 | 56.0 | 63.0 | 79.0 |
WHR | 0.73 | 0.85 | 0.89 | 0.93 | 1.05 |
BMI | 17.84 | 26.53 | 29.77 | 33.61 | 43.96 |
Urea | 6 | 24 | 30 | 36 | 54 |
Creatinine | 0.36 | 0.63 | 0.72 | 0.82 | 1.10 |
Cholesterol | 108.00 | 184.05 | 213.55 | 245.60 | 333.10 |
HDL | 12.0 | 36.0 | 44.1 | 53.0 | 78.0 |
LDL | 62.00 | 123.00 | 147.95 | 179.90 | 264.90 |
Triglycerides | 41.00 | 141.05 | 188.25 | 248.90 | 407.00 |
TCHOLU | 96.0 | 168.5 | 192.0 | 220.0 | 296.0 |
HDLU | 16 | 38 | 46 | 54 | 78 |
LDLU | 49.0 | 108.0 | 129.5 | 150.0 | 209.0 |
TGU | 41.0 | 123.0 | 164.0 | 229.5 | 387.0 |
SBP | 90 | 115 | 130 | 140 | 175 |
SBPU | 80 | 110 | 120 | 130 | 160 |
DBP | 65 | 80 | 85 | 90 | 105 |
DBPU | 70 | 80 | 80 | 90 | 100 |
Title | Technique | Measures |
---|---|---|
Use of Non-invasive Parameters and Machine Learning Algorithms for Predicting Future Risk of T2DM: A Retrospective Cohort Study of Health Data From Kuwait [66] | KNN, LR and SVM | AUC KNN (0.83), AUC LR (0.74), AUC SVM (0.73) |
The potential for a two-stage diabetes risk algorithm combining non-laboratory-based scores with subsequent routine non-fasting blood tests: results from prospective studies in older men and women [67] | LR | AUC LR (0.77) |
Sex Differences in Plasma MicroRNA Biomarkers of Early and Complicated Diabetes Mellitus in Israeli Arab and Jewish Patients [68] | Sequence Detection System (SDS) 2.3 (Applied Biosystems), Microsoft Excel, WinSTAT, StatPlus Mac LE (AnalystSoft, Walnut, CA, USA) software, and Student’s t-test. | Accuracy (0.77) and Sensitivity (0.79) |
Biochemical investigation of sex-specific association between insulin resistance and inflammatory biomarkers in types 2 diabetic patients [69] | Multiple linear regression, ANOVA | BMI versus insulin resistance: Pearson’s correlation coefficient 0.9188 (males) and 0.9694 (females); coefficient of determination 0.8442 (males) and 0.9398 (females); confidence interval 0.8349 to 0.9610 (males) and 0.9361 to 0.9855 (females); p < 0.0001 in all cases |
This work | Feature selection: RFE with AIC, LASSO, and GA with GALGO. ML models: LR, KNN, Nearcent, ANN, and SVM | Table 26, Table 27, Table 28, Table 29, Table 30, Table 31 and Table 32 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Morgan-Benita, J.A.; Celaya-Padilla, J.M.; Luna-García, H.; Galván-Tejada, C.E.; Cruz, M.; Galván-Tejada, J.I.; Gamboa-Rosales, H.; Sánchez-Reyna, A.G.; Rondon, D.; Villalba-Condori, K.O. Setting Ranges in Potential Biomarkers for Type 2 Diabetes Mellitus Patients Early Detection By Sex—An Approach with Machine Learning Algorithms. Diagnostics 2024, 14, 1623. https://doi.org/10.3390/diagnostics14151623