Article

Prediction of Type 2 Diabetes Based on Machine Learning Algorithm

Department of Information and Communications Engineering, Myongji University, 116 Myongji-ro, Yongin, Gyeonggi 17058, Korea
* Author to whom correspondence should be addressed.
Academic Editor: Giuseppe Banfi
Int. J. Environ. Res. Public Health 2021, 18(6), 3317; https://doi.org/10.3390/ijerph18063317
Received: 2 February 2021 / Revised: 15 March 2021 / Accepted: 17 March 2021 / Published: 23 March 2021
Predicting the occurrence of type 2 diabetes (T2D) allows a person at risk to take actions that can prevent onset or delay progression of the disease. In this study, we developed a machine learning (ML) model that predicts T2D occurrence in the following year (Y + 1) from variables measured in the current year (Y). The dataset consisted of electronic health records collected at a private medical institute from 2013 to 2018. To construct the prediction model, key features were first selected using ANOVA tests, chi-squared tests, and recursive feature elimination. The selected features were fasting plasma glucose (FPG), HbA1c, triglycerides, BMI, gamma-GTP, age, uric acid, sex, smoking, drinking, physical activity, and family history. We then applied logistic regression, random forest, support vector machine, XGBoost, and ensemble machine learning algorithms to these variables to classify the outcome as normal (non-diabetic), prediabetes, or diabetes. In our experiments, the prediction model performed reasonably well at forecasting the occurrence of T2D in the Korean population. The cross-validation (CV) results showed that the ensemble models outperformed the single models, and that CV performance improved when more medical history from the dataset was incorporated. The model can provide clinicians and patients with valuable predictive information on the likelihood of developing T2D.
Keywords: type 2 diabetes; machine learning; prediction
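The ANOVA-based feature selection step mentioned in the abstract can be illustrated with a minimal sketch. It computes a one-way ANOVA F-statistic per feature across the three outcome classes and ranks features by that score. The function name `anova_f`, the feature name `unrelated_marker`, and all numeric values are hypothetical toy inputs chosen for illustration; they are not taken from the study's dataset, and this is not the authors' implementation.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F-statistic for a single feature.

    `groups` is a list of lists, one list of measurements per class.
    A larger F means the class means differ more relative to the
    spread within each class.
    """
    k = len(groups)                       # number of classes
    n = sum(len(g) for g in groups)       # total number of samples
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy measurements per outcome class (NOT study data).
samples = {
    "fpg": {
        "normal": [88, 92, 90, 85, 95],
        "prediabetes": [104, 110, 108, 101, 112],
        "diabetes": [130, 142, 128, 150, 135],
    },
    "unrelated_marker": {  # made-up feature with no class separation
        "normal": [5.1, 6.0, 4.8, 5.5, 6.2],
        "prediabetes": [5.3, 5.9, 5.0, 6.1, 5.6],
        "diabetes": [5.4, 6.3, 4.9, 5.8, 5.7],
    },
}

# Score every feature, then rank from most to least discriminative.
f_scores = {
    name: anova_f(list(by_class.values()))
    for name, by_class in samples.items()
}
ranking = sorted(f_scores, key=f_scores.get, reverse=True)
print(ranking)
```

In practice, a score like this would apply only to continuous variables; categorical variables such as sex or smoking would be screened with a chi-squared test instead, as the abstract indicates.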

MDPI and ACS Style

Deberneh, H.M.; Kim, I. Prediction of Type 2 Diabetes Based on Machine Learning Algorithm. Int. J. Environ. Res. Public Health 2021, 18, 3317. https://doi.org/10.3390/ijerph18063317

Chicago/Turabian Style

Deberneh, Henock M., and Intaek Kim. 2021. "Prediction of Type 2 Diabetes Based on Machine Learning Algorithm" International Journal of Environmental Research and Public Health 18, no. 6: 3317. https://doi.org/10.3390/ijerph18063317
