Open Access Article

Investigating Health-Related Features and Their Impact on the Prediction of Diabetes Using Machine Learning

1 Computer Science Department, College of Computer Sciences and Information Technology (CCSIT), King Faisal University, P.O. Box 400, Al-Ahsa 31982, Saudi Arabia
2 Computer Science Department, College of Computers and Information Technology (CCIT), Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
3 Information Systems Department, College of Computer Sciences and Information Technology (CCSIT), King Faisal University, P.O. Box 400, Al-Ahsa 31982, Saudi Arabia
* Author to whom correspondence should be addressed.
Academic Editor: Jordi Solé-Casals
Appl. Sci. 2021, 11(3), 1173; https://doi.org/10.3390/app11031173
Received: 18 December 2020 / Revised: 20 January 2021 / Accepted: 23 January 2021 / Published: 27 January 2021
(This article belongs to the Special Issue Machine Learning Methods with Noisy, Incomplete or Small Datasets)
Diabetes Mellitus (DM) is one of the most common chronic diseases and can lead to severe, potentially fatal health complications. The disease burdens individuals, communities, and governments because it requires continuous monitoring, lifelong commitment, and costly treatment. The World Health Organization (WHO) ranks Saudi Arabia among the top 10 countries worldwide in diabetes prevalence. Since most medical services in Saudi Arabia are provided by the government, the cost of treatment in terms of hospital and clinic visits and lab tests represents a real burden, given the large scale of the disease. The ability to predict a patient's diabetic status from only a handful of features can allow cost-effective, rapid, and widely available screening, thereby lessening the health and economic burden of diabetes. The goal of this paper is to investigate the prediction of diabetic status and to compare the roles of HbA1c and FPG as input features. Using five different machine learning classifiers, together with feature elimination through feature permutation and hierarchical clustering, we obtained good accuracy, precision, recall, and F1-score on the dataset, implying that our data and features are not bound to specific models. In addition, the consistent performance across all the evaluation metrics indicates that there was no trade-off or penalty among them. Further analysis was performed on the data to identify the risk factors and their indirect impact on diabetes classification. Our analysis showed strong agreement with the risk factors for diabetes and prediabetes stated by the American Diabetes Association (ADA) and other health institutions worldwide. We conclude that analyzing the disease using selected features can identify important factors specific to the Saudi population, whose management can help control the disease. We also provide some recommendations learned from this research.
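As a rough illustration of the workflow the abstract describes (feature elimination through feature permutation and hierarchical clustering, evaluated with accuracy, precision, recall, and F1-score), the following is a minimal sketch using scikit-learn and SciPy. It is not the authors' code: the data are synthetic, the random forest stands in for the five classifiers (which the abstract does not name), and the Spearman-based distance, the Ward linkage, and the clustering threshold of 1.0 are all assumptions made for the example.

```python
from collections import defaultdict

import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in data (8 features, 3 of them redundant),
# not the patient dataset used in the paper.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4,
                           n_redundant=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)


def evaluate(columns):
    """Fit a classifier on the given columns and report the four metrics."""
    # ASSUMPTION: a random forest is used only as an example classifier.
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train[:, columns], y_train)
    pred = model.predict(X_test[:, columns])
    scores = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }
    return model, scores


all_columns = list(range(X.shape[1]))
full_model, full_scores = evaluate(all_columns)

# Feature permutation: shuffle one column at a time on held-out data and
# measure how much the model's score drops.
result = permutation_importance(full_model, X_test, y_test,
                                n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]

# Hierarchical clustering of features on Spearman rank correlation, so that
# strongly correlated features fall into the same cluster.
corr = spearmanr(X_train).correlation
corr = (corr + corr.T) / 2          # enforce exact symmetry
np.fill_diagonal(corr, 1.0)
distance = 1.0 - np.abs(corr)
linkage = hierarchy.ward(squareform(distance, checks=False))
# ASSUMPTION: the cut-off of 1.0 is arbitrary and would need tuning.
cluster_ids = hierarchy.fcluster(linkage, t=1.0, criterion="distance")

# Feature elimination: keep the most important feature from each cluster.
clusters = defaultdict(list)
for column, cluster_id in enumerate(cluster_ids):
    clusters[cluster_id].append(column)
selected = sorted(max(cols, key=lambda c: result.importances_mean[c])
                  for cols in clusters.values())

_, reduced_scores = evaluate(selected)
print("importance ranking:", ranking.tolist())
print("selected features:", selected)
print("all features:", full_scores)
print("reduced features:", reduced_scores)
```

Comparing `full_scores` with `reduced_scores` shows whether the reduced feature set costs anything on each of the four metrics; repeating `evaluate` with other classifier families would be one way to check that the result is not tied to a single model.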
Keywords: machine learning; prediction; feature importance; feature elimination; hierarchical clustering

MDPI and ACS Style

Ahmad, H.F.; Mukhtar, H.; Alaqail, H.; Seliaman, M.; Alhumam, A. Investigating Health-Related Features and Their Impact on the Prediction of Diabetes Using Machine Learning. Appl. Sci. 2021, 11, 1173. https://doi.org/10.3390/app11031173

AMA Style

Ahmad HF, Mukhtar H, Alaqail H, Seliaman M, Alhumam A. Investigating Health-Related Features and Their Impact on the Prediction of Diabetes Using Machine Learning. Applied Sciences. 2021; 11(3):1173. https://doi.org/10.3390/app11031173

Chicago/Turabian Style

Ahmad, Hafiz F.; Mukhtar, Hamid; Alaqail, Hesham; Seliaman, Mohamed; Alhumam, Abdulaziz. 2021. "Investigating Health-Related Features and Their Impact on the Prediction of Diabetes Using Machine Learning" Appl. Sci. 11, no. 3: 1173. https://doi.org/10.3390/app11031173
