Development and Validation of an Insulin Resistance Model for a Population with Chronic Kidney Disease Using a Machine Learning Approach

Background: Chronic kidney disease (CKD) is a complex syndrome without a definitive treatment. For these patients, insulin resistance (IR) is associated with worse renal and patient outcomes. Until now, no predictive model using machine learning (ML) has been reported on IR in CKD patients. Methods: The CKD population studied was based on results from the National Health and Nutrition Examination Survey (NHANES) of the USA from 1999 to 2012. The homeostasis model assessment of IR (HOMA-IR) was used to assess insulin resistance. We built models with several ML algorithms: random forest (RF), eXtreme Gradient Boosting (XGBoost), logistic regression, and a deep neural network (DNN). We compared the receiver operating characteristic (ROC) curves of the different algorithms. Finally, we used SHAP values (SHapley Additive exPlanations) to explain how the different ML models worked. Results: Of the 71,916 participants enrolled, 1,229 were included in the final analysis. Their data were segregated into the IR group (HOMA-IR > 3, n = 572) or non-IR group (HOMA-IR ≤ 3, n = 657). In the validation group, RF had a higher accuracy (0.77), specificity (0.81), PPV (0.77), and NPV (0.77). In the test group, XGBoost had a higher AUC of ROC (0.78). In addition, XGBoost also had a higher accuracy (0.7) and NPV (0.71). RF had a higher accuracy (0.7), specificity (0.78), and PPV (0.7). In the RF algorithm, the body mass index had a much larger impact on IR (0.1654), followed by triglyceride (0.0117), the daily calorie intake (0.0602), blood HDL value (0.0587), and age (0.0446). As for the SHAP value, in the RF algorithm, almost all features were well separated to show a positive or negative association with IR. Conclusion: This was the first study using ML to predict IR in patients with CKD. Our results showed that the RF algorithm had the best AUC of ROC and the best SHAP value differentiation.
This was also the first study that included both macronutrients and micronutrients. We concluded that ML algorithms, particularly RF, can help determine risk factors and predict IR in patients with CKD.


Introduction
Chronic kidney disease (CKD) is defined by the presence of renal damage or reduced renal function lasting for at least three months, irrespective of the cause. The care of

Data Source: National Health and Nutrition Examination Survey (NHANES)
The National Health and Nutrition Examination Survey (NHANES) is one of a series of health-related programs in the USA conducted periodically by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC). Its data are released to the public for free. The study protocol was approved by the research ethics review board at the NCHS, and all participants or proxies provided written informed consent. This large ongoing dietary survey is conducted to cross-sectionally assess the health and nutritional status of community-dwelling individuals in the USA. The examinations included anthropometric measurements, questionnaires on health and nutrition, and laboratory testing. Participants completed in-home interviews. We analyzed participants in the NHANES from 1999 to 2012. Participants were excluded from analyses if they were aged <18 years, had no data on estimated glomerular filtration rate (GFR), or were without complete data on anthropometric measurements, questionnaires, and laboratory tests.
Histories were collected for diabetes mellitus (DM), cardiovascular disease (CVD), smoking [20], and hypertension through specific questionnaires: MCQ160C for CVD, MCQ220 for cancer or malignancies, and DIQ010 for DM [21]. Mortality records were provided by NCHS and NHANES. These records were created by matching NHANES records with National Death Index (NDI) death certificates; the NDI is an NCHS centralized database of all deaths in the USA from 1979 onward. Causes of death were obtained from the NHANES.

Definition of Insulin Resistance (IR)
We used the homeostasis model assessment of IR (HOMA-IR) for assessing insulin resistance (IR). HOMA-IR is the most common method used to calculate IR. The outcome measure was IR, as indexed using HOMA-IR. Age, sex, race, year of assessment, BMI, and smoking status were covariates. The key potential confounding variable was waist circumference. Waist circumference was also used to test the presence of effect modification [23]. Increased HOMA-IR is known to be strongly associated with the development of type 2 DM, statistically independent of impaired glucose tolerance status, obesity, and body fat distribution [24]. Higher values of HOMA-IR were independently associated with the risk of developing prediabetes [25]. HOMA-IR uses the following formula to index insulin resistance: fasting plasma insulin (µU/mL) × fasting plasma glucose (mg/dL)/405 [7]. NHANES provided data on participants' measures of fasting insulin and fasting glucose, as well as detailed assessment procedures [20][26][27][28][29][30]. HOMA-IR varies with age, peaking at 13 years in girls and at 15 years in boys. A HOMA-IR value of 2.5 is an indicator of IR in adults [31]. However, no consensus has been reached regarding cutoff values of HOMA-IR in patients with different diseases. For example, in CKD patients, the cutoff value of HOMA-IR varies from study to study, e.g., being 1.23 in a comparative study [32], 2.0 in another study for the evaluation of renal function deterioration [19], and 5.64 in nondiabetic nonobese patients with CKD [33]. HOMA-IR index values have been reported to be higher in predialysis and dialysis patient groups compared with controls [34]. Most patients with high HOMA-IR have higher incidences of IR. In one study regarding reference ranges of HOMA-IR in normal-weight and obese young Caucasians [35], a HOMA-IR cutoff point of 3.02 gave an AUROC of 0.73 (95% CI = 0.70-0.75) with a sensitivity of 46.3% and a specificity of 86.2%.
Therefore, we set the cutoff value of IR in our CKD patients at values of HOMA-IR >3.
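The HOMA-IR formula and the cutoff described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example measurements are hypothetical and are not taken from the NHANES data or from the study code.

```python
# Minimal sketch of the HOMA-IR index and the IR cutoff (HOMA-IR > 3).
# Function names and example values are illustrative assumptions.

IR_CUTOFF = 3.0  # cutoff chosen in this study for CKD patients


def homa_ir(fasting_insulin_uU_ml: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-IR = fasting insulin (µU/mL) × fasting glucose (mg/dL) / 405."""
    if fasting_insulin_uU_ml < 0 or fasting_glucose_mg_dl < 0:
        raise ValueError("measurements must be non-negative")
    return fasting_insulin_uU_ml * fasting_glucose_mg_dl / 405.0


def is_insulin_resistant(fasting_insulin_uU_ml: float,
                         fasting_glucose_mg_dl: float,
                         cutoff: float = IR_CUTOFF) -> bool:
    """Label a participant as IR when HOMA-IR is strictly above the cutoff."""
    return homa_ir(fasting_insulin_uU_ml, fasting_glucose_mg_dl) > cutoff


# Hypothetical participant: insulin 15 µU/mL, glucose 100 mg/dL.
example_value = homa_ir(15.0, 100.0)                # 1500 / 405
example_label = is_insulin_resistant(15.0, 100.0)   # above the cutoff of 3
```

With a glucose value in mmol/L the denominator would differ, so the units of the input data should be checked before applying the formula.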

Model Building Process
We randomly selected 60% of the patient population as the training group. The remaining 40% were the testing group. The estimated proportion of IR was ~50% in patients with poor physical fitness.
The target population was identified in the training group. We used upsampling or downsampling to balance the sample between the target and nontarget populations. A deep neural network (DNN), one of the deep learning methods, was the first method chosen. Other traditional ML algorithms, such as random forest (RF), eXtreme Gradient Boosting (XGBoost), and logistic regression, were also used so that their accuracy could be compared with that of the DNN. After model training was completed, the testing group was used for validation. In the training group, ROC (receiver operating characteristic) curves from different algorithms were compared. Both the ROC curve and the AUC (area under the curve) were used for evaluating the classification performance of the different classifiers [36]. A target AUC of 0.80 was taken to suggest that the model was adequate for predicting IR [37,38]. Finally, we used SHAP values (SHapley Additive exPlanations) to explain how the different ML models worked [39].
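To make the AUC comparison concrete, the following sketch computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen IR participant receives a higher predicted score than a randomly chosen non-IR participant, with ties counted as one half. The labels and scores are invented for illustration, and the function name is an assumption; this is not the code used in the study.

```python
# Minimal sketch: AUC of ROC via pairwise comparison of positives and negatives.
# Labels and scores below are hypothetical.

def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = IR, 0 = non-IR) and model scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions from two classifiers on the same six participants.
labels = [1, 1, 0, 1, 0, 0]
scores_by_model = {
    "model_a": [0.9, 0.8, 0.7, 0.6, 0.4, 0.2],
    "model_b": [0.9, 0.8, 0.3, 0.6, 0.4, 0.2],
}
auc_by_model = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
best_model = max(auc_by_model, key=auc_by_model.get)
```

The pairwise loop is quadratic in the sample size, which is acceptable for a cohort of this scale; a sort-based version would be preferable for much larger datasets.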

Statistical Analysis
NHANES uses a complex, multistage survey design. To represent sample-weighted data and the differences between insulin resistance statuses, weighted means with 95% confidence intervals and weighted percentages were compared by the weighted Chi-square test and weighted regression test. Weighted data were calculated according to analytical guidelines [40]. The original unweighted variables were used for model building with machine learning and deep learning. For unweighted data, continuous variables were reported as means ± SD and categorical data as numbers (percentages). Differences in clinical variables between insulin resistance statuses were tested by using the Chi-square test for categorical variables or paired t-test for continuous variables. All reported p-values were two-sided and considered significant at p < 0.05. All statistical analyses were performed using SAS for Windows (version 9.4; SAS, Cary, NC, USA). Deep learning and other ML algorithms (including XGBoost, random forest, and DNN) were run with Keras (version 2.4.0), TensorFlow (version 1.10.0), and Python (version 3.6.5).
This study was approved by the Ethics Committee of Taichung Veterans General Hospital, IRB number: CE20023A-2. Moreover, all methods were performed in accordance with the relevant guidelines and regulations.

Results
In this study, 71,916 participants were enrolled at the first stage. After exclusion (32,295 participants due to ages <18 years, 23,008 participants due to incomplete laboratory data, 634 participants due to incomplete nutrition data, and 14,750 due to being non-CKD participants), we had 1229 participants with CKD and complete data for the final analysis (Figure 1). We randomly separated these 1229 participants into three groups: 675 for the training group, 185 for the validation group, and 369 for the test group. In the training group, we trained the models via XGBoost, random forest, logistic regression, and DNN. All algorithms were compared via AUC.
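A random three-way split with the group sizes reported above can be sketched as follows. The function name and the fixed seed are assumptions made for reproducibility of the illustration; the actual randomization procedure used in the study is not described in the text.

```python
# Minimal sketch: random split of participant indices into
# training, validation, and test groups of fixed sizes.
import random


def three_way_split(n_total, n_train, n_valid, seed=0):
    """Shuffle indices 0..n_total-1 and cut them into three disjoint groups."""
    if n_train + n_valid > n_total:
        raise ValueError("training and validation sizes exceed the total")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)   # seeded for a repeatable split
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]     # everything left over
    return train, valid, test


# Group sizes reported in the Results: 675 / 185 / 369 out of 1229.
train_idx, valid_idx, test_idx = three_way_split(1229, 675, 185)
```

Because the IR and non-IR groups are of similar size (572 vs. 657), a plain random split is likely to keep both classes represented in each group; a stratified split would guarantee it.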

Relative Importance of Parameters in XGBoost and Random Forest (RF) Algorithms
The highest AUC score of the ROC curve out of the four algorithms was achieved using XGBoost, with RF being the second highest. For the XGBoost and RF prediction models, the relative importance of their features was obtained (Figure 3A for XGBoost and Figure 3B for RF). The 32 features these models included were as follows: epidemiological domain (age, gender, and BMI), laboratory data domain (total cholesterol, triglyceride, and HDL), macronutrients domain (daily calorie intake, carbohydrate intake ratio, protein intake ratio, protein intake amount, and fat intake ratio), and micronutrients domain (daily cholesterol intake, monounsaturated fatty acid intake, saturated fatty acid intake, selenium, sodium, phosphorus, potassium, zinc, polyunsaturated fatty acid, copper, caffeine, fiber, calcium, vitamin B12, vitamin B6, iron, vitamin C, magnesium, folate, theobromine, and alcohol). In the XGBoost algorithm (Figure 3A), the BMI had a large impact on IR (0.1262), followed by triglyceride (0.0754), the protein intake ratio (0.0537), age (0.0487), blood total cholesterol value (0.0433), daily cholesterol intake (0.0430), daily calorie intake (0.0429), blood HDL value (0.0424), daily zinc intake (0.0420), and daily saturated fatty acid intake (0.0414). In the RF algorithm (Figure 3B), similarly, the BMI had a large impact on IR (0.1654), followed by triglyceride (0.0117), the daily calorie intake (0.0602), blood HDL value (0.0587), and age (0.0446). These differences contributed to the AUC differences for the XGBoost and RF algorithms.
The relative importance values from the XGBoost and RF algorithms do not show whether a selected feature is positively or negatively associated with IR; SHAP values were used for this purpose (Figure 4A for XGBoost and Figure 4B for RF). Some features were well separated to show a positive or negative association with IR in terms of the SHAP value for the XGBoost algorithm (Figure 4A), including the blood triglyceride value (strongly positive impact), BMI (strongly positive impact), daily calorie intake (medium negative impact), and blood total cholesterol value (medium negative impact). For the RF algorithm (Figure 4B), almost all features were well separated to show a positive or negative association with IR in terms of the SHAP value. In contrast, the association of age with IR was not well discriminated in the RF algorithm. SHAP values for all features are shown in Supplementary Figure S1 for the XGBoost algorithm and in Supplementary Figure S2 for the RF algorithm.
(A) Feature importance using XGBoost. (B) Feature importance using random forest (RF).

Discussion
IR and hyperinsulinemia are associated with CKD [41][42][43] and cardiorenal metabolic syndrome [44][45][46]. Various observational studies also reported the association between IR and the development of CKD independent of type 2 DM [15,47,48]. Data from NHANES also suggested a strong relationship between CKD and IR independent of type 2 DM [41,49,50]. Therefore, how to predict IR in patients with CKD is important to clinicians. In a study on a chronic renal insufficiency cohort (CRIC) without DM [51], a multivariable-adjusted analysis showed many independent factors associated with a higher HOMA-IR, including age, non-smoking, BMI, waist circumference, hemoglobin, LDL, HDL, triglyceride, and C-reactive protein. In addition to the above factors, micronutrients have been well reviewed; their altered levels were associated with the trajectory toward IR, DM, and oxidative stress, and provided disease-relevant information [52]. Until now, no study had been published on the association between micronutrients and IR in CKD patients. Our present study was the first one on such an association in a CKD population. Moreover, this was also the first study using AI to develop a predictive model for IR in CKD patients. In a review article [53], the principles of ML are described as building algorithms to support predictive models for the outcome. AI can also introduce a paradigm shift in disease care from conventional treatment strategies to targeted, data-driven, individualized care.
For our predictive models in the validation group, RF had the highest AUC of ROC (0.83), specificity (0.81), PPV (0.77), and NPV (0.77). Similarly, in the testing group, the AUC of ROC of the RF algorithm (0.77) was second only to that of the XGBoost algorithm (0.78). In general, both ML algorithms (XGBoost and RF) appeared to be better predictive models compared with logistic regression. However, between XGBoost and RF, we preferred the RF algorithm because the impacts of its features on IR were easier to explain based on the SHAP value. SHAP extends the Shapley value from cooperative game theory and is used to calculate the contributions of features in ML. The SHAP value has been widely used for evaluating the contribution of each feature in predictive models, such as a network-pharmacokinetic model [54], extubation failure in intensive care units [55], a multivariate molecular diagnostic test [56], and the factors associated with the rapid treatment of sepsis [57]. In our present study, the positive and negative SHAP values could be separated more clearly in the RF algorithm (Figure 4B) than in the XGBoost algorithm (Figure 4A). Detailed information on the SHAP values of all features in the RF algorithm can be found in Supplementary Figure S2. For some features, the SHAP value increased with increasing values of the feature (increasing pattern). Such features included the BMI, blood triglyceride, ratio of protein intake, total cholesterol intake, phosphorus intake, magnesium intake, zinc intake, copper intake, and selenium intake. Conversely, for other features, the SHAP value decreased with increasing values of the feature (decreasing pattern). Such features included the blood HDL, blood total cholesterol, total protein amount, calorie intake, carbohydrate intake ratio, folate intake, vitamin B6 intake, calcium intake, and caffeine intake.
Some features had a "J-shape pattern", including the saturated fatty acid intake, monounsaturated fatty acid intake, polyunsaturated fatty acid intake, and fat intake ratio. Interestingly, all features with a J-shape pattern were fat-related factors. Even though some micronutrients had low feature importance (Figure 3B), they still had a specific pattern of impact on IR. This condition cannot be identified using traditional statistical analyses.
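The idea behind these feature contributions can be illustrated with an exact Shapley value calculation on a toy model. The sketch below averages each feature's marginal contribution over all orderings, replacing baseline values with the participant's values one feature at a time. It is an assumption-laden illustration: the toy risk function, its coefficients, and the feature values are hypothetical, and the SHAP library used for tree models relies on more efficient algorithms than this brute-force enumeration.

```python
# Minimal sketch: exact Shapley values for one prediction, by enumerating
# all feature orderings. The toy model and all numbers are hypothetical.
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Attribute predict(instance) - predict(baseline) to individual features."""
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)           # start from the reference input
        previous = predict(current)
        for i in order:
            current[i] = instance[i]       # switch feature i to its real value
            updated = predict(current)
            phi[i] += updated - previous   # marginal contribution of feature i
            previous = updated
    return [total / len(orderings) for total in phi]


def toy_risk(x):
    """Hypothetical IR risk score from BMI, triglyceride, and HDL."""
    bmi, triglyceride, hdl = x
    return 0.02 * bmi + 0.001 * triglyceride - 0.005 * hdl + 0.0001 * bmi * triglyceride


participant = [32.0, 200.0, 40.0]   # hypothetical participant
reference = [25.0, 120.0, 55.0]     # hypothetical baseline values
contributions = shapley_values(toy_risk, participant, reference)
```

In this toy example the contributions sum to the difference between the participant's score and the baseline score, which is the additivity property that lets SHAP values be read as signed, per-feature impacts. A positive contribution pushes the prediction toward IR and a negative one pushes it away.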
In the RF algorithm (Figure 3B), the BMI and blood triglyceride value were important features with a large impact on IR, and they presented an "increasing pattern" (Supplementary Figure S2). The BMI and blood triglyceride levels were also reported to be associated with IR in the CKD cohort [51]. IR could cause the overproduction of very-low-density lipoprotein (VLDL), which contributes to hypertriglyceridemia [58], which further induces the progression of renal dysfunction [59]. Similarly, in a prospective study, serum triglyceride levels were found to be higher in patients whose nephropathy progressed compared with those in whom it did not (median 1.21 (range 0.41-2.96) vs. 0.91 (0.31-11.07) mmol/L; p = 0.0037) [60]. The above results were consistent with our present evidence on the importance of the serum triglyceride level for IR development in CKD patients.
Fat-related nutrients (saturated fatty acid intake, monounsaturated fatty acid intake, polyunsaturated fatty acid intake, and fat intake ratio) had a J-shape pattern, which indicated that too high or too low an intake of fat-related nutrients was associated with increased IR in this CKD cohort. Fatty acid-mediated IR can be found in various organs [61]. In another review article [62], IR in CKD patients was strongly associated with free fatty acid levels. Tumor necrosis factor alpha can activate adipose tissue lipolysis, generating free fatty acid. In muscle cells, free fatty acid may further activate many transcription factors and downstream signal transduction pathways. Finally, this pathway causes IR in CKD patients. In addition, the changes in the plasma free fatty acid level correlated linearly with intramyocellular triglyceride (r = 0.74, p < 0.003) [63]. The free fatty acid impairment of insulin sensitivity has been repeatedly reported [64]. The fatty acid composition of the diet could have a significant role in modulating IR. Interestingly, in our study with ML, we found that a very low intake of fatty acid-related nutrients also mildly increased IR. The association between fatty acid and IR in CKD patients resembled a J shape rather than a linear increasing pattern. This could be partially explained by malnutrition and inflammation, two important players in the reverse epidemiology of this population [65], both of which counteract insulin sensitivity [66].
In this model with ML, we also found that in our patients, micronutrients were associated with the IR, including an increasing pattern (copper intake and selenium intake), and a decreasing pattern (caffeine intake).
The greater the intake of copper (>1 mg/day), the more significantly IR increased (SHAP value of RF in Supplementary Figure S2). A study on obese Malaysian adults [67] reported a significant positive association between dietary copper and HOMA-IR with intakes of Cu ≥ 13.4 µg/kg/day, 0.276 (CI = 0.025-0.526; p = 0.033). Excess copper intake might create oxidative stress, which further favors the progression of T2DM [68].
The greater the intake of selenium (>50 µg/day), the more significantly IR increased (SHAP value of RF in Supplementary Figure S2). A cross-sectional study on 5423 middle-aged and elderly Chinese participants [69] reported a strong positive correlation between selenium intake and DM. Other cross-sectional surveys on 8876 adults in the US NHANES also showed a positive correlation between a higher serum selenium value and DM [70][71][72]. A hospital-based case-control study on 847 adults [73] showed that the odds ratios of having DM in the second, third, and fourth selenium quartile groups were 1.24 (95% CI 0.78 to 1.98, p > 0.05), 1.90 (95% CI 1.22 to 2.97, p < 0.05), and 5.11 (95% CI 3.27 to 8.00, p < 0.001), respectively, after being adjusted for age, gender, current smoking, current drinking, and physical activity. The recommended daily requirement of selenium is 55 µg for adults [52], consistent with our findings of no more than 50 µg.
Regarding the caffeine intake, the greater the intake (>200 mg/day), the lower the IR, based on the SHAP value. In a rat model, chronic caffeine intake reversed aging-induced IR by lowering NEFA production and increasing Glut4 expression in skeletal muscles [74]. Caffeine reduces the production of superoxide and the expression of the receptor of advanced glycation end-product at the nucleus tractus solitarii, though it enhances insulin receptor substrate 1-phosphatidylinositol 3-kinase-Akt-neuronal nitric oxide synthase signaling [75]. In another rat model of a high-fat and high-sucrose diet [76], long-term caffeine intake prevented IR, a result related to a drop in circulating catecholamines. In a US NHANES survey (2009-2010 and 2011-2012), caffeine and its metabolites were found to be positively related to IR and beta cell function [77]. In our predictive model with ML, we also found the importance of caffeine intake on IR.
The strength of this study was its novelty, namely, being the first study to investigate predictive models of IR in patients with CKD via ML. Moreover, compared with other studies on predictive models of AI, this was the first study to have included so many macronutrients and micronutrients. Using SHAP values, the impact of each feature on IR could be better explained. Importantly, the clinical implication of this study was that our predictive model via ML could provide, on a daily basis, an individual warning in our clinical practice. IR in CKD is strongly associated with atherosclerosis. Early prediction of IR in patients with CKD may therefore help to avoid further CVD and functional deterioration. In the future, we aim to validate this algorithm in another large database via federated learning. We also plan to correlate IR in CKD with clinical outcomes, including CVD, renal function deterioration, and all-cause mortality, via machine learning.
There were several limitations in our study. First, we did not have features of genetic data. Second, data in this study were from cross-sectional surveys only. Third, we did not have enough data to separate patients into different stages of CKD. However, in the current form of the dataset, we believe our predictive model of IR in patients with CKD can explain IR based on features, which can be obtained in clinical practice. In the future, we may need to enroll even more features, such as genetic factors, to train our predictive model better.

Conclusions
This was the first study using ML to predict IR in patients with CKD. In our study, the RF algorithm had the best AUC of ROC and the best differentiation of SHAP values. This was also the first study including both macronutrients and micronutrients. We concluded that ML algorithms, particularly RF, can help determine risk factors and predict IR in patients with CKD.