The Diagnostic and Prognostic Value of the Triglyceride-Glucose Index in Metabolic Dysfunction-Associated Fatty Liver Disease (MAFLD): A Systematic Review and Meta-Analysis

Metabolic dysfunction-associated fatty liver disease (MAFLD) has been related to a series of harmful health consequences. The triglyceride-glucose index (TyG index) appears to be associated with MAFLD. However, no consistent conclusions about the TyG index and incident MAFLD have been reached. PubMed, MEDLINE, Web of Science, EMBASE and the Cochrane Library were searched. Sensitivities, specificities and the area under the receiver operating characteristic curve (AUC) with a random-effects model were used to assess the diagnostic performance of the TyG index in NAFLD/MAFLD participants. Potential threshold effects and publication bias were evaluated by Spearman’s correlation and Deeks’ asymmetry test, respectively. A total of 20 studies with 165,725 MAFLD participants were included. The summary receiver operating characteristic (SROC) curve showed that the sensitivity, specificity and AUC were 0.73 (0.69–0.76), 0.67 (0.65–0.70) and 0.75 (0.71–0.79), respectively. Threshold effects (r = 0.490, p < 0.05) were confirmed to exist. Subgroup analyses and meta-regression showed that factors including country, number of samples, age and disease situation were sources of heterogeneity (p < 0.05). Our meta-analysis suggests that the TyG index can diagnose and predict MAFLD with good accuracy. The number of studies remains limited, and prospective studies are needed.


Introduction
With rapid economic growth, changes in lifestyle and dietary structure, and the rising prevalence of obesity, nonalcoholic fatty liver disease (NAFLD) has emerged as the most common liver disorder and the most important cause of chronic liver disease. It spans a spectrum from simple steatosis to nonalcoholic steatohepatitis (NASH), liver fibrosis, cirrhosis and hepatocellular carcinoma [1]. The global burden of NAFLD has progressively increased, and a meta-analysis covering 22 countries estimated its global prevalence at 25.2%. Regional differences exist: the prevalence of NAFLD is 27.37% in Asia, 24.13% in North America, 31.79% in the Middle East (the region with the highest prevalence) and 13.48% in Africa (the region with the lowest prevalence) [2]. Younossi reported approximately 52 million individuals with NAFLD across Germany, Italy, France and the UK, incurring a total cost of EUR 35 billion yearly, with per-person direct medical costs in these nations estimated at EUR 354 to 1163 [3]. NAFLD is associated with type 2 diabetes mellitus, hypertension, dyslipidaemia and cardiovascular disease [4], thus causing a severe clinical and economic burden worldwide. In addition, most patients suffering from NAFLD

Table 1. Search strategy.

NAFLD/MAFLD: Non-alcoholic fatty liver OR Non alcoholic Fatty Liver Disease OR Nonalcoholic Fatty Livers OR NAFLD OR Nonalcoholic Fatty Liver Disease OR Nonalcoholic OR Nonalcoholic Steatohepatitis OR nonalcohol-related fatty liver disease OR non-alcohol-related fatty liver disease OR non-alcohol related fatty liver disease OR Nonalcoholic Steatohepatitides OR NASH OR nonalcoholic fatty liver disease OR Nonalcoholic Fatty Liver Disease OR Nonalcoholic Steatohepatitis OR non-alcoholic steatohepatitis OR fatty liver OR NASH/non-alcoholic steatohepatitis OR nonalcohol-related fatty liver disease OR non-alcohol related fatty liver disease OR Metabolic dysfunction-associated fatty liver disease OR MAFLD OR MAFLD-related cirrhosis OR metabolic associated fatty liver disease

TyG index: triglyceride-glucose index OR triglyceride glucose index OR TyG index OR triglyceride and glucose index OR triglyceride-glucose (T/Gly) index OR TyGs OR triglyceride glucose indices OR The triglyceride-glucose index OR Triglyceride/glucose index OR Triglycerides and glucose index OR triglycerides/glucose Index (TyG Index)

Inclusion Criteria and Exclusion Criteria
The inclusion criteria were as follows: (a) subjects: patients diagnosed with NAFLD/MAFLD (based on the corresponding clinical guidelines, which mainly rely on imaging methods such as ultrasound and on laboratory biochemical indicators, with alcohol excluded as the cause), all of whom underwent triglyceride (TG) and fasting plasma glucose (FPG) measurement; and (b) the diagnostic accuracy of the TyG index was reported, such as sensitivity, specificity, area under the curve (AUC) and 95% confidence interval (CI). The exclusion criteria were as follows: (a) insufficient initial data to obtain true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN); (b) duplicate literature, in vitro studies and nonhuman studies; (c) review articles, editorials or letters; and (d) conference literature or case studies.

Data Extraction
Two researchers independently evaluated all relevant papers, extracted potentially eligible data, and discussed and resolved disagreements with relevant experts. The following data from all included relevant studies were extracted: (a) first author name, year of publication, country, study design, age, and number of participants; and (b) TP, FP, TN, FN, reference standards, cut-off values, AUCs and 95% CIs.
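As a minimal sketch of how the extracted 2×2 counts relate to the accuracy measures analysed later, the following Python snippet derives sensitivity and specificity from TP, FP, TN and FN. The function name and the example counts are hypothetical and are not taken from any included study.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased participant")
    sensitivity = tp / (tp + fn)  # share of diseased participants who test positive
    specificity = tn / (tn + fp)  # share of non-diseased participants who test negative
    return sensitivity, specificity


# Hypothetical counts, for illustration only
sens, spec = sensitivity_specificity(tp=73, fp=33, tn=67, fn=27)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Studies that report only summary statistics without these four counts cannot be entered into this calculation, which is why they fall under exclusion criterion (a).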

Quality Assessment
The methodological quality of the included literature was evaluated independently by two investigators using Review Manager (RevMan, version 5.4; Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014), based on the Cochrane risk-of-bias criteria [12], which cover four domains (patient selection, index test, reference standard, and flow and timing). Each domain was assessed in terms of risk of bias, and the first three domains were additionally assessed for concerns regarding applicability. Signaling questions were included to assist in judging the risk of bias. The QUADAS-2 tool permits a more transparent rating of the bias and applicability of primary diagnostic accuracy studies.

Statistical Analysis
Statistical analyses were conducted using Meta-DiSc version 1.4 and Stata (version 16.0, StataCorp LLC, College Station, TX, USA). Heterogeneity was examined through Cochran's Q-value and Higgins' I-squared (I²) [13,14]. I² values of 25%, 50% and 75% indicate low, moderate and high heterogeneity, respectively [15]. I² > 50% or a p value < 0.05 indicated that heterogeneity existed, in which case a random-effects coefficient binary regression model was used [13]. Publication bias was assessed visually via Deeks' funnel plot and formally via the asymmetry test, and it was regarded as present if the slope coefficient was nonzero (p < 0.10) [16].
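The relationship between Cochran's Q and I² can be sketched in a few lines of Python. This is an illustrative calculation using the standard Higgins formula, not the output of Meta-DiSc or Stata; the function name and the input values are hypothetical.

```python
def i_squared(q, k):
    """Higgins' I-squared (in percent) from Cochran's Q and the number of studies k."""
    df = k - 1  # degrees of freedom of Q
    if q <= 0 or df <= 0:
        return 0.0
    # I^2 is the share of total variation beyond chance; negative values are set to 0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: Q = 40 across 21 studies
i2 = i_squared(q=40.0, k=21)
print(f"I^2 = {i2:.1f}%")  # (40 - 20) / 40 = 50.0%
```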
In the meta-analysis of diagnostic test accuracy, one of the principal reasons for heterogeneity is the threshold effect, which arises when different studies use different cut-off values or thresholds to define a positive (or negative) test result [14]. When a threshold effect exists, there is a negative correlation between sensitivities and specificities (or a positive correlation between sensitivities and 1-specificities) [17], which results in a typical "shoulder arm" pattern in a receiver operating characteristic (ROC) curve. Potential threshold effects were investigated using Spearman's correlation coefficient between the logit of sensitivity and the logit of 1-specificity. A strong positive correlation would indicate a threshold effect, and a p value < 0.05 was considered statistically significant [14]. When heterogeneity was due to the threshold effect, it was deemed appropriate to pool the accuracy data by fitting a summary receiver operating characteristic (SROC) curve [18]. A hierarchical summary receiver operating characteristic (HSROC) curve and bivariate boxplot were constructed with the summary points displayed.
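The threshold-effect check described above can be sketched as follows. This is a standard-library Python illustration of Spearman's rank correlation between logit(sensitivity) and logit(1 − specificity); it omits the p value, the helper names are hypothetical, and the study values are invented for illustration rather than drawn from the included studies.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def threshold_spearman(sens, spec):
    """Spearman's rho between logit(sensitivity) and logit(1 - specificity)."""
    x = [logit(s) for s in sens]
    y = [logit(1.0 - s) for s in spec]
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical studies in which sensitivity rises as specificity falls
sens = [0.60, 0.70, 0.80, 0.90]
spec = [0.85, 0.75, 0.65, 0.55]
rho = threshold_spearman(sens, spec)
print(f"Spearman rho = {rho:.2f}")
```

In this constructed example the two series move together perfectly, so rho is 1; real data would give an intermediate value that must be judged together with its p value.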
The AUC was used to assess the ability of the TyG index to screen for MAFLD. An AUC of 0.75 to 0.92 is considered good. An AUC below 0.75 may still be reasonable; however, the test then has evident shortcomings in its diagnostic accuracy [19].
The diagnostic odds ratio (DOR) is the ratio of the odds of a positive test result in individuals with the disease to the odds of a positive test result in individuals without the disease (false positivity). The DOR can be derived from logistic models, which makes it possible to include additional variables to correct for heterogeneity [20]. The Cochran Q-value of the DOR was used to test for non-threshold sources of variation. Using restricted maximum likelihood estimation (REML) and inverse variance weighted least squares, we performed a subgroup analysis to find the source of heterogeneity. Design (retrospective or prospective), experiment (described in detail or not clear), population (described in detail versus approximately), country (China or other), number of samples (≥1000 or <1000), age (<18 or ≥18), baseline disease (yes or no), and reference standard (imaging examination or not) were used as the covariates. The factors found to be strongly related to accuracy were then included one by one in the bivariate model to examine the overall sensitivity and overall specificity across different strata [15].
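To make the DOR concrete, the sketch below computes the positive likelihood ratio (PLR), negative likelihood ratio (NLR) and DOR from a single sensitivity and specificity pair. The function name is hypothetical. The inputs are the pooled sensitivity and specificity reported in this article; a DOR pooled across studies is estimated from study-level data, so it need not equal the value computed from pooled summary points.

```python
def likelihood_ratios_and_dor(sens, spec):
    """Return (PLR, NLR, DOR) for one sensitivity/specificity pair."""
    if not (0.0 < sens < 1.0 and 0.0 < spec < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    plr = sens / (1.0 - spec)   # how much a positive result raises the odds of disease
    nlr = (1.0 - sens) / spec   # how much a negative result lowers the odds of disease
    dor = plr / nlr             # equivalent to (TP * TN) / (FP * FN)
    return plr, nlr, dor


plr, nlr, dor = likelihood_ratios_and_dor(sens=0.73, spec=0.67)
print(f"PLR = {plr:.2f}, NLR = {nlr:.2f}, DOR = {dor:.2f}")
```

A DOR of 1 means the test does not discriminate between diseased and non-diseased individuals; larger values indicate better discrimination.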
The risk of bias within the included studies was assessed via QUADAS-2 (Figure 2). According to QUADAS-2, all studies fulfilled more than 11 items from the 14-item QUADAS-2 checklist, which indicated overall high quality. Meanwhile, no significant publication bias was found (p = 0.81 for the slope coefficient), as shown in Figure 3.

Diagnostic Efficiency (Threshold Effect)
The results of our meta-analysis showed that the AUC of the ROC was 0.75. However, there was a positive correlation between sensitivities and 1-specificities in the ROC curve (r = 0.490, p = 0.015), resulting in a typical "shoulder arm" pattern. This indicated that a notable threshold effect existed (Figure 4A) and caused variations in accuracy estimates among the individual studies. Therefore, SROC and HSROC analyses were performed. In the SROC analysis, the pooled sensitivity and specificity with 95% CI were 0.73 (0.69, 0.76) and 0.67 (0.65, 0.70), respectively (Figure 4B).

In addition, the HSROC curve (green line) is shown in Figure 4C. The curve was asymmetrical (β = −0.44, 95% CI = −0.84, −0.05, z = −2.21, p = 0.027), and the diagnostic and prognostic value of the TyG index in MAFLD was accurate (lambda = 1.70, 95% CI = 1.54, 1.85). Circles indicate study estimates, boxes denote summary points, blue dashed lines indicate 95% prediction regions, and orange dashed lines denote 95% confidence regions. The bivariate boxplot in Figure 4D shows the distribution of the results of all included studies. Taken together, these results showed that the TyG index could diagnose and predict MAFLD with good diagnostic efficiency (AUC = 0.75, 95% CI = 0.71, 0.79).
Meanwhile, some studies have suggested that the TyG index is a powerful tool for diagnosing and predicting the outcome of related diseases when combined with other markers, such as body mass index (BMI) and waist circumference (WC). The specific details of the diagnostic and prognostic value of the TyG index and TyG index-related parameters are shown in Table 3.

Different Cut-Off Values of the TyG Index
The presence of threshold effects could lead to differences in sensitivity and specificity. The cut-off values in this meta-analysis ranged from 0.146 to 8.7 (the data in four studies were missing). To assess the diagnostic performance of the TyG index at different cut-off values, subgroup analyses were conducted (cut-off values of TyG: <6 (n = 3), 6-8 (n = 4), 8-8.5 (n = 6), and ≥8.5 (n = 6)). The detailed results are shown in Table 4. For the TyG cut-off < 6 group, the pooled sensitivity and specificity with 95% CI were 0.75 (0.71, 0.78) and 0.74 (0.72, 0.76), respectively. The AUC was 0.81 ± 0.01, indicating a high diagnostic value. Meanwhile, we found that the diagnostic value of TyG for MAFLD decreased as the TyG cut-off value increased. Therefore, determining the optimal cut-off value requires more data and consideration of the specific clinical situation (Figure 5, Table 4).


Non-Threshold Effect
As shown in Figure 6 (DOR = 5.56, 95% CI = 4.41, 7.02, Q = 1618.50, p < 0.001), part of the heterogeneity could also be due to non-threshold effects. We further investigated the source of heterogeneity through sensitivity analysis, subgroup analysis and meta-regression. The results indicated that country, number of samples, age and disease situation were sources of heterogeneity in sensitivity and specificity (p < 0.05) (Figure 7).

Table 4 (fragment). Values recovered from the table: 0.81 ± 0.01, 0.77 ± 0.01, 0.75 ± 0.02, 0.72 ± 0.02. PLR: Positive Likelihood Ratio; NLR: Negative Likelihood Ratio.


Discussion
Past studies have demonstrated that obesity, metabolic disorders, and environmental factors contribute to the development and progression of MAFLD. The prevalence of MAFLD is growing rapidly, bringing a host of adverse consequences [2,40,41]. A variety of indices have been proposed for the early detection of NAFLD, such as the fatty liver index, NashTest, hepatic steatosis index, SteatoTest, OxNASH score, aspartate aminotransferase/alanine aminotransferase (AST/ALT) ratio, enhanced liver fibrosis panel (ELF), aspartate aminotransferase to platelet ratio index (APRI) and fibrosis-4 score (FIB-4) [42-46]. However, because of the complex calculations involved and their high cost, these indices are difficult to use extensively in clinical practice [47]. The mathematical model of the TyG index was first derived by Simental and other scholars for assessing insulin resistance (IR) [48-50]. As the calculation of TyG requires only triglyceride and fasting blood glucose, it is well suited to large-scale epidemiological investigation [48]. Meanwhile, existing evidence suggests that triglycerides and fasting blood glucose are involved in the formation of fatty liver, and IR is considered important in the pathogenesis of MAFLD. Moreover, the TyG index is closely associated with NAFLD and is regarded as an effective, practical, and low-cost tool for identifying individuals at risk of hepatic steatosis with high sensitivity and specificity [47]. Therefore, the TyG index could be a good diagnostic index for MAFLD.
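The text above does not restate the formula. The commonly cited definition is TyG = ln[fasting triglycerides (mg/dL) × fasting glucose (mg/dL) / 2], and the sketch below assumes that definition; individual studies may use other units or variants, which would shift the resulting values and cut-offs. The function name and the example measurements are hypothetical.

```python
import math


def tyg_index(tg_mg_dl, fpg_mg_dl):
    """TyG index, assuming TyG = ln(TG [mg/dL] * FPG [mg/dL] / 2)."""
    if tg_mg_dl <= 0 or fpg_mg_dl <= 0:
        raise ValueError("triglyceride and glucose values must be positive")
    return math.log(tg_mg_dl * fpg_mg_dl / 2.0)


# Hypothetical patient: TG 150 mg/dL, FPG 100 mg/dL
tyg = tyg_index(150.0, 100.0)
print(f"TyG = {tyg:.2f}")  # ln(7500), about 8.92
```

Because only two routine laboratory values enter the calculation, the index can be computed retrospectively from existing health-check data.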
Existing studies have demonstrated the diagnostic value of some of the above indices in related liver diseases. One study revealed that the AUCs of the FIB-4 index, NAFLD fibrosis score (NFS) and BARD score for predicting advanced fibrosis in NAFLD were 0.744, 0.702 and 0.733, respectively [51]. In addition, a meta-analysis proposed that APRI and FIB-4 could detect hepatitis B-related fibrosis with moderate sensitivity and accuracy [52]. However, given the limited sample size and power of individual studies, we performed a meta-analysis to estimate the diagnostic value of the TyG index more reliably. The results indicated that the TyG index had a sensitivity of 73% and a specificity of 67% in diagnosing MAFLD, with a pooled AUC of 0.75. Meanwhile, TyG index-related parameters, which combine the TyG index with BMI, WC, and waist-to-height ratio (WHtR), were first reported by Ko et al. [53]. They proposed that TyG index-related parameters had the highest AUC values for predicting IR compared with visceral obesity indicators, lipid parameters and adipokines. Among them, TyG-BMI has the largest AUC for identifying IR, and it has also been found to be strongly associated with cardiovascular and cerebrovascular diseases such as hypertension and ischaemic stroke [54,55]. Additionally, compared with the TyG index and the homeostasis model assessment of insulin resistance (HOMA-IR), TyG index-related parameters, including TyG-BMI and TyG-WC, showed better detection ability for MAFLD, liver fibrosis and moderate-to-advanced fibrosis, especially in younger people and patients with diabetes [56].
Notably, TyG-BMI shows excellent predictive performance in detecting NAFLD in young and middle-aged people, and the correlation between liver fibrosis and TyG-BMI was stronger [35,57]. Importantly, as TyG-BMI is calculated from FPG, TG and BMI, it is easy to obtain in the clinic, which favours its rapid adoption [57]. Our meta-analysis was consistent with previous results. However, because the number of articles is limited, more research focusing on TyG-related parameters is needed to assess their value.
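The TyG index-related parameters can be illustrated with a short Python sketch. It assumes the commonly used product definitions (TyG-BMI = TyG × BMI, TyG-WC = TyG × WC, TyG-WHtR = TyG × WHtR); the article states only that these parameters combine the TyG index with the anthropometric measures, so the exact formulas, the function name and the example values here are assumptions.

```python
def tyg_related_parameters(tyg, bmi, wc_cm, height_cm):
    """TyG-derived parameters, assuming each is the product of TyG and the measure."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    whtr = wc_cm / height_cm  # waist-to-height ratio
    return {
        "TyG-BMI": tyg * bmi,
        "TyG-WC": tyg * wc_cm,
        "TyG-WHtR": tyg * whtr,
    }


# Hypothetical patient: TyG 8.9, BMI 25 kg/m^2, waist 90 cm, height 180 cm
params = tyg_related_parameters(tyg=8.9, bmi=25.0, wc_cm=90.0, height_cm=180.0)
for name, value in params.items():
    print(f"{name} = {value:.2f}")
```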
Owing to the significant threshold effect, fitting the SROC and HSROC curves was considered a more suitable method of evaluating the diagnostic value. Meanwhile, although threshold effects were one source of heterogeneity in our study, non-threshold effects were also found. Meta-regression and subgroup analyses were performed to assess the potential influencing factors, including population, disease, country, samples and age. Our results confirmed that these factors contributed to the heterogeneity. In addition, the range of optimal cut-off values for the included studies was wide, possibly due to differences in sample size and age among the included studies. Sheng et al. proposed that the TyG index had higher overall predictive performance in the younger population (age 18-30 years) [24].
Some inherent limitations existed in our study design and need to be considered when interpreting our results. First, although we also found some studies from Mexico [47,58], Japan [59], and Brazil [60] that assessed the relationship between the TyG index and NAFLD, these studies were excluded due to the lack of necessary diagnostic data. All studies included in our meta-analysis were from Asia, which may have biased the results. Meanwhile, most of the studies included in this meta-analysis were retrospective observational studies, and their sample sizes varied widely. In addition, significant heterogeneity was found, and a threshold effect existed in our meta-analysis.

Conclusions
All currently available evidence from our meta-analysis suggests that the TyG index can diagnose and predict MAFLD with good accuracy. However, the number of studies remains limited, and prospective studies are needed to establish a specific cut-off value for diagnosis.