However, NAFLD and NASH are complex multifactorial diseases, and therefore no single surrogate marker is likely to be sufficient to predict clinical outcome or the benefits of a therapy. Although all biomarkers and scores have their limitations, interest in using these markers to obtain information about disease progression and outcome is increasing rapidly. Therefore, the surrogate biomarkers and scores offered on the market should be used with great care and limited to situations in which they have demonstrated robust ability in disease management. In addition, there is an urgent need to improve standardization in the use of these markers. On the other hand, surrogate markers can obviously be extremely helpful when handled correctly. This was very recently demonstrated in a study using a telemedicine-based comprehensive continuous care intervention (CCI) together with carbohydrate restriction-induced ketosis and behavior changes. The study showed that a NAFLD liver fat score (i.e., N-LFS) was reduced in the CCI group, whereas it was unchanged in a group of patients receiving usual care [42]. This example demonstrates that surrogate markers can provide a good measure of the efficacy of a specific therapy.
Here we summarize the blood and serum biomarkers that are already available and discuss their benefits and shortcomings in the diagnosis and management of NASH and NAFLD.
2.1. Steatosis
Hepatic steatosis is the key feature of NAFLD. Steatosis is diagnosed when more than 5% of hepatocytes contain fat or when the total amount of intrahepatic triglycerides exceeds 5.5%, in the absence of any other liver disease in the patient’s history [23,24]. To date, no specific serum marker is available to assess hepatic steatosis. However, several reproducible blood biomarker panels and scores have been developed to help diagnose NAFLD (Table 1).
Most of these multiparametric panels include biochemical markers indicating liver damage or dysfunction (AST, ALT, bilirubin, γ-GT, platelet count, haptoglobin), lipid metabolism disorders (cholesterol, triglycerides), diabetes (HbA1c, fasting insulin level) or inflammation (α2M, ferritin), or markers that provide information about matrix expression and turnover (TIMP-1, PIIINP, HA) (Figure 3).
With an AUROC (area under the receiver operating characteristic curve) of 0.87, the NAFLD ridge score is currently one of the most accurate panels based on laboratory parameters. The NAFLD ridge score was developed with a machine learning algorithm to facilitate registry research. It includes serum levels of alanine aminotransferase (ALT), high-density lipoprotein (HDL) cholesterol, triglycerides, hemoglobin A1c (HbA1c), leukocyte count, and the presence of hypertension [43,44]. With proton magnetic resonance spectroscopy (H-MRS) as reference, the NAFLD ridge score has a negative predictive value of 96%. However, although this score detects NAFLD well, it is so far limited to the research setting and can neither distinguish between different steatosis grades nor assess changes in steatosis over time.
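To make the AUROC values quoted throughout this section easier to interpret, the following minimal sketch computes an AUROC as the probability that a randomly chosen patient with the condition receives a higher score than a randomly chosen patient without it (ties count as one half). The function name and the example scores are hypothetical and are not taken from any of the cited studies.

```python
def auroc(scores_diseased, scores_healthy):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Returns the fraction of diseased/healthy pairs in which the diseased
    patient has the higher score; tied pairs contribute 0.5.
    """
    if not scores_diseased or not scores_healthy:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))


# Hypothetical score values for three patients with and three without NAFLD.
example_auroc = auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

An AUROC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that separates the two groups perfectly.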
A quantitative and thereby more sensitive score is the NAFLD Liver Fat Score (NLFS). This score was developed against liver fat content as determined by H-MRS and includes the presence or absence of the metabolic syndrome and type 2 diabetes mellitus, aspartate aminotransferase (AST) levels, the AST:ALT ratio, and the fasting insulin serum level. With a sensitivity of 86% and a specificity of 71%, the NLFS identifies a liver fat content of more than 5.56% [45]. A recent study by Ruiz-Tovar and colleagues tested the accuracy of the NLFS in patients one year after bariatric surgery and considered it to be the most accurate biochemical score currently available for assessing liver steatosis [61]. The Hepatic Steatosis Index (HSI) considers the AST/ALT ratio, BMI, diabetes and sex and has a sensitivity of 66% and a specificity of 69% [46].
The fatty liver index (FLI) includes BMI, waist circumference and serum levels of triglycerides and γ-glutamyltransferase (γ-GT). The FLI has been shown to correlate significantly with insulin resistance [47,48]. The major drawback of the HSI and the FLI is that ultrasonography was used as the reference standard to diagnose fatty liver. This technique is generally operator-dependent, and thereby to some extent biased, and it is insensitive if only mild steatosis is present. The lipid accumulation product index (LAP), first established by Bedogni et al., takes into account sex, serum triglyceride levels and waist circumference to evaluate lipid overaccumulation [49].
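As an illustration of how such indices are computed from routine measurements, the sketch below implements the FLI and the LAP. The text above does not give the formulas; the coefficients and sex-specific offsets used here are the commonly cited published ones as recalled and should be treated as assumptions to be verified against the original publications before any use. The function and parameter names are hypothetical.

```python
import math


def fatty_liver_index(triglycerides_mg_dl, bmi, ggt_u_l, waist_cm):
    """FLI on a 0-100 scale (assumed coefficients, see lead-in).

    A logistic transformation of a weighted sum of ln(triglycerides),
    BMI, ln(gamma-GT) and waist circumference.
    """
    linear = (
        0.953 * math.log(triglycerides_mg_dl)
        + 0.139 * bmi
        + 0.718 * math.log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    return 100.0 / (1.0 + math.exp(-linear))


def lipid_accumulation_product(waist_cm, triglycerides_mmol_l, male):
    """LAP: (waist - sex-specific offset) x triglycerides (assumed offsets)."""
    offset_cm = 65.0 if male else 58.0
    return (waist_cm - offset_cm) * triglycerides_mmol_l


# Hypothetical patient: TG 150 mg/dL, BMI 30 kg/m2, gamma-GT 40 U/L, waist 100 cm.
example_fli = fatty_liver_index(150.0, 30.0, 40.0, 100.0)
example_lap = lipid_accumulation_product(100.0, 2.0, male=True)
```

Because both indices rely on anthropometric and lipid parameters only, they inherit the limitation discussed above: they indicate the presence of steatosis but do not grade it.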
A comparison of the accuracy of these scores in a cross-sectional NAFLD cohort showed that the NLFS is the score that most reliably predicts NAFLD, with an AUC of 0.771 [62]. Although the presented scores are capable of indicating the presence of hepatic steatosis, they have important limitations: with these indices it is not possible to distinguish between different steatosis grades or to detect and trace changes over time.
2.2. Steatohepatitis
The transition from simple hepatic steatosis to NASH is the most crucial step in the development of severe liver disease, as it is associated with a poor prognosis and a higher risk of developing fibrosis and progressing to end-stage liver disease. Thus, the assessment of NASH and the ability to follow the dynamic changes from NAFLD to NASH are ongoing challenges. Precise diagnosis still depends on liver biopsy, which is subject to considerable variability between pathologists. For that reason, Bedossa et al. developed the Fatty Liver Inhibition of Progression (FLIP) algorithm, which requires pathologists to follow generalized criteria for scoring. The FLIP algorithm considers the histological steatosis, disease activity and fibrosis scores [50]. Very recently, Canbay and colleagues established a novel machine learning approach to assess the severity of NAFLD and to distinguish between NAFLD and NASH. In this study, NAFLD was defined as a NAFLD activity score (NAS) ≤ 4 and NASH as NAS ≥ 4. Using an ensemble feature selection (EFS) approach, they identified age, HbA1c, γ-GT, adiponectin and the apoptosis marker M30 as the biomarkers highly associated with the prediction of NAFLD. The resulting CHeK score, which is available at http://CHek.heiderlab.de, is not only able to detect NASH but can also monitor the development from NAFLD to NASH and can be used to screen patients in long-term follow-up during disease progression or therapy [51].
Besides histological changes, the development from NAFLD to NASH involves a variety of molecular, cellular and hormonal changes. Numerous blood biomarkers and panels have been investigated and developed in an attempt to detect and reflect disease severity and the underlying pathways. The apoptosis marker cytokeratin 18 (CK18) is one of the most thoroughly studied individual blood biomarkers so far. Compared to NAFLD patients, NASH patients show a significant increase in plasma CK18, indicating hepatocyte death through apoptosis and necroptosis [63]. CK18 is the main intermediate filament protein in hepatocytes and is released upon the initiation of cell death [64]. While full-length CK18 is predominantly released upon hepatic necrosis, caspase-cleaved CK18 (M30) is mainly produced by apoptotic cells [65]. Although CK18 is considered to be one of the most promising biomarkers, several studies showed that its sensitivity to predict NASH is only 66%, while the specificity is 82% [66,67]. In addition, the ability of M30 to predict NASH and to distinguish between NAFLD and NASH was calculated as 0.82 [68]. To increase the reliability of CK18 as a noninvasive biomarker for NASH, one study showed that combining it with serum levels of the apoptosis-mediating surface antigen FAS (sFAS) further increases the accuracy [69]. However, the optimal cut-off serum concentrations still vary between studies and require further investigation.
NASH is predominantly characterized by pathological alterations in glucose and lipid metabolism. These alterations include changes in adipokines (such as leptin, adiponectin and resistin) and in liver-derived lipid hormones such as fibroblast growth factor 21 (FGF21), which is secreted upon peroxisome proliferator-activated receptor-α (PPARα) activation [70,71]. FGF21 was found to be significantly elevated in patients with mild to moderate hepatic steatosis. Serum levels were directly linked to increased intrahepatic triglyceride accumulation and liver damage [72,73]. However, FGF21 is also known to increase in sepsis and systemic inflammation [74]. Furthermore, adipokines were shown to also reflect visceral adiposity, leading to a moderate sensitivity of 62% with a specificity of 78% [68]. Other studies even show a drop in FGF21 levels with increasing liver inflammation [75].
The most evident difference between simple steatosis and advanced steatohepatitis is the presence of an inflammatory infiltrate in the latter. As a hallmark of NASH, a variety of inflammatory markers are elevated in patients while the disease is progressing. Increased serum levels of C-reactive protein (CRP), tumor necrosis factor-α (TNF-α) and several interleukins, such as IL-6 and IL-8, have been proposed as clinical markers. Although they all correlate with the inflammatory status observed in NASH, none of them has yet reached statistical significance after false discovery rate (FDR) adjustment on univariable analysis and been approved as a diagnostic marker, because they are insensitive to NASH-specific inflammatory changes.
Recently, the transcription factor forkhead box protein A1 (FOXA1), also known as hepatocyte nuclear factor 3-α, was described as a potential new biomarker, as it is involved in mediating homeostasis and metabolism by targeting genes in the liver, adipose tissue and pancreas [76]. Moya et al. showed that FOXA1 has anti-steatotic effects by lowering fatty acid uptake and is suppressed in patients with NAFLD and insulin resistance [77]. Therefore, the authors proposed this protein as a sensitive noninvasive biomarker of liver fat accumulation, mitochondrial membrane potential and the production of reactive oxygen species (ROS). The limitation of using a transcription factor as a biomarker is that FOXA1 is not secreted into the serum.
Oxidative stress, which is indicated by excessive ROS production, is one of the most important mechanisms underlying the pathogenesis of NASH, ultimately leading to lipid oxidation and inflammation [78]. Based on changes in lipid catabolism and de novo lipogenesis, the oxNASH score was developed, which includes the linoleic acid:13-hydroxyoctadecadienoic acid (13-HODE) ratio together with the patient characteristics age, BMI and AST level. This score reached a diagnostic accuracy with an AUROC of 0.74–0.83 [79]. Because mass spectrometry is needed to measure the described parameters, the oxNASH score is not commonly used today. In line with biomarkers targeting products that are secreted due to an altered lipid metabolism, insulin-like growth factor binding protein 1 (IGFBP-1) was recently suggested as a potential serum marker for NAFLD and NAFLD-related fibrosis. It is exclusively upregulated in the liver in response to hepatic inflammation and oxidative stress and is regulated by insulin [80]. On this basis, Regué et al. showed that the global deletion of the insulin-like growth factor 2 mRNA-binding protein 2 (IGF2BP2 or IGF2 mRNA-binding protein 2, IMP-2) led to resistance to obesity and fatty liver in mice fed a high-fat diet (HFD) due to reduced adiposity [81]. A limitation of these markers is that elevations might not be exclusively related to NAFLD-induced conditions but also to the metabolic syndrome and insulin resistance in general. Nevertheless, this is an interesting starting point for future investigations, also with regard to therapeutic interventions and the understanding of the mechanisms that lead to steatosis.
The expression of ferritin is generally known to be increased in patients with NAFLD and metabolic syndrome. It was further shown to be independently associated with increased steatosis grades, NASH and NASH fibrosis with an AUROC of 0.62 [82,83]. This accuracy can be increased to an AUROC of 0.81 when AST, BMI, type 2 diabetes, the presence or absence of hypertension and platelet count are added to ferritin levels [84]. The broad and long-lasting search for novel biomarkers to diagnose NASH, which has so far yielded only modestly accurate markers, shows the multiple factors involved in NAFLD and the complexity of the disease mechanisms. To date, combining several biomarkers drastically increases diagnostic precision. Especially for NASH, panels like the NashTest (NT) include baseline patient characteristics such as age, gender, height and weight as well as serum levels of triglycerides, cholesterol, transaminases, total bilirubin, α2-macroglobulin, haptoglobin, apolipoprotein A1 and γ-GT [85].
Overall, most of the current biomarkers and panels need further validation in patient cohorts that include several different ethnicities and various starting points and outcomes. Up to now, most validation studies have worked with patients undergoing bariatric surgery. In addition, the choice of the best cut-off value for the specific serum markers is still not optimal. This points to the urgent need for basic research studies that help to better understand the underlying mechanisms and key molecules involved in the development of NAFLD and the progression to NASH and end-stage liver disease.
2.3. Fibrosis
Studies show that the F2 stage of fibrosis is one of the most critical points in the progression from NASH and NASH fibrosis to end-stage liver disease, making it a crucial step for therapeutic intervention [86,87]. The risk of liver-specific mortality at fibrosis stages F3 and F4 has been shown to increase by 50–80%. Thus, diagnosing and monitoring patients with noninvasive strategies is a major focus of current research. Effective clinical NASH treatment is achieved when fibrosis progression is prevented and/or fibrosis is improved.
Most biomarkers do not measure fibrogenesis or fibrolysis directly. Thus, these indirect surrogate markers show low accuracy, which makes biomarker panels necessary to improve reliability in discriminating between different fibrosis stages. The most common scores that combine several clinical parameters are the NAFLD Fibrosis Score (NFS), the Fibrosis-4 Score (FIB-4), the AST to Platelet Ratio Index (APRI) and the BARD Score, which includes BMI, AST:ALT ratio and diabetes.
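Of the scores listed above, the APRI is the simplest to compute. The following minimal sketch uses the commonly cited definition (AST relative to its upper limit of normal, divided by the platelet count and multiplied by 100), which is not spelled out in the text above. The default upper limit of normal of 40 U/L is an assumption; laboratories use different reference ranges. The function name is hypothetical.

```python
def apri(ast_u_l, platelets_10e9_l, ast_upper_limit_normal_u_l=40.0):
    """AST to Platelet Ratio Index.

    ast_u_l: serum AST in U/L
    platelets_10e9_l: platelet count in 10^9 per litre
    ast_upper_limit_normal_u_l: laboratory-specific upper limit of normal
        for AST (40 U/L is an assumed default, not taken from the text)
    """
    if platelets_10e9_l <= 0 or ast_upper_limit_normal_u_l <= 0:
        raise ValueError("platelet count and upper limit of normal must be positive")
    return (ast_u_l / ast_upper_limit_normal_u_l) / platelets_10e9_l * 100.0


# Hypothetical patient: AST 80 U/L (twice the assumed upper limit), platelets 100 x 10^9/L.
example_apri = apri(80.0, 100.0)
```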
The NFS includes several routinely measured parameters and is well studied with regard to its accuracy [45]. The score can be calculated online free of charge at http://www.nafldscore.com/. Taking into account the AST:ALT ratio, albumin, platelet count, age, BMI and hyperglycemia, the NFS has a high predictive value, thereby avoiding the need for liver biopsy in many patients [45]. Nevertheless, two different cutoff levels are described, one to exclude and one to diagnose advanced fibrosis. As a consequence, patients whose scores fall between the two cutoff levels are not classified properly.
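The problem of the two cutoff levels can be illustrated with a short sketch of the NFS calculation. Neither the regression coefficients nor the numerical cutoffs are given in the text above; the values used here (including the cutoffs of -1.455 and 0.676) are the commonly cited published ones as recalled and are assumptions that should be checked against the original publication. Function names and the example patient are hypothetical.

```python
def nafld_fibrosis_score(age_years, bmi, hyperglycemia, ast_u_l, alt_u_l,
                         platelets_10e9_l, albumin_g_dl):
    """NFS from routine parameters (assumed coefficients, see lead-in).

    hyperglycemia: True if impaired fasting glucose or diabetes is present.
    """
    return (
        -1.675
        + 0.037 * age_years
        + 0.094 * bmi
        + 1.13 * (1.0 if hyperglycemia else 0.0)
        + 0.99 * (ast_u_l / alt_u_l)
        - 0.013 * platelets_10e9_l
        - 0.66 * albumin_g_dl
    )


def classify_nfs(score, lower_cutoff=-1.455, upper_cutoff=0.676):
    """Apply the two cutoffs; scores in between remain unclassified."""
    if score < lower_cutoff:
        return "advanced fibrosis unlikely"
    if score > upper_cutoff:
        return "advanced fibrosis likely"
    return "indeterminate"


# Hypothetical patient who ends up between the two cutoffs.
example_nfs = nafld_fibrosis_score(50, 30.0, True, 40.0, 50.0, 200.0, 4.0)
example_class = classify_nfs(example_nfs)
```

In this hypothetical case the score lies between the two cutoffs, so the patient falls into the indeterminate group described above and would need further assessment.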
The FIB-4 index, described in 2010 by McPherson et al., has an AUROC of 0.86 for advanced fibrosis and relies on AST, ALT, platelet count and age [53]. With a high negative predictive value of more than 90% and a positive predictive value of 82%, the FIB-4 index is one of the reliable fibrosis scores that can avoid liver biopsy for diagnosis. For the FIB-4 index, too, there are two different cutoff levels, i.e., a score <1.45 for moderate and >3.25 for advanced fibrosis [88]. Both the NFS and the FIB-4 score have been shown to be capable of predicting decompensation in patients with NAFLD and NASH [89,90].
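A minimal sketch of the FIB-4 calculation is given below. The formula itself is not stated in the text above; the commonly cited definition (age times AST, divided by platelet count times the square root of ALT) is assumed here. The two cutoffs are those quoted above (1.45 and 3.25), and the neutral labels returned by the hypothetical helper function are chosen for illustration only.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index: (age x AST) / (platelets x sqrt(ALT)) (assumed formula)."""
    if alt_u_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def classify_fib4(score, lower_cutoff=1.45, upper_cutoff=3.25):
    """Apply the two cutoffs quoted in the text; labels are illustrative."""
    if score < lower_cutoff:
        return "below lower cutoff"
    if score > upper_cutoff:
        return "above upper cutoff"
    return "between cutoffs"


# Hypothetical patients.
high_score = fib4(60, 80.0, 64.0, 150.0)
low_score = fib4(40, 25.0, 25.0, 250.0)
```

Because age enters the numerator directly, the same laboratory values yield a higher index in older patients, which is one reason why fixed cutoffs need validation in different cohorts.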
The BARD score, which includes the presence of type 2 diabetes, BMI and the AST:ALT ratio, has an AUROC of 0.81 for detecting F3 fibrosis. Developed by Harrison et al. in 2008, this score has a high negative predictive value of 96%, whereas the positive predictive value is modest [55].
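The BARD score is a simple point system, which the following sketch illustrates. The point weights and thresholds are not given in the text above; the commonly cited ones (BMI of at least 28 kg/m2: 1 point, AST:ALT ratio of at least 0.8: 2 points, diabetes: 1 point) are assumed here and should be verified against the original publication. The function name is hypothetical.

```python
def bard_score(bmi, ast_u_l, alt_u_l, diabetes):
    """BARD point score from 0 to 4 (assumed weights, see lead-in)."""
    if alt_u_l <= 0:
        raise ValueError("ALT must be positive")
    points = 0
    if bmi >= 28.0:
        points += 1  # BMI component
    if ast_u_l / alt_u_l >= 0.8:
        points += 2  # AST:ALT ratio component
    if diabetes:
        points += 1  # diabetes component
    return points


# Hypothetical patients at both ends of the scale.
max_points = bard_score(30.0, 40.0, 40.0, diabetes=True)
min_points = bard_score(25.0, 20.0, 40.0, diabetes=False)
```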
Very recently, MACK-3 was proposed as a marker for fibrotic NASH. MACK-3 includes the homeostasis model assessment of insulin resistance (HOMA-IR), AST and the CK18 serum level. With an AUROC of 0.80 and a negative predictive value of 100% for fibrotic NASH and 74% for active NASH, MACK-3 seems to be a promising score for future investigation and validation [91].
Taken together, the currently available scores still have only moderate sensitivity, and further investigation of noninvasive markers is urgently needed. Nevertheless, all scores have comparably high negative predictive values and use common parameters measured during routine blood work, so they are easy to calculate and are definitely useful for screening patients who are at risk of developing NAFLD-related fibrosis and end-stage liver disease.
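Why a score with only moderate sensitivity and specificity can still have a high negative predictive value becomes clearer when the predictive values are derived with Bayes' rule. The sketch below does this for purely illustrative input values (sensitivity 80%, specificity 80%, prevalence of advanced fibrosis 10%); these numbers are assumptions and are not taken from any of the cited studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population.

    All three inputs are fractions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative values only: 80% sensitivity, 80% specificity, 10% prevalence.
example_ppv, example_npv = predictive_values(0.80, 0.80, 0.10)
```

With these assumed inputs the negative predictive value is about 97%, whereas the positive predictive value is only about 31%. This is the pattern reported for several of the scores above and the reason why they are better suited to ruling out advanced fibrosis than to confirming it.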
The measurement of specific fibrosis biomarkers in serum, such as hyaluronic acid [92], procollagen III amino-terminal peptide (PIIINP), type IV collagen [93], TIMP-1 (tissue inhibitor of metalloproteinase 1) [94] or laminin [95], has not reached clinical routine, although these markers correlate with NASH and fibrosis with AUROCs ranging from 0.87 (for hyaluronic acid) to 0.97 (for TIMP-1) [96]. The most likely reason is that their measurement is cost-intensive and technically complex.
Further developments in the field combine different serum parameters in complex algorithms. The Enhanced Liver Fibrosis panel (ELF) [56], FibroTest/FibroSURE/ActiTest [58], the FibroMeter NAFLD index [59,60], the Hepascore [57] and many others show very promising results in diagnosing and distinguishing patients with F0-F2 fibrosis from those with F3-F4 fibrosis. These algorithms still have to be validated in the clinic and need to be further developed and simplified to make them widely applicable.
For the validation of new diagnostic test methods, the STARD checklist (Standards for Reporting of Diagnostic Accuracy Studies) was established and published by 13 journals in 2003, and in 2015 it was modified to also meet the criteria needed for the evaluation of liver fibrosis. This Liver-FibroSTARD checklist should help to reach consensus on the requirements for new noninvasive fibrosis markers [97,98].
However, the prediction of NASH severity by a noninvasive fibrosis marker, a score, a diagnostic test, or an algorithm incorporating a panel of biomarkers does not necessarily allow a comprehensive statement about disease outcome. Confounding factors, comorbidities, or simple blood parameters can significantly impact the progression or overall outcome of NASH. This was recently documented in a cross-sectional study in which 100 obese patients suffering from hepatic steatosis were analyzed for the occurrence of atherosclerosis [99]. Interestingly, the authors found that lowered copper bioavailability is linked to atherosclerosis, which is the main complication of NAFLD. In line with this, reduced hepatic copper concentrations were found in patients with NAFLD and were associated with higher degrees of hepatic steatosis in rats fed a diet low in copper [100].