Plasma Metabolite Signatures in Male Carriers of Genetic Variants Associated with Non-Alcoholic Fatty Liver Disease

Both genetic and non-genetic factors are important in the pathophysiology of non-alcoholic fatty liver disease (NAFLD). The aim of our study was to identify novel metabolites and pathways associated with NAFLD by including both genetic and non-genetic factors in statistical analyses. We genotyped six genetic variants in the PNPLA3, TM6SF2, MBOAT7, GCKR, PPP1R3B, and HSD17B13 genes reported to be associated with NAFLD. Non-targeted metabolomic profiling was performed from plasma samples. We applied a previously validated fatty liver index to identify participants with NAFLD. First, we associated the six genetic variants with 1098 metabolites in 2 339 men without NAFLD to determine the effects of the genetic variants on metabolites, and then in 2 535 men with NAFLD to determine the joint effects of genetic variants and non-genetic factors on metabolites. We identified several novel metabolites and metabolic pathways, especially for PNPLA3, GCKR, and PPP1R38 variants relevant to the pathophysiology of NAFLD. Importantly, we showed that each genetic variant for NAFLD had a specific metabolite signature. The plasma metabolite signature was unique for each genetic variant, suggesting that several metabolites and different pathways are involved in the risk of NAFLD. The FLI index reliably identifies metabolites for NAFLD in large population-based studies.


Introduction
Non-alcoholic fatty liver disease (NAFLD) is characterized by lipid deposition (>5% of liver weight) in the liver not related to alcohol consumption. NAFLD is the most common manifestation of chronic liver disease in Western countries and is predicted to become the most common cause of liver transplantation by 2030 [1]. Excess accumulation of triacylglycerols (TAGs) in the liver can proceed to non-alcoholic steatohepatitis (NASH) and even to hepatocellular carcinoma. NAFLD also affects extra-hepatic organs and metabolic pathways [2] and increases the risk of type 2 diabetes, cardiovascular disease, and chronic kidney disease [3].
Both genetic and non-genetic factors are important in the pathophysiology of NAFLD [4]. Several genetic variants have been reported to be associated with NAFLD in genomewide association studies. A nonsynonymous rs738409-G variant in PNPLA3 (patatin-like phospholipase domain-containing 3) encodes an amino acid substitution of I148M that is recognized as the most important genetic variant for the risk of NAFLD and NASH [5][6][7]. PNPLA3 has TAG hydrolysis activity, and the accumulation of inactive PNPLA3 leads to an increase in TAGs in the liver [6,8].
TM6SF2 (transmembrane 6 superfamily member 2) encodes the TM6SF2 protein and is involved in the assembly and lipidation of very-low-density lipoprotein (VLDL) particles [9,10]. MBOAT7 (membrane-bound O-acyltransferase domain-containing 7 gene) encodes for the MBOAT7 protein which is an enzyme involved in the remodeling of The participants were recruited from the METSIM study, comprising 10,197 Finnish men randomly selected from the population register of Kuopio town, eastern Finland, aged from 45 to 73 years, and examined in 2005-2010. The study design has been described previously in detail [21,22]. The study was approved by the Ethics Committee of the University of Kuopio and Kuopio University Hospital. All study participants gave written informed consent. The study protocol follows the ethical guidelines of the Declaration of Helsinki, as reflected in a priori approval by the institution's human research committee.

Clinical and Laboratory Measurements
Height was measured without shoes to the nearest 0.5 cm. Weight was measured in light clothing with a calibrated digital scale (Seca 877, Hamburg, Germany). BMI was calculated as weight (kg) divided by height (m) squared. We measured waist circumference at the midpoint between the lateral iliac crest and the lowest rib. Laboratory studies after 12 h fasting included the following measurements: plasma glucose and insulin, lipids, lipoproteins, and mass spectrometry metabolomics (Metabolon, Durham, NC, USA). An oral glucose tolerance test was performed (75 g of glucose) to evaluate glucose tolerance according to American Diabetes Association criteria [23]. Clinical and laboratory measurement methods have been previously described [22]. Briefly, plasma glucose was measured by enzymatic hexokinase photometric assay (Konelab Systems Reagents, Thermo Fischer Scientific, Vantaa, Finland). Insulin was determined by immunoassay (ADVIA Centaur Insulin IRI, no 02230141, Siemens Medical Solutions Diagnostics, Tarrytown, NY, USA). Total TAGs, plasma free fatty acids (FFAs), high-density lipoprotein cholesterol (HDLC), and lowdensity lipoprotein cholesterol (LDLC) were measured using enzymatic colorimetric tests (Konelab Systems Reagents; Thermo Fisher Scientific, Vantaa, Finland). Plasma adiponectin was measured using ELISA (human adiponectin ELISA kit; Linco Research, St. Charles, MI, USA), and serum alanine aminotransferase (ALT) and gamma-glutamyl transferase (GGT) by enzymatic photometric tests (Konelab Reagent System, Thermo Fisher Scientific, Vantaa, Finland). High-sensitivity C-reactive protein (hs-CRP) was assayed using kinetic immunoturbidimetry (near-infrared particle immunoassay; IMMAGE Immunochemistry System; Beckman Coulter, Fullerton, CA, USA). DNA was extracted from white blood cells.

Metabolomics Analysis
Non-targeted metabolomic profiling was performed at Metabolon, Inc. (Durham, NC, USA) on EDTA plasma samples obtained after an overnight fast. Briefly, methanol extraction of biochemicals was followed by non-targeted relative quantitative liquid chromatography-tandem mass spectrometry (LC-MS/MS). The Metabolon DiscoveryHD4 platform was applied to assay named metabolites. Samples were randomized across batches. Batches contained~144 METSIM samples and 20 well-characterized human-EDTA plasma samples for quality control. All samples were processed together for peak quantification and data scaling. We quantified raw mass spectrometry peaks for each metabolite using the area under the curve and evaluated overall process variability using the medianrelative standard deviation for endogenous metabolites present in all 20 technical replicates in each batch. We adjusted for variation caused by day-to-day instrument tuning differences and columns used for biochemical extraction by scaling the raw peak quantifications to the median for each metabolite by the Metabolon batch. We included in the statistical analysis 1098 metabolites, having data for at least 1000 men.

Genetic Analysis
We genotyped the genetic variants PNPLA3 rs738409-G, TM6SF2 rs58542926-T, MBOAT7 rs641738-T, GCKR rs780094-T, PPP1R3B rs4841132-A, and HSD17B13 rs72613567:TA using specific TaqMan assays (ThermoFisher) in a 7500 Real-Time PCR System (Applied Biosystems) or the Sequenom iPlex Gold SBE assay at the National Human Genome Research Institute at the National Institutes of Health as previously described [25].

Statistical Analysis
All statistical analyses were performed using IBM SPSS Statistics 27. All variables were log-transformed to correct for their skewed distribution. p < 4.5 × 10 −5 was considered statistically significant given the 1098 metabolites measured and 6 genetic variants included in statistical analyses. We used ANOVA to calculate statistical differences in clinical and laboratory parameters between the men without (FLI < 30) and with NAFLD (FLI > 80). We applied linear regression analysis between the risk alleles of PNPLA3 rs738409-G, TM6SF2 rs58542926-T, MBOAT7 rs641738-T, GCKR rs780094-T, PPP1R3B rs4841132-A, and HSD17B13 rs72613567:TA and the metabolites (N = 1098) in men without and with NAFLD. The results are given as standardized beta coefficients, SE, and p values with each metabolite as a dependent variable. We used the Z-test to compare the statistical significance between two betas from linear regression [26].

Clinical and Laboratory Measurements
Except for age, all other laboratory and clinical parameters showed highly statistically significant differences between men without NAFLD (FLI 18.9 ± 7.0) and with NAFLD (FLI 90.5 ± 5.8) ( Table 1). Men with NAFLD were more obese and more centrally obese, had higher concentrations of ALT, LDLC, TAG, glucose, insulin, hs-CRP, and FFAs, and more often had type 2 diabetes than men without NAFLD. We associated six genetic variants associated with NAFLD with clinical and laboratory measurements (Supplementary Table S1). Only a few statistically significant associations were found in men without NAFLD. The PNPLA3 variant was significantly associated with decreased fasting insulin, the GCKR variant with increased TAGs, and the PPP1R3B variant with decreased HDLC and fasting FFAs. In men with NAFLD, the PNPLA3 variant was associated with increased ALT, the TM6SF2 variant with decreased TAGs, the GCKR variant with increased TAGs, the PPP1R3B variant with decreased HDLC and decreased FFAs, and the HSD17B13 variant with decreased ALT and TAGs. None of these genetic variants were associated with insulin Matsuda ISI, obesity, or glucose or insulin concentrations.
MBOAT7 rs641738-T. In men without NAFLD, we found positive associations of the MBOAT7 variant with seven glycerophospholipids (four phosphatidylinositols (PIs) and three lyso-phosphatidylinositols (lyso-PIs)) and three inverse associations with two PIs and one lyso-PI, all containing arachidonic acid in their acyl chain ( Table 2). In men with NAFLD, we found statistically significant associations with the same metabolites as in men without NAFLD (Supplementary Table S2). Beta coefficients were not significantly different between men with and without NAFLD, which suggests that we did not find any specific metabolites associated with NAFLD.

Discussion
Our study is the first large population-based study investigating the association of six genetic variants increasing (PNPLA3, TM6SF2, GCKR, MBOAT7) and two genetic variants decreasing (PPP1R3B, HDS17B13) the risk of NAFLD. We identified several novel variantmetabolite associations in different metabolic pathways relevant to the pathophysiology of NAFLD. Importantly, we showed that each genetic variant for NAFLD has a specific metabolite signature.
NAFLD is explained by both genetic and non-genetic factors. Among non-genetic factors, obesity, insulin resistance, and type 2 diabetes play important roles in the risk of NAFLD [19]. None of the six genetic variants for NAFLD were associated with obesity, hyperglycemia, or insulin resistance in our study. This implies that obesity, insulin resistance, and hyperglycemia are regulated by metabolic changes independently of genetic factors.
PNPLA3 rs738409-G. The PNPLA3 variant is the most important genetic variant associated with NAFLD. A recent study has shown that the steatosis associated with the PNPLA3 variant is caused by the accumulation of PNPLA3 protein on lipid droplets [27]. In men with NAFLD, we found 12 novel associations of the PNPLA3 variant with metabolites belonging to different metabolic pathways. We also found decreased levels of N-acetylmethionine (NAM) in carriers of the PNPLA3 variant having NAFLD, similarly as has been published for patients with NASH [28].
The most significant positive association of the PNPLA3 variant was with 3-ureidopropi onate (3-UPA), a metabolite in pyrimidine degradation (Figure 2A). 3-UPA has not previously been linked to NAFLD in human studies, but, in mice, liver pyrimidine degradation was associated with lipid accumulation in the liver [29]. The β-ureidopropionase enzyme catalyzes the conversion of 3-UPA to β-alanine [30][31][32]. High concentrations of 3-UPA increase reactive oxygen species (ROS) production and inhibit mitochondrial energy metabolism in complex V (ATP synthase) [33]. Decreased activity of mitochondrial respiratory complexes I, III, IV and V has been previously reported in NASH [34,35]. Our novel findings were also that the PNPLA3 variant had significant positive associations with downstream metabolites of spermidine catabolism, N-acetylspermidine, and N-acetylisoputreanine in men with NAFLD ( Figure 2B). We also found in men with NAFLD increased levels of 12,13-diHOME ( Figure 2C), tricarboxylic acid aconitinate, a metabolite involved in the mitochondrial tricarboxylic acid (TCA) cycle, and carboximidic acid N1/N8-acetylspermidine ( Figure 2D), as well as elevated concentrations of two bile acids, taurochenodeoxycholate and glycohenodeoxycholate. A previous study has reported a significant increase in the fasting concentration of taurochenodeoxycholate in patients with NASH [36,37]. of mitochondrial respiratory complexes I, III, IV and V has been previously reported in NASH [34,35]. Our novel findings were also that the PNPLA3 variant had significant positive associations with downstream metabolites of spermidine catabolism, Nacetylspermidine, and N-acetylisoputreanine in men with NAFLD ( Figure 2B). We also found in men with NAFLD increased levels of 12,13-diHOME ( Figure 2C), tricarboxylic acid aconitinate, a metabolite involved in the mitochondrial tricarboxylic acid (TCA) cycle, and carboximidic acid N1/N8-acetylspermidine ( Figure 2D), as well as elevated concentrations of two bile acids, taurochenodeoxycholate and glycohenodeoxycholate. A previous study has reported a significant increase in the fasting concentration of taurochenodeoxycholate in patients with NASH [36,37].

Figure 2.
Novel metabolites associated with PNPLA3 rs738409-G in men with NAFLD. Red color indicates increased concentration of the metabolites. Grey color indicates that these metabolites were not significantly associated with PNPLA3 rs738409-G. A. metabolites belonging to the uracil pathway. B. downstream metabolites belonging to the spermidine pathway. C. downstream metabolites of the linoleic acid pathway. D. metabolites coming from glycolysis pathway. Abbreviations: EpOME, epoxyoctadecenoic acid; 12(13)DiHOME, 12,13-dihydroxy-9-octadecenoic acid.
TM6SF2 rs58542926-T. We found statistically significant novel inverse associations of the TM6SF2 variant with three diacylglycerols (DAGs) and two phosphatidylcholines (PCs) in men with NAFLD. These lipids have PUFAs in their acyl chain. Our findings agree with a previous study showing that the TM6SF2 variant contributes to steatosis by increasing the synthesis of PUFAs, whereas the synthesis of complex lipids containing TM6SF2 rs58542926-T. We found statistically significant novel inverse associations of the TM6SF2 variant with three diacylglycerols (DAGs) and two phosphatidylcholines (PCs) in men with NAFLD. These lipids have PUFAs in their acyl chain. Our findings agree with a previous study showing that the TM6SF2 variant contributes to steatosis by increasing the synthesis of PUFAs, whereas the synthesis of complex lipids containing PUFAs is impaired [38]. We also confirmed that the TM6SF2 variant was significantly and inversely associated with TAG concentrations [39]. Our results support the notion that the TM6SF2 variant is involved in the development of NAFLD by affecting the fatty acyl chain composition of PCs, impairing VLDL assembling and secretion, and consequently promoting lipid accumulation and inflammation in the liver [38].
MBOAT7 rs641738C>T. MBOAT7 participates in acyl chain remodeling of PIs [40]. We found novel statistically significant positive associations with four PIs and three lyso-PIs and novel inverse associations with two PIs and one lyso-PI in men both with and without NAFLD, suggesting that these associations were largely explained by the MBOAT7 variant. All of these lipids contained arachidonic acid in their acyl chain and were inversely associated with the MBOAT7 variant. Elevated concentrations of circulating proinflammatory metabolites of arachidonic acid and lyso-PIs have been reported in patients with NASH [41,42]. Moreover, lyso-PIs activate stellate cell compartments and trigger fibrosis in the liver [43].
GCKR rs780094-T. The GCKR variant was significantly and inversely associated with mannose in men with and without NAFLD, suggesting that this association is explained largely by genetic factors. Mannose is an indicator of hepatic glycogen breakdown, and it may reflect hepatic GCK and GCKR activity [44,45]. We found 28 novel significant associations of the GCKR variant with metabolites (mainly glycerolipids, glycerophospholipids, and amino acids) in men with NAFLD, and replicated previously reported associations with pyruvate, lactate, threonine, 3-aminoisobutyrate, glycerolipids, and glycerophospholipids [46]. Importantly, many of the glycerolipids and glycerophospholipids have palmitic acid in their side chain, supporting the role of the GCKR variant in the pathogenesis of NAFLD via generation of malonyl-CoA and an increase in de novo lipogenesis [47]. In addition, a recent study demonstrated that a high hepatic cytosolic NADH/NAD+ ratio stimulates fatty acid synthesis resulting in accumulation of TAGs in the liver; inhibits gluconeogenesis by preventing the oxidation of lactate to pyruvate [48]; and increases the generation of ROS due to the leak-out of electrons from mitochondrial complexes [49].
We found novel positive associations of the GCKR variant with branched-chain amino acids (leucine, isoleucine, and valine) and their downstream metabolites (3-methyl-2-oxovaleriate, carboxyethyl-leucine, carboxyethyl-isoleucine, and carboxyethyl-valine) ( Figure 3). Branched-chain amino acid concentrations in the liver have been reported to be increased in NASH compared to NAFLD [50]. We confirmed an inverse association of the GCKR variant with threonine as previously published [47]. Isoleucine, α-ketobutyrate, and G-glutamyl-threonine are downstream metabolites of threonine [51,52]. Citrulline, a metabolite originating from the urea cycle, is also involved in the G-glutamyl cycle [51], producing G-glutamyl citrulline. The GCKR variant was significantly and inversely associated with citrulline in men with NAFLD. We also found an inverse association of the GCKR variant with 4-guanidinobutanoate, a downstream metabolite of arginine. The inverse associations of the GCKR variant with G-glutamyl citrulline and 4-guanidinobutanoate might reflect poor regulation of the urea cycle in men with NAFLD. PPP1R3B rs4841132-A. The PPP1R3B variant has been associated with an increase in liver glycogen content [13] and a decreased risk of liver steatosis, fibrosis, and HCC [53]. PPP1R3B rs4841132-A. The PPP1R3B variant has been associated with an increase in liver glycogen content [13] and a decreased risk of liver steatosis, fibrosis, and HCC [53]. We report for the first time a detailed metabolic signature of the PPP1R3B variant ( Figure 4). We found a novel inverse association of this variant with hexanoylglutamine and a previously reported positive association with glycine [54] in men both with and without NAFLD, suggesting that the PPP1R3B variant is largely responsible for these associations. In our study, the PPP1R3B variant was associated with decreased concentrations of fasting FFAs. In men with NAFLD, we found novel inverse associations of the PPP1R3B variant with downstream metabolites of tryptophan, N-acetylkynurenine, and xanthurenate, suggesting a decrease in inflammatory reactions [55]. We also found novel positive associations with three steroids, but the clinical significance of this finding remains to be determined. In summary, we identified several novel metabolites for NAFLD, especially associated with the PNPLA3, GCKR, and PPP1R3B variants ( Figure 5). Importantly, each genetic variant had its own metabolite signature, suggesting that multiple metabolic pathways contribute to NAFLD. The PNPLA3 variant leads to accumulation of the PNPLA3 protein on the surface of lipid droplets, causing fat accumulation in the liver. Our novel finding was an increased concentration of 3-UPA in men with NAFLD, which inhibits ATP synthase in mitochondria resulting in the generation of ROS. The GCKR variant increases the NADH/NAD+ ratio, which leads to leaking of electrons from the mitochondria, increasing the generation of ROS, a decrease in gluconeogenesis, and an increase in FA synthesis triggering the generation of ROS and increased TAG levels in the liver. The PPP1R3B variant increases the levels of glycine, a substrate for glutathione (GSH), counteracting ROS generation. MBOAT7 variant increases concentrations of lyso-PI activating hepatic stellate cells triggering fibrosis in the liver. The TM6SF2 variant decreases the assembly of VLDL particles and their export from the liver, leading to low TAG concentrations in the bloodstream and accumulation of TAG in the liver.
The strength of our study is the large size of the METSIM study, allowing us to find novel metabolites associated with genetic variants for NAFLD. In metabolite analyses, we applied a very conservative p value threshold. The limitations of our study are that it was cross-sectional, and only middle-aged and elderly Finnish men were included in the HSD17B13. We found that the HSD17B13 variant was significantly associated with decreased ALT levels, in agreement with a previous study [16]. The HSD17B13 variant was also significantly associated with two glycerophospholipids in men with NAFLD. Our results agree with a previous study demonstrating that phospholipids were enriched in the liver in carriers of the HSD17B13 variant [56].
In summary, we identified several novel metabolites for NAFLD, especially associated with the PNPLA3, GCKR, and PPP1R3B variants ( Figure 5). Importantly, each genetic variant had its own metabolite signature, suggesting that multiple metabolic pathways contribute to NAFLD. The PNPLA3 variant leads to accumulation of the PNPLA3 protein on the surface of lipid droplets, causing fat accumulation in the liver. Our novel finding was an increased concentration of 3-UPA in men with NAFLD, which inhibits ATP synthase in mitochondria resulting in the generation of ROS. The GCKR variant increases the NADH/NAD+ ratio, which leads to leaking of electrons from the mitochondria, increasing the generation of ROS, a decrease in gluconeogenesis, and an increase in FA synthesis triggering the generation of ROS and increased TAG levels in the liver. The PPP1R3B variant increases the levels of glycine, a substrate for glutathione (GSH), counteracting ROS generation. MBOAT7 variant increases concentrations of lyso-PI activating hepatic stellate cells triggering fibrosis in the liver. The TM6SF2 variant decreases the assembly of VLDL particles and their export from the liver, leading to low TAG concentrations in the bloodstream and accumulation of TAG in the liver.
Metabolites 2023, 13,267 12 of 15 evidence that the FLI index is a reliable marker for NAFLD in large population-based studies. In conclusion, we identified several novel metabolites associated with genetic variants for NAFDL. The plasma metabolite signature was unique for each variant, suggesting that several different metabolites and pathways are involved in the pathophysiology of NAFLD. Our study also suggests that the FLI index reliably identifies metabolites for NAFDL in large population-based studies.
Supplementary Materials: The following supporting information can be downloaded at: www.mdpi.com/xxx/s1, Table S1: Associations of PNPLA3 rs738409-G, TM6SF2 rs58542926-T, MBOAT7 rs641738-T, GCKR rs780094-T, PPP1R3B rs4841132-A and HSD17B13 rs72613567:TA with laboratory parameters, Matsuda ISI and body mass index; Table S2: Associations of PNPLA3 rs738409-G, TM6SF2 rs58542926-T, GCKR rs780094-T, PPP1R3B rs4841132-A and HSD17B13 rs72613567:TA with metabolites in participants with NAFLD.  The strength of our study is the large size of the METSIM study, allowing us to find novel metabolites associated with genetic variants for NAFLD. In metabolite analyses, we applied a very conservative p value threshold. The limitations of our study are that it was cross-sectional, and only middle-aged and elderly Finnish men were included in the study. Therefore, we do not know if the results are valid for women, all age groups, or other ethnic and racial groups. We did not directly measure liver fat content in men to confirm a clinical diagnosis of NAFLD. Instead, we applied the FLI index that has been previously validated [20]. In our study, we used an even more stringent criterion, 80, for the upper part of the FLI distribution, compared to 60 in the original publication [20]. We also replicated several previously published metabolites for NAFLD, which provides evidence that the FLI index is a reliable marker for NAFLD in large population-based studies.
In conclusion, we identified several novel metabolites associated with genetic variants for NAFDL. The plasma metabolite signature was unique for each variant, suggesting that several different metabolites and pathways are involved in the pathophysiology of NAFLD. Our study also suggests that the FLI index reliably identifies metabolites for NAFDL in large population-based studies.
Author Contributions: L.F.S. conceived the study, performed statistical analyses, and wrote the manuscript. J.V. performed statistical analyses and revised the manuscript. A.O. and V.M. researched metabolomics data and revised the manuscript. M.L. conceived the study, wrote and reviewed the manuscript, and supervised the entire study and is the guarantor of the study. All authors have read and agreed to the published version of the manuscript. Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All datasets generated during the current study can be found within the manuscript. Other datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.