Molecular Screening via Sanger Sequencing of the Genetic Variants in Non-Alcoholic Fatty Liver Disease Subjects in the Saudi Population: A Hospital-Based Study

Non-alcoholic fatty liver disease (NAFLD) is one of the most common liver diseases, along with steatosis and non-alcoholic steatohepatitis (NASH), and is associated with cirrhosis and hepatocellular carcinoma. Candidate gene and genome-wide association studies have validated the relationships between NAFLD, NASH, PNPLA3, TM6SF2, and HFE. The present study examined five polymorphisms in three genes, PNPLA3 (I148M and K434E), TM6SF2 (E167K), and HFE (H63D and C282Y), because case–control studies of these variants have not been documented in the Saudi Arabian population. A total of 95 patients with NAFLD and 78 non-NAFLD subjects were recruited. Genomic DNA was isolated, and polymerase chain reaction and Sanger sequencing were performed using specific primers for the I148M, K434E, E167K, H63D, and C282Y polymorphisms. NAFLD subjects were significantly older than controls (p = 0.0001). No significant association was found with gender (p = 0.26). However, both weight and BMI were found to be associated with NAFLD. Hardy–Weinberg equilibrium analysis confirmed that H63D, I148M, and K434E polymorphisms were associated. Genotype analysis showed that only the K434E variant was associated when NAFLD and non-NAFLD subjects were compared (OR: 2.16; 95% CI: 1.08–4.31; p = 0.02). However, the other polymorphisms analyzed in NAFLD and NASH were not associated (p > 0.05), and similar results were found when ANOVA was performed (p > 0.05). In conclusion, we confirmed that the K434E polymorphism showed a positive association in the Saudi population.


Introduction
Non-alcoholic fatty liver disease (NAFLD) is a heterogeneous disorder with multiple metabolic and genetic factors implicated in its pathophysiology, all of which contribute to its progression and development of adverse effects [1]. NAFLD encompasses diseases ranging from simple steatosis to steatohepatitis and is a primary cause of chronic liver damage [2]. In 1986, Schaffner discovered steatosis (triglyceride buildup) within hepatocytes, which progresses to inflammation in non-alcoholic steatohepatitis (NASH). If untreated, it progresses to liver fibrosis, cirrhosis, and possibly hepatocellular carcinoma (HCC) [3,4].

Ethical Statement
The study protocol was approved by the Institutional Review Board of the College of Medicine at King Saud University (E-17-2654). Additionally, signed informed consent was obtained from 173 Saudi Arabian participants involved in this study. All methods were performed in accordance with the relevant guidelines and regulations (Declaration of Helsinki).

Study Design
In this case-control study, we enrolled 95 patients diagnosed with NAFLD and 78 patients without NAFLD; all subjects were recruited from the Division of Gastroenterology, King Saud University (KSU). Adult Saudi Arabian patients with obesity, T2DM, insulin resistance, and ultrasound results demonstrating enlarged/fatty liver were included in this study, while patients diagnosed with viral hepatitis, alcoholic hepatitis, drug-induced hepatitis, alpha-1 antitrypsin deficiency, or Wilson's disease were excluded. Patients without NAFLD and without any complications were also enrolled as controls, based on the inclusion and exclusion criteria applied to NAFLD subjects.

Anthropometric Measurements
Age, sex, height, and weight were recorded using standardized techniques. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters. BMI was categorized as normal (<24.9 kg/m²), overweight (25.0-29.9 kg/m²), obesity-I (30-34.9 kg/m²), obesity-II (35-39.9 kg/m²), and obesity-III (>40 kg/m²). Additionally, NASH status was documented in patients with NAFLD. NASH represents the presence of inflammation and liver damage in addition to fat.
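The BMI calculation and categorization described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, and because the printed class limits leave small gaps (for example, between 24.9 and 25.0), the sketch assumes half-open intervals with cut-offs at 25, 30, 35, and 40 kg/m².

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by the square of height (m)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value (kg/m^2) to the five classes named in the text.

    Assumption: classes are treated as half-open intervals [lower, upper)
    so that every value falls into exactly one class.
    """
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    if value < 35.0:
        return "obesity-I"
    if value < 40.0:
        return "obesity-II"
    return "obesity-III"


if __name__ == "__main__":
    # Hypothetical subject: 95 kg, 1.70 m
    value = bmi(95.0, 1.70)
    print(f"BMI = {value:.1f} kg/m^2 ({bmi_category(value)})")
```

A subject exactly on a boundary (for example, 30.0 kg/m²) is assigned to the higher class under this assumption; a different convention would only move such boundary cases.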

Histological Specimen
A total of 173 liver biopsy specimens were collected from patients with and without NAFLD and assessed using the histopathological NAFLD activity score (NAS). The specimens were fixed in formalin solution, embedded in paraffin blocks, and stained with hematoxylin-eosin and Masson's trichrome. Patients with NAFLD were classified using the NASH Clinical Research Network Classification based on the liver histology data.

Molecular Screening
Genomic DNA was extracted from the liver biopsy specimens using a Qiagen DNA mini kit, as described previously [28]. The concentration and purity of the extracted DNA were measured using a NanoDrop spectrophotometer. Genotyping was performed using polymerase chain reaction (PCR) in a total reaction volume of 50 µL containing 24.0 µL of ABI master mix, 5.0 µL of genomic DNA (100 ng), 1.0 µL each of the forward and reverse primers, and 19.0 µL of distilled water. Details for the rs738409, rs2294918, rs58542926, rs1799945, and rs1800562 SNPs are found in Table 1. The PCR conditions were as follows: initial denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation for 30 min, annealing at 50-60 °C for 45 s, extension at 72 °C for 45 s, and a final extension at 72 °C for 5 min. The amplified product was electrophoresed on a 2% agarose gel stained with ethidium bromide and visualized via UV transillumination.
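As a quick arithmetic check of the reaction set-up above, the following sketch sums the listed component volumes and scales them for a batch of reactions. The component volumes come from the text; the helper names and the idea of scaling by a number of reactions are assumptions added for illustration only.

```python
# Per-reaction volumes in microliters, as listed in the text.
REACTION_UL = {
    "ABI master mix": 24.0,
    "genomic DNA (100 ng)": 5.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "distilled water": 19.0,
}


def total_volume(recipe: dict) -> float:
    """Sum of all component volumes for one reaction."""
    return sum(recipe.values())


def scale_recipe(recipe: dict, n_reactions: int) -> dict:
    """Volumes needed for n reactions (hypothetical helper, no overage added)."""
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    return {name: volume * n_reactions for name, volume in recipe.items()}


if __name__ == "__main__":
    print(f"Single reaction: {total_volume(REACTION_UL):.1f} uL")
    for name, volume in scale_recipe(REACTION_UL, 8).items():
        print(f"{name}: {volume:.1f} uL")
```

The five components add up to the stated 50 µL total. In practice, template DNA is usually added to each tube separately, so a shared premix would exclude that component.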

Sanger Sequencing Analysis
For Sanger sequencing analysis, the amplified products were purified, sequence amplified using the BigDye terminator, and then purified again before bidirectional sequencing on the ABI 3730xl Genetic Analyzer. Analysis was performed using the Sequence Analysis Software version 5.4 and SeqScape version 3.

Statistical Analysis
Continuous variables are presented as mean ± standard deviation, whereas categorical variables are presented as percentages and frequencies. The SPSS software (version 23.0) was used for clinical analysis. Pearson's chi-squared test or Fisher's exact test was used to compare data between the groups. Pearson's correlation coefficient was used to assess the relationships between continuous variables. The chi-squared test with one degree of freedom was used to test for deviation from Hardy-Weinberg equilibrium (HWE). The OpenEpi software (version 3.01) was used to calculate genotype and allele frequencies. In addition, Yates' correction was applied in this study.
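The HWE test mentioned above can be illustrated with a short, standard-library-only sketch. It is not the SPSS or OpenEpi procedure used in the study: the function names are hypothetical, no continuity correction is applied, and the genotype counts in the usage example are invented for illustration.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi_square(n_ref_hom: int, n_het: int, n_alt_hom: int):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). Expected counts are n*p^2, 2*n*p*q, and n*q^2,
    where p and q are the allele frequencies estimated from the sample.
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p                            # alternate allele frequency
    observed = (n_ref_hom, n_het, n_alt_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Classes with zero expected count (monomorphic site) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, chi2_sf_1df(chi2)


if __name__ == "__main__":
    # Hypothetical genotype counts, not taken from the study
    chi2, p_value = hwe_chi_square(60, 30, 10)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A small p-value indicates that the observed genotype counts deviate from the proportions expected under HWE. For a site where every subject carries the same homozygous genotype, the statistic is zero and the test is uninformative.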

Genotyping of the Five SNPs in Patients with and without NAFLD
In the NAFLD group, the genotype frequencies of CC, CG, and GG in H63D were 76.8%, 21.1%, and 2.1%, respectively, whereas in non-NAFLD subjects, the frequencies were 69.2%, 26.9%, and 3.9%, respectively. None of the genetic models showed a significant association (CC vs. GG + GC, OR: 0.671, 95% CI: 0.344-1.344, p = 0.260; CC + GG vs. CG, OR: 0.723, 95% CI: 0.358-1.461, p = 0.366; CC + GC vs. GG, OR: 0.537, 95% CI: 0.087-3.301, p = 0.496). The frequency of the G allele in the NAFLD group was 12.7%, which was lower than that in the non-NAFLD group (17.4%), but the difference was not significant (OR: 0.690, 95% CI: 0.380-1.252, p = 0.22). Thus, no statistical association between the H63D polymorphism in HFE and NAFLD was observed. For the C282Y polymorphism, the GG genotype had a frequency of 100% in both cases and controls; no heterozygous or homozygous variant genotypes were found in either group (GA vs. GG or AA vs. GG, OR: 3.697, 95% CI: 0.148-92.01, p = 0.393). The allele frequency was likewise not associated (A vs. G, OR: 0.781, 95% CI: 0.015-39.79, p = 0.901). Yates' correction was applied for both the E167K (rs58542926) and C282Y (rs1800562) polymorphisms. Figure 1 shows the chromatograms of the SNPs examined in this study.

[...] association, while the dominant (GG vs. GA + AA, OR: 1.057, 95% CI: 0.516-2.162, p = 0.870) and co-dominant models (GG + AA vs. GA, OR: 0.562, 95% CI: 0.306-1.029, p = 0.061) were not associated. The percentages of A and G alleles in the NAFLD group were 56.9% and 43.1%, respectively, whereas in the non-NAFLD group, the percentages were 48.8% and 51.2%, respectively. No significant association between the allele frequencies of the K434E polymorphism and NAFLD was observed (G vs. A, OR: 1.386, 95% CI: 0.906-2.121, p = 0.131).
Genotype and allele frequencies of the E167K polymorphism in TM6SF2 in the NAFLD and non-NAFLD groups did not show any significant association under any mode of inheritance. The GG genotype frequencies were almost identical in both groups (93.7% vs. 93.6%), while the GA genotype frequencies differed slightly (6.3% in NAFLD and 5.1% in non-NAFLD). The AA genotype was absent in the NAFLD group, while its frequency was 1.3% in the non-NAFLD group. The dominant (OR: 0.984, 95% CI: 0.288-3.355, p = 0.979) and co-dominant models (OR: 1.202, 95% CI: 0.347-4.157, p = 0.770) showed similar results; Yates' correction was applied in the co-dominant and recessive models. The recessive model (OR: 0.270, 95% CI: 0.010-6.733, p = 0.393) did not show a statistical association between NAFLD cases and controls. The A and G allele frequencies were 3.2% and 96.8% in the NAFLD group, and 3.9% and 96.1% in the non-NAFLD group, respectively. Finally, the allele frequency failed to show a significant association (A vs. G, OR: 0.810, 95% CI: 0.257-2.579, p = 0.72).
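The genetic models reported above reduce to 2×2 tables built from genotype counts. The sketch below shows one way to derive the odds ratio and a Wald 95% confidence interval for each model. Several things here are assumptions rather than the study's procedure: the function names are hypothetical, the E167K genotype counts are back-calculated from the reported percentages (95 cases, 78 controls), and 0.5 is added to every cell only when a table contains a zero, which may differ from how the corrections were applied in the original analysis.

```python
import math


def odds_ratio_ci(a: float, b: float, c: float, d: float, z: float = 1.96):
    """Odds ratio with Wald confidence interval for a 2x2 table.

    a, b: cases with / without the genotype (or allele) of interest
    c, d: controls with / without it
    Assumption: 0.5 is added to every cell when any cell is zero.
    """
    cells = [float(a), float(b), float(c), float(d)]
    if any(x == 0 for x in cells):
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    log_or = math.log(estimate)
    return estimate, math.exp(log_or - z * se), math.exp(log_or + z * se)


def model_tables(cases, controls):
    """Build 2x2 tables from (ref_hom, het, alt_hom) genotype counts."""
    r, h, v = cases
    cr, ch, cv = controls
    return {
        "dominant": (h + v, r, ch + cv, cr),          # het + variant vs. reference
        "co-dominant": (h, r + v, ch, cr + cv),       # het vs. both homozygotes
        "recessive": (v, r + h, cv, cr + ch),         # variant vs. the rest
        "allele": (h + 2 * v, 2 * r + h, ch + 2 * cv, 2 * cr + ch),
    }


if __name__ == "__main__":
    # Assumed E167K counts (GG, GA, AA), back-calculated from percentages
    e167k_cases = (89, 6, 0)
    e167k_controls = (73, 4, 1)
    for model, table in model_tables(e167k_cases, e167k_controls).items():
        estimate, low, high = odds_ratio_ci(*table)
        print(f"{model}: OR = {estimate:.3f} (95% CI: {low:.3f}-{high:.3f})")
```

With these assumed counts, the dominant-model estimate is about 0.98, in line with the value reported above. Estimates for the other models depend on how the continuity correction is applied and need not reproduce the published figures exactly.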

Clinical Characteristics of Patients with NASH and without NAFLD
Using the Kleiner score, 26.3% (n = 25) of patients with NAFLD were classified as having NASH, while 73.7% (n = 70) were classified as having NAFLD without NASH. Patients with NASH were significantly older than non-NAFLD subjects (47.42 ± 10.95 vs. 34.9 ± 11.05; p = 0.0003), whereas the sex distribution did not differ between the groups (p = 0.24). Height was similar in both groups and did not differ significantly (p = 0.06). The weight and BMI of patients with NASH and those without NAFLD differed significantly (p < 0.05). Table 4 lists the clinical characteristics of both groups.

Genotyping in Patients with NASH and without NAFLD
None of the genotyping analyses of the I148M, K434E, E167K, H63D, and C282Y polymorphisms showed a positive association between the NASH and non-NAFLD groups. Table 5 shows the genotyping, allele frequencies, and genetic modes of inheritance, such as the dominant, co-dominant, and recessive models.

Discussion
In this hospital-based case-control study, we have shown that the K434E polymorphism is associated with NAFLD when patients with NAFLD are compared with non-NAFLD subjects in the Saudi Arabian population. None of the variants, including K434E, showed a positive association when patients with NASH were compared with non-NAFLD subjects. Additionally, statistical analysis showed a positive association only between obesity and NAFLD (p = 0.001).
NAFLD is defined as having fat accumulation in the liver or observing hepatic steatosis via imaging or liver histology when other sources of fat build-up in the liver have been ruled out. Histological examination is essential for the diagnosis of NAFLD [29]. NAFLD and NASH can be confirmed with a liver biopsy. NASH is an advanced stage of NAFLD, a common comorbidity of obesity and T2DM [30]. The prevalence of obesity and T2DM in the Saudi Arabian population is high [31,32], and obesity, T2DM, and NAFLD-NASH are clinically and pathophysiologically connected. Local studies in the Saudi Arabian population have documented various prevalence frequencies of NAFLD. Females are more affected by chronic liver disease than males, which may be due to the expression of sex hormones, which is projected to diminish after menopause [27].
Unfortunately, no medications are approved for NAFLD treatment; nevertheless, lifestyle changes, including diet and exercise, are effective in managing it [33]. Genes affecting hepatic fat storage and mobilization and the development of NAFLD, such as variants of transcription factors that control lipid metabolism in the liver and adipose tissue, are thus viable candidates for treatment [34]. The major emphasis of investigations has been to identify associations between advanced disease stages and selected SNPs in genes encoding different proteins implicated in disease pathology. Candidate gene association studies are commonly used to examine disease-causing genes in human diseases: the frequency of one or more known SNPs in candidate genes is compared between patients and controls in the quest for a statistical association with NAFLD [35].
Although PNPLA3 and TM6SF2 appear to be the most prominent determinants of hepatic steatosis across the population, additional genetic defects, which are relatively infrequent or less significant, have also been shown to produce fatty liver. Mutations in genes that control hepatic treatment and VLDL secretion are involved in familial causes of NAFLD [36]. Romeo et al. [23] reported that NAFLD is associated with the rs738409 polymorphism in PNPLA3. The link between PNPLA3 and liver histology was validated in patients with NAFLD using GWAS. The variant encodes an isoleucine-to-methionine substitution at protein position 148 (I148M). The I148M polymorphism has been linked to increased hepatic fat accumulation in Europeans regardless of body weight. In a cohort study on the Finnish population, I148M increased the risk of hepatic steatosis [37]. PNPLA3 harbors both triacylglycerol lipase and acylglycerol O-acyltransferase activity, as well as retinyl ester activity in lipid-stellate cells [38]. The interaction between rs2294918 and PNPLA3 mRNA was upregulated, and the protein may be associated with direct effects of PNPLA3 mRNA regulation or PNPLA3 locus methylation on mRNA stability or linkages with other noncoding variants. In 434E allele carriers, PNPLA3 was upregulated [39]. In 2014, Kozlitina et al. [24] validated the relationship between hepatic steatosis and PNPLA3 SNPs and identified polymorphisms in TM6SF2 associated with hepatic triglyceride content. PNPLA3 polymorphisms have been linked since 2008 with the risk and severity of NAFLD. Variants of TM6SF2 were also involved in these results [40].
The C-T rs58542926 variant in the TM6SF2 locus codes for an E to K substitution at position 167, which results in loss of function and is associated with lower hepatic TM6SF2 mRNA and protein expression. Among tissues, TM6SF2 is mostly expressed in the liver and small intestine [41,42]. Giovanni et al. showed that TM6SF2 rs58542926 can impact nutrient oxidation, glucose homeostasis, and postprandial lipoprotein of adipokines in patients with NAFLD [43]. Although TM6SF2 does not have a clearly defined function, it affects cholesterol synthesis and lipoprotein secretion [38].
In 1996, Feder et al. initially identified HFE on the short arm of chromosome 6 at 6p21.3, encoding a 343-amino-acid glycoprotein [44]. Excess iron absorption in the liver hastens the progression of NAFLD to NASH owing to oxidative stress. Iron and heme catalyze oxidation processes caused by reactive oxygen species emitted during Fenton reactions, contributing to oxidative stress [45]. HFE has many genetic variants, including two missense mutations: an amino acid replacement from cysteine to tyrosine (C282Y) and a histidine to aspartate substitution (H63D) [46].
Previous studies have reported an association of the I148M and K434E polymorphisms in PNPLA3 with NAFLD in the global population [47–53]. In our study, however, the I148M polymorphism was not associated with NAFLD, whereas the K434E variant in PNPLA3 was. Our study is in agreement with previous studies [39,54]. Additionally, only a limited number of meta-analyses of the I148M and K434E polymorphisms in PNPLA3 have been performed [55–57]. Furthermore, in our study, the E167K polymorphism was not associated with NAFLD or NASH, whereas previous studies have reported both positive and negative associations of this variant with NAFLD and NASH [24,47,58–61]. A meta-analysis study on the E167K variant in preventing CAD and conferring risk for NAFLD revealed that the rs58542926 polymorphism is a key regulator of blood lipid characteristics in global studies [62]. Meta-analysis studies have also examined the E167K (rs58542926) polymorphism in TM6SF2 in NAFLD and other human diseases, such as carcinoma and liver fibrosis [42,62–64]. For the H63D polymorphism, 21% of heterozygotes and 2.1% of homozygous variants were present in NAFLD cases in the present study. None of the subjects was heterozygous or homozygous for the variant allele of the C282Y polymorphism, and no statistical association between the H63D and C282Y polymorphisms and NAFLD was observed. Previous studies showed associations of both polymorphisms in NAFLD subjects [65–67]. The disagreement between our study and the studies documenting a positive association may be due to the small sample size or to ethnicity playing a major role. The major limitation of our study is the small sample size: we recruited only 95 patients with NAFLD and 78 patients without NAFLD. Nevertheless, recruiting native Saudi Arabian patients and direct sequencing were the strengths of our study.

Conclusions
In conclusion, we confirmed that K434E polymorphism showed a positive association in the Saudi Arabian population. Further study on the multiple genetic variants associated with NAFLD using a larger sample size is recommended.

Institutional Review Board Statement:
The study protocol was approved by the Institutional Review Board of the College of Medicine at King Saud University (E-17-2654). Additionally, signed informed consent was obtained from 173 Saudi Arabian participants involved in this study. All methods were performed in accordance with the relevant guidelines and regulations (Declaration of Helsinki).

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest:
All authors declare that there is no conflict of interest.