Whole-Exome Sequencing (WES) Reveals Novel Sex-Specific Gene Variants in Non-Alcoholic Steatohepatitis (MASH)

Non-alcoholic steatohepatitis (NASH, also known as MASH) is a severe form of non-alcoholic fatty liver disease (NAFLD, also known as MASLD). Emerging data indicate that the progression of the disease to MASH is higher in postmenopausal women and that genetic susceptibility increases the risk of MASH-related cirrhosis. This study aimed to investigate the association between genetic polymorphisms in MASH and sexual dimorphism. We applied whole-exome sequencing (WES) to identify gene variants in 8 age-adjusted matched pairs of livers from both male and female patients. Sequencing alignment, variant calling, and annotation were performed using standard methods. Polymerase chain reaction (PCR) coupled with Sanger sequencing and immunoblot analysis were used to validate specific gene variants. cBioPortal and Gene Set Enrichment Analysis (GSEA) were used for actionable target analysis. We identified 148,881 gene variants, representing 57,121 and 50,150 variants in the female and male cohorts, respectively, of which 251 were highly significant and MASH sex-specific (p < 0.0286). Polymorphisms in CAPN14, SLC37A3, BAZ1A, SRP54, MYH11, ABCC1, and RNFT1 were highly expressed in male liver samples. In female samples, polymorphisms in RGSL1, SLC17A2, HFE, NLRC5, ACTN4, SBF1, and ALPK2 were identified. A heterozygous variant 1151G>T located on 18q21.32 for ALPK2 (rs3809983) was validated by Sanger sequencing and expressed only in female samples. Immunoblot analysis confirmed that the protein level of β-catenin in female samples was 2-fold higher than normal, whereas ALPK2 expression was 0.5-fold lower than normal. No changes in the protein levels of either ALPK2 or β-catenin were observed in male samples. Our study suggests that the perturbation of canonical Wnt/β-catenin signaling observed in postmenopausal women with MASH could be the result of polymorphisms in ALPK2.


Genes 2024, 15, 357

Introduction
Non-alcoholic fatty liver disease (NAFLD) (also known as metabolic dysfunction-associated fatty liver disease, MAFLD) is the leading cause of chronic liver disease, affecting 25% of the US population [1,2]. It is commonly associated with obesity, diabetes, and metabolic syndrome but can also affect non-obese individuals. The disease spectrum ranges from bland steatosis with or without inflammation (non-alcoholic fatty liver, NAFL) to steatosis with inflammation and hepatocellular injury (non-alcoholic steatohepatitis, NASH) (also known as metabolic dysfunction-associated steatohepatitis, MASH), fibrosis, cirrhosis, and hepatocellular carcinoma [3]. Owing to the lack of reliable noninvasive predictive biomarkers, the diagnosis of MASH is mainly limited to the histopathological evaluation of liver samples defined by liver-biopsy-proven hepatocellular steatosis, lobular inflammation, and evidence of hepatocyte injury such as ballooning degeneration [4]. A large body of evidence strongly supports the idea that MASLD susceptibility and progression to MASH are sex specific. Several studies conducted in single centers or in specific populations have suggested that women have a 19% lower risk of MASLD than men in the general population. However, once MASLD has become established, women have a 37% higher risk of advanced fibrosis than men [5]. Among individuals with established MASLD who are older than 50 years, women have a 17% greater risk for MASH and a 56% greater risk for advanced fibrosis than men [5,6]. Although it has been established that the prevalence of risk factors such as age, obesity, type 2 diabetes mellitus (T2DM), atherogenic dyslipidemia, and clinical outcomes of MASLD differs between sexes, the molecular mechanisms by which sex modulates the pathogenesis and clinical outcomes of MASLD progression are poorly defined. Therefore, to understand the potential mechanisms underlying this sexual dimorphism in MASLD prevalence, we recently used a multiomics approach with archived liver samples from both sexes to study the biological basis of the observed sexual dimorphism. Our study suggests (for the first time) that the activation of canonical Wnt signaling could be one of the main pathways associated with sexual dimorphism in MASLD and MASH [7].
Two different Wnt signaling pathways, canonical and non-canonical, have their own influence on MASLD and MASH. The non-canonical pathway is involved in the accumulation of fat, inflammation, and lipids, which promote MASH formation. The canonical pathway involving β-catenin functions as an anti-inflammatory, anti-lipid accretion, and adipocyte differentiation pathway [8]. Hence, the inhibition or downregulation of the classical Wnt/β-catenin pathway contributes to the onset and progression of MASLD. For example, MASLD is inhibited by the upregulation of peroxisome proliferator-activated receptor γ (PPAR-γ), a downstream target of the Wnt/β-catenin signaling that promotes preadipocyte differentiation, adipogenesis, the absorption of free fatty acids (FFA), and the suppression of inflammation [9]. Polymorphisms in low-density lipoprotein receptor-related protein-6 (LRP6) are a major cause of MASLD [10]. Although it is well documented that MASLD progression is attributed to dynamic interactions between genetic and environmental factors [11], there is still limited information on how canonical Wnt/β-catenin signaling is involved in MASLD/MASH disease progression. Therefore, we hypothesized that gene variants in the Wnt/β-catenin signaling pathway could be associated with the observed sexual dimorphism in MASH, as suggested by our recent study [7].
To test this hypothesis, we used whole-exome sequencing (WES) to identify potential gene variants implicated in MASH using 16 archived frozen liver samples from paired males and females. Here, we report the identification of α protein kinase 2 (ALPK2) gene variants (rs3809981 and rs3809983) as female-specific single-nucleotide polymorphisms (SNPs) in the postmenopausal livers of women with MASH.

Ethics Statement
The Institutional Review Board (IRB) of Washington State University (WSU) approved the protocol of the current study. Sixteen paired matched snap-frozen tissue samples were obtained from the IRB-approved University of Minnesota Liver Tissue Cell Distribution System (LTCDS). All specimens with anonymized identifiers were histopathologically confirmed by a pathologist (Table S1; Supplemental Digital Content).

DNA Extraction and Whole-Exome Sequencing (WES-Seq)
Genomic DNA was extracted from 16 frozen liver tissue samples (4 matched pairs of both sexes) using a Wizard Genomic DNA purification kit (A1120, Promega, Madison, WI, USA) following the manufacturer's instructions. The DNA concentration was measured using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The extracted DNA (50 ng/µL/sample) was shipped to LC Sciences (Houston, TX, USA) for exome sequencing (100× coverage). Genomic DNA (200 ng) from each subject's MASH-normal paired samples was fragmented by sonication and subjected to library preparation using the Agilent SureSelect Human All Exon V6 kit (Agilent Technologies, Santa Clara, CA, USA) following the vendor's recommended protocol. DNA libraries were hybridized and captured using SureSelect. Following hybridization, the captured libraries were purified according to the manufacturer's instructions and amplified by polymerase chain reaction (PCR). Normalized libraries were pooled, and DNA was subjected to paired-end sequencing using the Illumina HiSeq X Ten platform with a 150-bp paired-end sequencing mode.

WES Data Processing
Raw sequence reads were trimmed to remove low-quality sequences and then aligned to the human reference genome (hg19) using the Burrows-Wheeler alignment tool [12]. Single-nucleotide polymorphisms and small insertions/deletions were identified in individual samples using the Genome Analysis Toolkit (GATK Mutect2 4.0.4.0) with the default settings [13]. ANNOVAR was then used to annotate the VCF files using the gene region and several filters from other databases [14]. Finally, we used the Database for Annotation, Visualization, and Integrated Discovery (DAVID) Bioinformatics Resource 6.7 (https://David-d.ncifcrf.gov, accessed on 22 July 2023) and Gene Set Enrichment Analysis (GSEA) [15] to identify significantly altered biological processes and pathways in 16 liver tissue samples.
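As a minimal illustration of how single-nucleotide variants and small insertions/deletions can be told apart in VCF output, the sketch below classifies each record by comparing the lengths of its REF and ALT alleles. It is not the GATK or ANNOVAR logic used in this study; the function names and the example records are hypothetical, and symbolic or spanning-deletion alleles are not handled.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair as SNV, insertion, deletion, or other."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"  # e.g. a multi-nucleotide substitution


def count_variant_types(vcf_lines):
    """Tally variant classes from an iterable of VCF text lines."""
    counts = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            kind = classify_variant(ref, alt)
            counts[kind] = counts.get(kind, 0) + 1
    return counts


# Hypothetical records for illustration only (not data from this study).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tG\tT\t.\tPASS\t.",
    "chr1\t200\t.\tA\tAT\t.\tPASS\t.",
    "chr2\t300\t.\tCTG\tC\t.\tPASS\t.",
]
print(count_variant_types(example))
```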

PCR and Sanger Sequencing
To validate the ALPK2 polymorphisms, we used PCR and Sanger sequencing from Azenta Life Sciences (Burlington, MA, USA). Specific PCR primers for ALPK2, F: TGCTGTCTATCAAATCTCGGCT and R: GAGCACTCAACCTCAACGGA, were used. Primers were designed using Primer3 (http://bioinfo.ut.ee/primer3-0.4.0/, accessed on 22 July 2023). The products were directly sequenced using the ABI PRISM BigDye Kit on an ABI 3130 DNA sequencer (Applied Biosystems, Foster City, CA, USA). Sequencing results were analyzed using A Plasmid Editor [16].

Western Blot Analysis
Frozen liver tissue samples (n = 12) were homogenized in ice-cold lysis buffer containing a protease/phosphatase inhibitor cocktail and centrifuged at 12,000× g at 4 °C for 15 min. Protein samples were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and transferred onto polyvinylidene difluoride (PVDF) membranes. After blocking in 5% non-fat milk at 37 °C for 1 h, the membranes were incubated overnight at 4 °C with primary antibodies against ALPK2 (ab111909, Abcam, Cambridge, UK), β-catenin (8480S, Cell Signaling Technology, Danvers, MA, USA), or GAPDH (sc-47724, Santa Cruz Biotechnology, Dallas, TX, USA). Following incubation with the secondary antibody, immunoreactive proteins were visualized using the ChemiDoc Touch Imaging System (Bio-Rad). Protein bands were quantified using ImageJ 1.53k.

Statistical Analysis of Western Blot
The data were expressed as the mean ± SEM (n = 3/phenotype/sex), and Student's t-test was used to analyze statistical significance. Statistical analyses were performed and graphs were generated using GraphPad Prism 6 (GraphPad Software Inc., San Diego, CA, USA). p < 0.01 (**) was considered statistically significant.
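The quantification described above (band intensity normalized to GAPDH, the normal group set to 1, mean ± SEM, Student's t-test) can be sketched as follows. This is an illustrative reimplementation using only the Python standard library, not the GraphPad Prism workflow used in the study; the function names are hypothetical and the band intensities are made-up numbers, not measurements from this work. The sketch returns the t statistic and degrees of freedom only; converting them to a p-value requires a t-distribution table or a statistics package.

```python
import math
import statistics


def fold_change(target, loading, ref_target, ref_loading):
    """Normalize target band intensities to the loading control (GAPDH) and
    express them relative to the mean normalized value of the normal group."""
    ref_ratios = [t / g for t, g in zip(ref_target, ref_loading)]
    ref_mean = statistics.mean(ref_ratios)
    return [(t / g) / ref_mean for t, g in zip(target, loading)]


def mean_sem(values):
    """Return (mean, standard error of the mean)."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up band intensities (arbitrary units), n = 3 per group.
normal_target, normal_gapdh = [100, 110, 90], [100, 100, 100]
mash_target, mash_gapdh = [200, 210, 190], [100, 100, 100]

normal_fc = fold_change(normal_target, normal_gapdh, normal_target, normal_gapdh)
mash_fc = fold_change(mash_target, mash_gapdh, normal_target, normal_gapdh)

print("normal: mean, SEM =", mean_sem(normal_fc))
print("MASH:   mean, SEM =", mean_sem(mash_fc))
print("t, df =", student_t(mash_fc, normal_fc))
```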

Clinical Characteristics of the Study Population
Sixteen snap-frozen liver tissue samples (normal and MASH) from white non-Hispanic populations of both sexes were used in this study. The median age of the patients ranged from 54 to 59 years. In general, the clinicopathological characteristics of patients with MASH (steatosis, steatohepatitis, ballooning, and portal inflammation) were higher in women than in men. Detailed clinicopathological information is summarized in Supplemental Table S1.

WES, Data Filtering and Mutation Landscape of Liver Tissue Samples
As shown in Figure 1, using the WES approach we identified 148,881 gene variants in 16 liver tissue samples, representing 57,121 and 50,150 gene variants in female and male cohorts, respectively. For SNVs, 35,000 (27%) were exonic and 79,259 (59%) were intronic (Table 1). For InDels, 13,925 were identified and 10,837 (78%) were intronic, as shown in Table 1. Our analysis detected no differences in SNPs, InDel distribution, or mutation type between sexes (Supplemental Figures S1 and S2). By contrast, FACETS analysis [17] revealed that copy number variants (CNVs) in female cohorts differed from those in male cohorts. As shown in Figure 2A, many gene variants (female cases), such as SLC17A2 (Table 2), were clustered around chromosome 6 (as represented by allele-specific log-odds-ratio data), whereas in male cases (Figure 2B), many gene variants such as CAPN14 (Table 3) were clustered around chromosome 11. Collectively, these observations suggest that copy-number alterations (CNAs) of these genes are different in the two cohorts and could play an important role in the sexual dimorphism of MASH.

WES Identifies ALPK2 Variant in Female Cases
To further analyze our gene variant data, statistical significance was first determined via a hypothetical Fisher's exact test (Figure 1); with four male samples vs. four females, a polymorphism was considered significant if it existed in all female samples but none of the male samples (or if a polymorphism existed in all male samples but none of the female samples). The corresponding p-value for this assumption was 0.0286. We merged and filtered the VCF files of individual samples and searched for polymorphisms that met the above criteria. Polymorphisms that passed the criteria were then annotated with the Var2GO tool [18], using GRCh37 as a reference, given that the original analysis was performed using the hg19 genome. A total of 251 highly significant sex-specific MASH gene variants (p < 0.0286) were identified. A total of 63 MASH female-specific gene variants were identified, as shown in Table 2, whereas 54 gene variants were identified in males (Table 3). Among the 54 male variants, we found polymorphisms in CAPN14 (12 intronic variants), SRP54 (four intronic and upstream gene variants), ABCC1 (three synonymous and intronic gene variants), RNFT1 (two upstream gene variants), SLC37A3 (two intronic and upstream gene variants), obg-like ATPase 1 (OLA1) (two intronic and non-synonymous variants), BAZ1A (two intronic and downstream gene variants), and MYH11 (two intronic and synonymous variants).
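The threshold of 0.0286 follows directly from the 4 vs. 4 design: of the C(8,4) = 70 possible 2×2 tables with these margins, only two are as extreme as "all of one sex, none of the other", giving a two-sided p of 2/70. The sketch below reproduces this value and shows one way the all-or-none filter could be expressed. It is an illustration under stated assumptions, not the authors' pipeline; the function names, the `carriers` mapping, and the sample and variant identifiers are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):  # probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def sex_specific(carriers, females, males):
    """Return (female_only, male_only) variant lists.

    `carriers` maps a variant key to the set of sample IDs carrying it.
    A variant is female-specific if every female and no male carries it,
    and male-specific in the mirrored case.
    """
    females, males = set(females), set(males)
    female_only = [v for v, s in carriers.items() if females <= s and not (s & males)]
    male_only = [v for v, s in carriers.items() if males <= s and not (s & females)]
    return female_only, male_only


# Rows = carrier / non-carrier, columns = female / male:
# a variant present in 4 of 4 females and 0 of 4 males.
p = fisher_exact_two_sided(4, 0, 0, 4)
print(round(p, 4))  # 0.0286

# Hypothetical sample IDs and variant keys, for illustration only.
females, males = ["F1", "F2", "F3", "F4"], ["M1", "M2", "M3", "M4"]
carriers = {
    "varA": {"F1", "F2", "F3", "F4"},
    "varB": {"M1", "M2", "M3", "M4"},
    "varC": {"F1", "F2", "M1"},
}
print(sex_specific(carriers, females, males))
```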

Validation of the ALPK2 Variant
To identify the biological pathways associated with ALPK2, we performed gene set enrichment analysis (GSEA) [15] using a TCGA liver cancer patient cohort from the cBioPortal database. As shown in Table 4, Wnt gene signatures, including canonical/β-catenin-mediated pathways, were negatively enriched in ALPK2-high (FDR q-val = 0.036 to 0.003) vs. ALPK2-low samples (FDR q-val = 0.105 to 0.544), which is consistent with a previous report showing ALPK2 as a negative regulator of canonical Wnt signaling [19]. These data also confirmed that ALPK2 is associated with β-catenin-mediated pathways in women with MASH, as we previously reported [7]. Next, we validated the ALPK2 mutation by PCR testing coupled with Sanger sequencing. As shown in Figure 3, the normal, healthy sample HH1202 was used as a reference for comparison with the two female MASH samples (UMN1535 and UMN1259). A clear single nucleotide polymorphism (SNP) is highlighted with a black box in the MASH samples in Figure 3A,B. The identified SNP (p.Ala1551Ser) resulted in nsSNV (rs3809983), as shown in Table 2.
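An alanine-to-serine change is what a G>T substitution produces when it falls on the first base of an alanine codon, since all four alanine codons (GCN) become serine codons (TCN) under the standard genetic code. The short sketch below checks this. The actual codon at the rs3809983 site is not given in the text, so the codons listed are the generic alanine codons and the helper name is hypothetical.

```python
# Standard genetic code: alanine is GCN; serine is TCN plus AGT/AGC.
ALA_CODONS = ["GCT", "GCC", "GCA", "GCG"]
SER_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def substitute_first_base(codon: str, new_base: str) -> str:
    """Return the codon with its first base replaced by new_base."""
    return new_base + codon[1:]


for codon in ALA_CODONS:
    mutant = substitute_first_base(codon, "T")
    label = "Ser" if mutant in SER_CODONS else "other"
    print(f"{codon} (Ala) -> {mutant} ({label})")
```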
Since ALPK2 was shown to be involved in the canonical Wnt/β-catenin signaling pathway (Table 4), we measured the protein expression of both ALPK2 and β-catenin in both male and female liver tissue samples using immunoblot analysis. As shown in Figure 4A,B, the protein expression of β-catenin in female samples was 2-fold higher than that in normal samples, whereas ALPK2 expression was 0.5-fold lower than that in normal samples. No change in the expression of either ALPK2 or β-catenin was observed in male samples (Figure 4C,D).
Although sex differences exist in the prevalence, risk factors, fibrosis, and clinical outcomes of MASLD/MASH, our understanding of the genetic basis of sexual dimorphism remains limited. Therefore, in this study, we performed WES analyses of paired-matched liver tissue samples from male and female MASH patients (Table S1) to elucidate sex-specific gene variants associated with this disease. As shown in Figure 1, we identified 63 female-specific and 54 male-specific gene variants (Fisher's exact test p < 0.0286). Interestingly, a significant number of these gene variants have been identified with respect to the sexual dimorphism of MASLD/MASH, whereas others have been previously reported to be involved in the pathogenesis of the disease. For example, among the male-specific variants (Table 3), we identified CAPN14, which encodes a calcium-regulated non-lysosomal thiol-protease (calpain), as a top gene variant; calpains are known to be involved in a variety of cellular processes including apoptosis, cell division, the modulation of integrin-cytoskeletal interactions, and synaptic plasticity [24]. Recently, calpains have been shown to be associated with hepatocyte death in MASH and the progression of hepatocellular carcinoma (HCC) [25,26]. On chr2 (2p23.1), we identified OLA1, which encodes a member of the GTPase protein family. It interacts with breast-cancer-associated gene 1 (BRCA1) and BRCA1-associated RING domain protein (BARD1) and is involved in centrosome regulation [27]. OLA1 has been shown to be associated with hereditary breast and ovarian cancers as well as with a poor prognosis of HCC [28,29]. Polymorphisms were also found in other canonical cancer-related genes, including SLC37A3, BAZ1A, SRP54, MYH11, and ABCC1 [30–33], but these genes have not been directly implicated in MASLD pathogenesis. As shown in Table 3, we also identified a SNP (synonymous variant) in ORAI1 (ORAI calcium release-activated calcium modulator 1), which encodes a membrane calcium channel subunit activated by the calcium sensor STIM1 when calcium stores are depleted [34]. ORAI polymorphisms have been shown to be associated with non-canonical Wnt signaling, MASLD progression, and HCC [35,36].
For female-specific gene variants (Table 2), we identified six loci of SLC17A2 on chr6 (6p22.2), encoding proteins belonging to sodium-dependent phosphate transporters. A recent study reported that SLC17A2 variants were associated with MASLD in lean individuals [37]. In the present study, SLC17A2 was specifically identified in female MASH patients. At the same chr6 locus (6p22.2), we also identified HFE, which encodes a transmembrane protein that regulates iron absorption by regulating the interaction of the transferrin receptor with transferrin; like SLC17A2, HFE has been associated with MASLD in lean individuals [37]. On chr16 (16q13), we identified two loci of NLRC5, which encodes a member of the caspase recruitment domain-containing NLR family. This gene plays a major role in the regulation of the NF-κB and interferon signaling pathways [38]. Polymorphisms in NLRC5 are associated with obesity, type 2 diabetes mellitus (T2DM), and MASLD [39] and limit the NF-κB signaling pathway [40].
In the present study, we identified rs3809983 ALPK2 as a novel gene variant associated with MASH in female liver samples. ALPK2, which maps to 18q21.32, encodes a serine/threonine kinase protein that is involved in several processes, including epicardium morphogenesis and heart development, and is a negative regulator of Wnt signaling [19]. Recent studies by McIntosh et al. [41] showed that ALPK2 rs3809973 (not ALPK2 rs3809983, identified in this study) is associated with an increased risk of liver fibrosis in HIV/HCV co-infected women. This may be the initial indication linking the ALPK2 variant to the pathological liver phenotype in women. Furthermore, Lawrence et al. [42] found that ALPK2 is a novel polymorphic gene in human cancers in a large-scale genomic analysis of 4742 human neoplasms and their matched normal tissue samples. In mouse xenograft models, the knockdown of ALPK2 inhibits the development and progression of ovarian cancer [43] and renal cancer cells [44], thus supporting its relevance not only in cancer initiation and development but also in the pathogenesis of liver disease.
To validate the ALPK2 polymorphism, we used PCR coupled with Sanger sequencing and found that ALPK2 rs3809983 was associated with MASH in the female patient samples (Figure 3). This association was further confirmed by immunoblot analysis (Figure 4), suggesting that the ALPK2 polymorphism was linked to defective canonical Wnt signal transduction only in female samples. ALPK2 polymorphisms cause inappropriate levels of β-catenin and thus a perturbation of the Wnt signaling pathway in female patients with MASH, as we previously reported [7]. These observations thus agree with the cBioPortal analysis (Table 4), suggesting a good correlation between ALPK2 loss/decreased function and the loss of its negative regulatory activity in the canonical Wnt/β-catenin signaling pathway.
Despite the important findings of this study, it has some limitations. These limitations are primarily associated with the availability of paired matched MASLD/MASH liver samples from the male and female cohorts. Although the present study was limited by the relatively small number of available samples, the data presented here showed a clear and robust distinction between female and male patients with respect to gene variants associated with MASH livers compared with normal livers. Future efforts should increase the sample size while improving the selection of extreme phenotypes to maximize the power of this strategy. Demographic variables such as ethnic background should be considered in future studies. Owing to sample availability, the individuals included in our study were mainly of Caucasian origin, which may limit the applicability of our findings to other ethnic populations. These limitations highlight the critical need to improve research in this area, especially in clinically relevant conditions associated with MASLD and MASH such as intrahepatic cholangiocarcinoma and celiac disease [45,46]. Further studies are also needed to elucidate the cellular and molecular basis of how ALPK2 variants may impact the sexual dimorphism of MASLD/MASH disease progression.
In summary, this study provides evidence that MASLD-related sexual dimorphism is influenced by genetic variants. We used WES of the liver tissue samples to identify sex-specific gene polymorphisms associated with MASH. Our study further provides evidence that polymorphisms in ALPK2 are associated with MASH in postmenopausal women but not in men and that the activation of the canonical Wnt signaling pathway previously reported [7] could be the result of ALPK2 polymorphisms. Other (downstream) members of the Wnt signaling pathway could also be associated with MASH severity in postmenopausal women compared to men.

Figure 1. Illustration of WES workflow from frozen liver tissue samples of male and female patients to MASH sex-specific gene variants. Pipeline of bioinformatics analysis adapted in the WES results of gene variants.

Figure 2. A representative integrated visualization of FACETS analysis of WES data for (A) female and (B) male total copy number variants (CNVs). The top panel displays total copy number log-ratio (logR), and the second panel displays allele-specific log-odds-ratio data (logOR) with chromosomes alternating in blue and gray. The third panel plots the corresponding integer (total, minor) copy number calls. The overall ploidy and purity for female patients in this case are 2.03 and 0.65, respectively, and 2.05 and 0.63 for male patients. The estimated cellular fraction (cf) profile is plotted at the bottom, revealing the aggregate of variants at each chromosome.

Figure 3. Representative Sanger sequence alignment (A) and chromatograms (B) of ALPK2 in normal and MASH female livers. Sequencing alignment was performed using A Plasmid Editor. A normal representative liver sample with no SNPs (HH1202) was used as a reference for comparison against two MASH-related samples (UMN1535 and UMN1259). A clear SNP is highlighted with a black box. The SNP leads to a substitution mutation from a hydrophobic alanine (A) at the 1151 position to a polar serine (S). No SNPs were observed in MASH-related samples of male patients.

Figure 4. Immunoblot analysis of ALPK2 and β-catenin in female (A,B) and male (C,D) liver tissue samples. ALPK2 and β-catenin protein band intensity results were normalized to GAPDH and quantitatively analyzed with ImageJ 1.53k. The ratio of target protein to GAPDH in individual normal groups was set as 1. Data represent the mean ± SEM. ** p < 0.01; ns, not significant; n = 3 samples/phenotype/sex.

Table 1. Statistics of somatic SNV and InDels in position.

Table 2. Female common uniquely significant annotated variants.

Table 3. Male common uniquely significant annotated variants.