Mutations in Epigenetic Regulation Genes in Gastric Cancer

Simple Summary: Epigenetic mechanisms, such as DNA methylation/demethylation, covalent modifications of histone proteins, and chromatin remodeling, create specific patterns of gene expression. Epigenetic deregulations are associated with oncogenesis, relapse of the disease and metastases, and can serve as a useful clinical marker. We assessed the clinical relevance of integrity of the genes coding for epigenetic regulator proteins by mutational profiling of 25 genes in 135 gastric cancer (GC) samples. Overall, mutations in the epigenetic regulation genes were found to be significantly associated with reduced overall survival of patients in the group with metastases and in the group with tumors with signet ring cells. We have also discovered mutual exclusivity of somatic mutations in the KMT2D, KMT2C, ARID1A, and CHD7 genes in our cohort. Our results suggest that mutations in epigenetic regulation genes may be valuable clinical markers and deserve further exploration in independent cohorts.

Abstract: We have performed mutational profiling of 25 genes involved in epigenetic processes on 135 gastric cancer (GC) samples. In total, we identified 79 somatic mutations in 49/135 (36%) samples. The minority (n = 8) of mutations was identified in DNA methylation/demethylation genes, while the majority (n = 41), in histone modifier genes, among which mutations were most commonly found in KMT2D and KMT2C. Somatic mutations in KMT2D, KMT2C, ARID1A and CHD7 were mutually exclusive (p = 0.038). Mutations in ARID1A were associated with distant metastases (p = 0.03). The overall survival of patients in the group with metastases and in the group with tumors with signet ring cells was significantly reduced in the presence of mutations in epigenetic regulation genes (p = 0.036 and p = 0.041, respectively). Separately, somatic mutations in chromatin remodeling genes correlate with low survival rate of patients without distant metastasis (p = 0.045) and in the presence of signet ring cells (p = 0.0014). Our results suggest that mutations in epigenetic regulation genes may be valuable clinical markers and deserve further exploration in independent cohorts.


Introduction
Gastric cancer (GC) is the 5th most common tumor in the world, and is the 3rd leading cause of cancer-related deaths worldwide. In 2018, more than 1,000,000 new GC patients were identified [1].
Recently, knowledge about the molecular mechanisms of gastric carcinogenesis has expanded rapidly. Using genome-wide approaches, The Cancer Genome Atlas (TCGA) Research Network divided GC into four molecular subtypes: Epstein-Barr virus-associated (EBV), microsatellite instability (MSI), genomically stable (GS), and chromosomal instability (CIN).

The study included 135 patients with locally advanced GC who were treated in the N.N. Burdenko Faculty Surgery Clinic, I.M. Sechenov First Moscow State Medical University, from 2007 to 2015. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Ethics Committee of I.M. Sechenov First Moscow State Medical University. Written informed consent was obtained from each participant in this study. All patients underwent surgical treatment, and resected tumor samples, as well as non-malignant gastric mucosa samples, were used in the study. GC was confirmed in all patients by morphological examination of the surgical material. For TNM staging, the ESMO Clinical Practice Guidelines for diagnosis, treatment, and follow-up of gastric cancer [6] were used. The distribution of patients in clinical groups is presented in Table 2.

Mutation Screening by NGS
Five to seven 10 µm paraffin sections were manually dissected to ensure that each sample contained at least 70% neoplastic cells. Genomic DNA was isolated from archived samples using a QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany), as recommended by the manufacturer.
Deep sequencing was performed using the Ion Torrent platform (ThermoFisher, Waltham, MA, USA) following an established protocol [7]. The protocol includes the preparation of libraries of genomic DNA fragments, clonal emulsion PCR, sequencing, and bioinformatic analysis of the results. DNA fragment libraries were prepared using Ion Ampliseq ultra-multiplex PCR technology.
A panel of epigenetic regulation genes with 1376 primer pairs was designed to amplify all coding regions, noncoding regions of the terminal exons, and putative splice site regions of 25 human genes: DNMT1, MBD1, TET1, DNMT3A, DNMT3B, EZH2, KDM6A, EP300, JARID1B, CREBBP, HDAC2, SIRT1, SMARCB1, SMARCA2, SMARCA4, ARID1A, ARID2, BRD7, PBRM1, CHD5, CHD7, CHD4, KMT2A, KMT2D and KMT2C. The panel was designed using the Ion Ampliseq Designer v. 7.03 (ThermoFisher, Waltham, MA, USA). The total length of human genome sequences covered by the panel was 250,900 bp. The panel reached 98.09% coverage by design; this applies to exons and 25 bp of flanking intron sequence. Details of the panel are shown in Tables S3 and S4. The selection of epigenetic regulation genes for the panel was based on the frequency of their somatic mutations in GC, as estimated from the COSMIC database and from the literature. Genes reported to be mutated in >3.5% of GC samples were included in the panel.
Multiplex PCR and subsequent stages of the fragment library preparation were performed using an Ion AmpliSeq Library Kit 2.0 (ThermoFisher, Waltham, MA, USA), according to the manufacturer's protocol. Aliquots from the prepared libraries were subjected to clonal amplification on microspheres in the emulsion on the Ion Chef Instrument (ThermoFisher, Waltham, MA, USA). Sequencing was performed on the Ion S5 genomic sequencer according to the manufacturer's protocol (ThermoFisher, Waltham, MA, USA) with the targeted sequencing depth of 1000×. The results were analyzed with Torrent Suite software consisting of Base Caller (the primary analysis of the sequencing results); Torrent Mapping Alignment Program-TMAP (alignment of the sequences to the reference genome GRCh37/hg19); and Torrent Variant Caller (analysis of variations in nucleotide sequences) with the cut-off for variant allele frequency set at 0.1, and minimum read depth of the variant allele set at 5. Genetic variants were annotated with ANNOVAR software [8]. Visual data analysis, manual filtering of sequencing artifacts, and sequence alignment were performed using the Integrative Genomics Viewer (IGV) [9].
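The two variant-calling thresholds stated above (variant allele frequency cut-off of 0.1 and a minimum of 5 reads supporting the variant allele) can be illustrated with a minimal sketch. This is not the Torrent Variant Caller implementation: the `VariantCall` class, the `passes_filters` function, the example read counts, and the assumption that both cut-offs are inclusive are all hypothetical and serve only to show how the two conditions combine.

```python
from dataclasses import dataclass

# Thresholds taken from the text; treating them as inclusive is an assumption.
MIN_VAF = 0.1        # minimum variant allele frequency
MIN_ALT_READS = 5    # minimum number of reads supporting the variant allele


@dataclass
class VariantCall:
    """Hypothetical container for one candidate variant."""
    gene: str
    total_depth: int  # all reads covering the position
    alt_depth: int    # reads supporting the variant allele

    @property
    def vaf(self) -> float:
        # Guard against division by zero for uncovered positions.
        return self.alt_depth / self.total_depth if self.total_depth else 0.0


def passes_filters(call: VariantCall) -> bool:
    """A call is kept only if it satisfies both thresholds."""
    return call.vaf >= MIN_VAF and call.alt_depth >= MIN_ALT_READS


# Made-up read counts, for illustration only.
calls = [
    VariantCall("KMT2D", total_depth=1000, alt_depth=150),  # VAF 0.15, 150 reads
    VariantCall("ARID1A", total_depth=1000, alt_depth=40),  # VAF 0.04, too low
    VariantCall("CHD7", total_depth=30, alt_depth=4),       # VAF 0.13, too few reads
]
kept = [c.gene for c in calls if passes_filters(c)]
print(kept)
```

Only the first call satisfies both conditions; the third shows why a read-depth floor is needed alongside the frequency cut-off at poorly covered positions.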

Sanger Sequencing
Sanger sequencing was performed in order to (1) validate mutations detected by NGS screening and (2) distinguish somatic from germline mutations. For the second purpose, DNA samples extracted from archived non-malignant gastric mucosa of the same patients were used. Direct sequencing of individual PCR products, amplified with primers flanking the specific mutation sites, was performed on an ABI PRISM 3500 automatic genetic analyzer (ThermoFisher, Waltham, MA, USA) according to the manufacturer's protocols.

Statistical Analysis
Samples were compared using Fisher's exact test; for comparisons of more than three groups, the chi-squared test was used. Overall survival (OS) probability was calculated by the Kaplan-Meier product-limit method from the date of surgery until death from any cause, and survival curves were compared using the Mantel-Haenszel (log-rank) test. A groupwise mutual exclusivity test was carried out using the DISCOVER (Discrete Independence Statistic Controlling for Observations with Varying Rates) method, which uses overall tumor-specific alteration rates to decide whether alterations co-occur more or less often than expected by chance; this prevents spurious associations in co-occurrence detection and increases the statistical power to detect mutual exclusivity [10]. All calculations were conducted using R version 3.6.3 [R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/, accessed on 7 August 2021].
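For readers unfamiliar with the product-limit method, the following is a minimal sketch of the Kaplan-Meier estimator in plain Python. It is not the R code used in the study; the function name `kaplan_meier` and the follow-up data are hypothetical. At each distinct time with at least one death, the running survival estimate is multiplied by (1 - deaths / number at risk); censored patients leave the risk set without changing the estimate.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Made-up follow-up data (months), for illustration only.
months = [6, 6, 12, 18, 24, 30]
died = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(months, died)
for t, s in curve:
    print(t, round(s, 4))
```

With these six hypothetical patients the estimate drops at months 6, 12, and 24 only, because the other follow-up times are censored observations.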

Pathogenicity Prediction for Novel Mutations
To predict the pathogenicity of identified novel missense variants, a combination of PolyPhen2, PROVEAN, SIFT, and MutPred2 tools was used. I-Mutant 3.0 software was used to calculate the stability of the mutant protein. Loss of protein function effects were assessed with MutPred-LOF software. The effect of nonsynonymous substitutions on the structure was illustrated using the Project HOPE3D portal.

The Spectrum of Detected Somatic Mutations
Using a targeted NGS panel for 25 epigenetic regulation genes, we performed mutational profiling of 135 tumor samples obtained from patients with GC. Our panel included the DNMT1, MBD1, TET1, DNMT3A, and DNMT3B genes, which control DNA methylation/demethylation; the EZH2, UTX (KDM6A), EP300, JARID1B, CREBBP, HDAC2, SIRT1, KMT2A, KMT2D, and KMT2C genes, which encode histone modifiers; and the SMARCB1, SMARCA2, SMARCA4, ARID1A, ARID2, BRD7, PBRM1, CHD5, CHD7, and CHD4 genes, which are responsible for chromatin remodeling. Mapped data depth and coverage for each sample are presented in Table S5. For the analysis, we selected missense substitutions that were not annotated in the ClinVar, COSMIC, or dbSNP databases and/or had a population frequency of MAF < 0.0005, as well as nonsense and frameshift mutations. A total of 79 different mutations found in our cohort fulfilled the selection criteria. The variant allele frequency, total read depth, reference and variant allele read depths, and other metrics for each of these mutations are presented in Table S1. No mutations meeting these criteria were found in the DNMT1, DNMT3A, EZH2, UTX, SMARCB1, or SIRT1 genes. The identified mutations and their characteristics are presented in Table 1. In total, these 79 somatic mutations were found in 49/135 (36%) samples, and no qualifying mutations were found in the remaining samples. Among the identified variants, 29/79 were not annotated in dbSNP, 32/79 were not listed in gnomAD Exomes, and 68/79 were not listed in ClinVar.
The largest numbers of mutations were found in histone modifier genes (41) and chromatin remodeling genes (37); the smallest number was in DNA methylation/demethylation genes (8). To account for variation in gene size, we normalized the mutation numbers in these three groups. Of the genes under study, histone modifier genes collectively contained 24,207 codons; chromatin remodeling genes, 16,891 codons; and DNA methylation/demethylation genes, 6188 codons. Thus, the frequencies of mutations in these three groups were 0.0017, 0.0022, and 0.0013 per codon, respectively. These figures suggest a somewhat lower somatic mutation burden in the DNA methylation/demethylation genes, although the differences were not statistically significant. The distribution of variants across the epigenetic regulation genes in our patient samples was as follows: KMT2D, 16; ARID1A, 9; KMT2A and KMT2C, 8 each; CHD7, 7; CHD5, 5; CHD4 and CREBBP, 4 each; ARID2, SMARCA2, SMARCA4, DNMT3B, and TET1, 3 each; HDAC2, EP300, BRD7, and MBD1, 2 each; and PBRM1 and JARID1B, 1 each. In 23/49 samples, more than one mutation in different genes was found, but mutations in KMT2D, KMT2C, ARID1A, and CHD7 rarely co-occurred in the same sample (p = 0.038).
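The per-codon normalization described above is a simple ratio of mutation count to total codon count in each gene group. The short sketch below reproduces that arithmetic from the figures given in the text; the variable names are illustrative only.

```python
# (number of mutations, total codons) for each gene group, as given in the text
groups = {
    "histone modifiers": (41, 24207),
    "chromatin remodeling": (37, 16891),
    "DNA methylation/demethylation": (8, 6188),
}

# Mutations per codon, rounded to four decimal places as in the text.
rates = {
    name: round(mutations / codons, 4)
    for name, (mutations, codons) in groups.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.4f} mutations per codon")
```

Rounded to four decimal places, the three ratios are 0.0017, 0.0022, and 0.0013, matching the values reported above.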

Pathogenicity Analysis of the Detected Mutations by Prediction Programs
For all novel mutations that fulfilled the selection criteria, pathogenicity was analyzed using prediction programs. In silico analysis showed that 15/63 somatic alterations were pathogenic according to more than two prediction tools. PolyPhen2-HumDiv predicted 26 of the 63 alterations as 'Probably damaging' and another 15 as 'Possibly damaging', whereas PolyPhen2-HumVar predicted 17 alterations as 'Probably damaging', 11 as 'Possibly damaging', and the other 35 as 'Benign'. It should be noted, however, that PolyPhen2-HumVar is more effective at predicting the pathogenicity of mutations in Mendelian disorders. PROVEAN predicted 26/63 somatic alterations as 'Deleterious', and SIFT indicated 42/63 variants as 'Damaging'.
MutPred2 and MutPred-LOF are machine learning approaches that incorporate genetic and molecular information to predict whether an alteration is pathogenic. For MutPred2, we used a threshold of 0.68 for pathogenicity, as recommended by the developers, because it yields a false positive rate of 10%. With this threshold, 11/63 somatic missense variants were predicted as pathogenic. In addition, 10/16 nonsense and frameshift variants were predicted as pathogenic by MutPred-LOF with a cut-off value of 0.50 (as recommended for MutPred-LOF).
I-Mutant 3.0 predicts protein stability changes based on a protein sequence or protein structure by using a support vector machine training algorithm. I-Mutant 3.0 predicted a decrease in protein structure stability for 44 somatic alterations and an increase for the other 19 (Table S2).
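The "more than two prediction tools" criterion used in this section can be sketched as a simple vote count. The sketch below is an assumption-laden illustration, not the authors' procedure: which tools contribute a vote, whether 'Possibly damaging' counts as a damaging call, and whether the MutPred2 threshold of 0.68 is inclusive are all choices made here for the example, and the function names and input values are hypothetical.

```python
# Labels treated as "damaging" for each categorical tool (an assumption).
DAMAGING_LABELS = {
    "PolyPhen2-HumDiv": {"Probably damaging", "Possibly damaging"},
    "PROVEAN": {"Deleterious"},
    "SIFT": {"Damaging"},
}
MUTPRED2_THRESHOLD = 0.68  # threshold quoted in the text


def damaging_votes(labels, mutpred2_score):
    """Count how many tools call the variant damaging."""
    votes = sum(
        1
        for tool, label in labels.items()
        if label in DAMAGING_LABELS.get(tool, set())
    )
    if mutpred2_score >= MUTPRED2_THRESHOLD:
        votes += 1
    return votes


def is_consensus_pathogenic(labels, mutpred2_score):
    """Pathogenic by consensus if more than two tools agree."""
    return damaging_votes(labels, mutpred2_score) > 2


# Hypothetical predictions for two variants, for illustration only.
variant_a = {"PolyPhen2-HumDiv": "Probably damaging",
             "PROVEAN": "Deleterious",
             "SIFT": "Damaging"}
variant_b = {"PolyPhen2-HumDiv": "Benign",
             "PROVEAN": "Neutral",
             "SIFT": "Damaging"}

print(is_consensus_pathogenic(variant_a, 0.75))
print(is_consensus_pathogenic(variant_b, 0.40))
```

A voting scheme of this kind is deliberately conservative: a single tool such as SIFT, which flagged 42/63 variants here, cannot label a variant pathogenic on its own.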

Analysis of Clinical Significance of Mutations in Epigenetic Regulation Genes
The distribution of mutations in our patient cohort aligned to clinical features is shown in Figure 1. We found no associations of overall somatic mutation status (absence of mutations vs. presence of at least one mutation) of epigenetic regulation genes with gender, age, tumor size, lymph node metastases, stage, anatomical localization, Lauren type, distant metastases, or presence of signet ring cells (Table 2). As for individual genes, only mutations in ARID1A were associated with distant metastases (p = 0.03). In the Kaplan-Meier survival analysis, the overall survival of patients in the group with metastases and in the group with signet ring cell tumors was significantly reduced in the presence of mutations in the epigenetic regulation genes (p = 0.036 and p = 0.042, respectively) compared with patients without mutations (Figure 2).
For the group of histone-modifying genes, no significant clinical correlations were found. The group with mutations in the DNA methylation/demethylation genes included only 8 patients and was too small to perform statistical analysis.

Discussion
Somatic mutations in epigenetic regulation genes are not very common in GC and were detected in only 36% of samples (49/135) in our study. Mutations were rarest in genes regulating DNA methylation/demethylation. We did not find any somatic mutations in DNMT1 or DNMT3A. In addition, the group of patients with mutations in the DNA methylation-related genes (MBD1, TET1, DNMT3B) was the smallest, with only 8 of 135 patients. Such a low frequency may reflect the cancer type being investigated. Chai-Jin Lee et al. demonstrated that the frequencies of somatic mutations in genes associated with DNA methylation and demethylation (DNMT1, DNMT3A, MBD1, MBD4, TET1, TET2 and TET3) varied significantly between cancer types. For example, in myeloid leukemia samples, the frequency of DNMT1 and DNMT3A mutations was high, whereas in glioblastoma, renal cell carcinoma, and colon carcinoma, the total mutation rate was less than 9% [11]. The low frequency of mutations in the DNA methylation drivers in solid tumors is consistent with our results. Many studies have demonstrated DNA methylation as a clinical marker of carcinogenesis; however, the role of somatic mutations in genes regulating methylation/demethylation in solid tumors has not yet been sufficiently investigated. Moreover, although we did not find any DNMT3A mutations in our samples, they have been identified in other solid tumors. In 1.2% of papillary thyroid carcinoma cases, mutations and/or loss of DNMT3A expression were associated with an aggressive clinical course and poor outcome [12].

In our work, the largest number of mutations was detected in histone modification genes (52%, 41/79), with 16 mutations in KMT2D, 8 in KMT2C, and 8 in KMT2A. The proteins encoded by these KMT2 (histone-lysine N-methyltransferase subclass 2) genes are components of a COMPASS-like complex that performs mono-, di-, and trimethylation of lysine 4 in histone H3 (H3K4) and is associated with transcription activation, facilitating access of transcription factors to the promoter and enhancer regions of genes [13]. The functions of COMPASS complexes are vitally important for the normal development of an organism, and mutations in genes encoding their protein components are associated with carcinogenesis [14]. KMT2C and KMT2D proteins restrain cell proliferation and could be considered tumor suppressors [15]. In addition to lysine methylation associated with transcription activation, the methyltransferases KMT2C and KMT2D play an important role in the maintenance of genomic stability and DNA repair [16]. Moreover, these proteins, together with PTIP (PAX transactivation-domain interacting protein), a subunit of the KMT2C/KMT2D complexes, were found to destabilize replication forks and promote their MRE11-dependent degradation in BRCA-deficient cells [17].
The KMT2D and KMT2C genes are among the most frequently mutated in cancers, which is also confirmed by our study. Mutations were detected in various types of solid tumors, such as melanoma, urothelial carcinoma, lung cancer, as well as in esophageal and stomach cancers [18].
In our study, KMT2D mutations had the highest frequency (12%) and were distributed throughout the gene (Figure 4). Mutations of the KMT2D gene are mainly localized in the central part of the coding sequence, which corresponds to the protein region between the PHD-finger domain and the SET domain. This is in concordance with the data obtained by other authors [19]. According to the analysis by pathogenicity prediction programs, one of the novel somatic missense mutations that we identified in the KMT2D gene, p.R3727C, was classified as pathogenic by almost all prediction tools. This substitution disrupts the leucine zipper motif, which is necessary for protein-protein interactions or dimerization [20]. Disruption of the leucine zipper motif seriously alters protein function, which leads to deregulation of protein interactions and blocking of transcription. Directed alterations of the leucine zipper motif are currently engineered in synthetic proteins that are used as antitumor drugs [21].
The pathogenicity analysis of the unannotated mutations that we identified in KMT2C revealed three mutations (p.R973G, p.M959I, p.C1953Y) that were pathogenic according to three or more prediction tools. The first two are located in the PHD-finger domain of the protein, and p.C1953Y is located in a disordered domain. Disordered domains are characterized by high instability, and substitutions in this region can change the protein conformation. Recent studies have demonstrated that around 20% of mutations in cancers are located in these regions, causing abnormalities of protein conformation and function [22]. Mutations in KMT2C in diffuse GC are associated with epithelial-mesenchymal transition (EMT) and acquisition of the mesenchymal phenotype by cells and are also markers of a poor prognosis [23]. The distribution of mutations along the KMT2C gene is shown in Figure 5.
In our study, mutations in KMT2D and KMT2C rarely co-occurred in the same sample (p = 0.038).
There is a hypothesis that mutually exclusive genomic events are functionally related through common biological pathways, and that mutually exclusive genes act on the same downstream effectors, thereby demonstrating functional redundancy. Therefore, the aberration of one of these genes is enough to completely disrupt their common pathways [24]. KMT2D and KMT2C are components of similar COMPASS complexes that perform the same function. Deregulation of either KMT2C or KMT2D alone can serve as a driver mutation at the early stages of carcinogenesis, leading to changes in the epigenomic landscape. As was demonstrated for bladder cancer, tumor cells with low KMT2C activity are deficient in DNA repair mediated by homologous recombination and suffer from endogenous DNA damage and genomic instability, and their treatment with the PARP1/2 inhibitor olaparib leads to synthetic lethality [16]. The high frequency of KMT2D and KMT2C mutations in GC and their association with repair processes suggest that they could be considered as targets for tumor treatment with PARP inhibitors, which cause the lethality of tumor cells.
We compared our result on the mutual exclusivity of KMT2D and KMT2C mutations with other GC mutation databases. Three datasets were acquired using cBioPortal (http://cbioportal.org accessed on 7 August 2021): Gastric Cancer (OncoSG, 2018), Stomach Adenocarcinoma (Pfizer and UHK), and TCGA PanCancer Stomach Adenocarcinoma (STAD). Visual analysis suggested that KMT2D and KMT2C mutations in these datasets were not mutually exclusive (Figure 6). For statistical analysis, we retained only sequenced samples with mutation data (without copy-number alterations) in all three studies. For the groupwise mutual exclusivity test, the p-values were as follows: 0.088 for OncoSG, 0.016 for TCGA STAD, and 0.5 for the Pfizer study. Using the wFisher p-value combination method [25], weighted by the sample size of each study, we obtained a combined p-value for the mutual exclusivity test below the nominal significance level of 0.05 (Figure 6a). Another interesting observation was that, when only missense mutations were considered, mutations in KMT2D and KMT2C appeared visually to be almost mutually exclusive in these three datasets, as they were in our study (Figure 6b), although the calculated differences did not reach the significance level of 0.05, which we attributed to the sample sizes. We therefore turned to studies with larger sample sizes, though of another cancer localization, namely Breast Cancer METABRIC, Nature 2012 and Nat Commun 2016 (2509 samples), and Breast Cancer MSK, Cancer Cell 2018 (1918 samples); in these datasets, we observed clear mutual exclusivity of somatic mutations in KMT2D, KMT2C, and ARID1A. Although this may be a cancer type-specific observation, we cannot rule out a sample size effect and/or peculiarities of mutation detection and interpretation in different studies.
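To give a sense of how p-values from several datasets are combined, the sketch below applies the classical unweighted Fisher method to the three p-values quoted above. This is a deliberate simplification and not the wFisher procedure used in the study: wFisher additionally weights each study by its sample size, so its result will differ. The function name `fisher_combined_p` is illustrative. Fisher's statistic X = -2 Σ ln p follows a chi-squared distribution with 2k degrees of freedom under the null hypothesis, and for even degrees of freedom the upper tail has a closed form, so no external library is needed.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with the unweighted Fisher method.

    The statistic X = -2 * sum(ln p) is chi-squared with 2k degrees of
    freedom; for even degrees of freedom the upper tail probability is
    exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    tail_sum = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * tail_sum


# Groupwise mutual exclusivity p-values quoted in the text:
# OncoSG, TCGA STAD, Pfizer and UHK.
combined = fisher_combined_p([0.088, 0.016, 0.5])
print(round(combined, 3))
```

Even without sample-size weights, the combined value from this sketch is about 0.024, which is below 0.05; a single input p-value is returned unchanged, which is a quick sanity check on the formula.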
The ARID1A and CHD7 genes, which are related to chromatin remodeling, were often mutated in our patient samples. ARID1A is often mutated in esophageal and gastric cancers and is a canonical cancer gene according to the COSMIC Cancer Gene Census [26]. The proteins encoded by ARID1A, SMARCA1, SMARCA2, and SMARCA4 are subunits of the conserved multisubunit SWI/SNF complex, which uses the energy of ATP hydrolysis to mobilize nucleosomes and remodel chromatin. The expression of these genes is often deregulated in esophageal and gastric cancers [27].
ARID1A substitutions that we identified in gastric tumors, not annotated in human mutation databases, namely p.R2236C, p.Q1415H, and p.P1710L, are of interest since they more active response to immunotherapy and a better prognosis of survival for patients with mutations in ARID1A compared to tumors with wild-type ARID1A. ARID1A mutations can serve as a biomarker for the identification of patients with gastrointestinal cancer who are sensitive to immunotherapy [33]. Clinically, the loss of ARID1A expression was correlated with larger tumor size, deeper invasion, lymph node metastasis, and a poor prognosis [34]. In line with these observations, in our study, mutations in ARID1A were associated with distant metastases (p = 0.03).

Conclusions
By profiling somatic mutations of epigenetic regulation genes in GC, we have revealed associations of such mutations in tumors with decreased patient survival and with the risk of developing distant metastasis, making the presence of mutations a marker of a poor prognosis. Studying mutations in epigenetic regulation genes can also contribute to the development of new approaches to drug therapy for GC, such as adding PARP inhibitors for the treatment of tumors with mutations in genes of the KMT2 family and immunotherapy for the treatment of tumors with ARID1A mutations. According to our results, this may be a significant group of patients, as the total frequency of mutations in the chromatin remodeling and histone modifier genes in our sample was approximately 25% of all patients with mutations in epigenetic regulation genes.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/cancers13184586/s1, Table S1: variant allele frequency, total read depth, reference, and variant allele read depths for each mutation, Table S2: in silico analysis of somatic mutations found in gastric cancer samples, Tables S3 and S4: designed coverage of the NGS panel, Table S5: mapped data depth and coverage for each sample.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available in the article and supplementary files.