Overview of Gene Expression Analysis in Gastric Disease Infected with Helicobacter pylori: CLDN1 and MMP9 Could Be Biomarkers for Early Diagnosis of Gastric Cancer

Chronic Helicobacter pylori infection produces several lesions in the human stomach, which can progress to chronic atrophic gastritis and gastric cancer. To date, there is very little information on gene expression in chronic atrophic gastritis and its relationship with progression to gastric cancer. In this study, we performed a gene expression analysis during chronic atrophic gastritis in order to identify possible biomarkers that allow an early diagnosis of gastric cancer. We studied biopsies from patients with chronic atrophic gastritis and gastric cancer. The biopsies were analyzed by gene expression microarray, and the results were corroborated by qPCR and validated through immunohistochemistry. Our results revealed that gene expression profiles in patients with chronic atrophic gastritis showed molecular changes of the gastric mucosa, leading to gastric cancer. The expression changes of CLDN1, CLDN7, OLFM4, C-MYC and MMP9 were the most notable, starting from chronic atrophic gastritis. The gene expression patterns observed in this study allowed the identification of CLDN1 and MMP9 proteins as promising biomarkers of early stages of gastric cancer development.


Introduction
Gastric cancer is widespread because it is difficult to diagnose and treat. In 2018, it was the fifth most common neoplasm in the world, causing 782,685 deaths [1], and it is the tenth most common cause of death in Mexico [2]. Even when diagnosed early, the only treatment for gastric cancer is gastric resection. Currently, two types of gastric cancer have been identified: the diffuse or poorly differentiated type, and the intestinal type [3,4]. The principal risk factor for the progression of gastric cancer is chronic infection by Helicobacter pylori [5], which affects over 50% of the population worldwide. It is established as a lifelong infection in the human stomach after its acquisition during childhood and produces chronic inflammation of the gastric mucosa [6]. Between 2% and 5% of those infected develop chronic atrophic gastritis, characterized by the loss of acid secretion and increased levels of pepsinogen I and gastrin [6,7].
Intestinal-type gastric cancer progresses through a series of sequential histological events described by Correa in 1992. The normal mucosa progresses to chronic gastritis, followed by atrophic gastritis, intestinal metaplasia, dysplasia, and ultimately gastric cancer [5-7]. Gene expression in the early stages of gastric cancer has been studied in response to infection with H. pylori, although most of the research has been conducted with in vitro models of epithelial cell culture, animal models, or both. In a study of human gastric tissue in which the presence or absence of H. pylori infection was not considered, Hippo et al. evaluated the expression of 6800 genes from 30 samples of gastric cancer and adjacent tissue using high-density oligonucleotide microarrays. These authors found alterations in the expression of genes related to DNA damage and repair mechanisms, regulation of the cell cycle, activation of oncogenes, and inactivation of the tumor suppressor genes implicated in gastric carcinogenesis [8]. Additionally, Kim et al. investigated the early stages of gastric cancer by examining the expression of 25,100 genes using DNA microarrays in 24 tissue samples from chronic atrophic gastritis, intestinal metaplasia, and normal mucosa, without considering the presence or absence of H. pylori infection. After comparing the characteristic expression profiles of chronic atrophic gastritis with those of intestinal metaplasia, the authors conducted a bioinformatics analysis to compare the profiles with the expression of genes involved in gastric carcinogenesis [9]. The gene expression patterns found in these studies provided new comparative information on chronic atrophic gastritis and intestinal metaplasia, which may play an important role in the development of gastric cancer. To gain a molecular understanding of H. pylori infection in gastric carcinogenesis, we analyzed gene expression profiles in chronic atrophic gastritis and gastric cancer in subjects infected with this bacterium. The gene expression patterns observed in this study allowed the identification of CLDN1 and MMP9 proteins as promising biomarkers of early stages of gastric cancer development.

Materials and Methods
To accomplish the study objectives, two groups of samples were used, including gastric biopsies for the microarray analysis and qPCR. To validate the results obtained, we used tissues embedded in paraffin for immunohistochemistry analysis.

Ethical Approval and Consent to Participate
This study was approved by the Investigation and Ethics Committees of the School of Medicine of the UNAM and of the National Institute of Medical Sciences and Nutrition Salvador Zubirán (INCMNSZ) (registry numbers 019-2009 and 209, respectively). All participants gave their written informed consent prior to sample collection, in accordance with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.

Gastric Biopsy Samples
The gastric biopsies were collected from patients with follicular gastritis, chronic atrophic gastritis, and gastric cancer. After obtaining consent, subjects with gastric complaints who were scheduled for an exploratory endoscopy to determine the source of their symptoms were recruited. After the endoscopic procedure, body and antrum samples taken for diagnosis were submitted to the Pathology Department, where an expert pathologist performed the histological examination according to the Sydney classification and confirmed the presence of H. pylori. Gastric cancer was classified according to its histological type by the Lauren and macroscopic classifications, as well as frequency by growth form (tumor type) [10,11]. For the study, patients with an H. pylori infection and a diagnosis of follicular gastritis, chronic atrophic gastritis, or gastric cancer were selected. All the biopsies from the gastric lesions were collected and stored until use in RNAlater (Ambion, Austin, TX, USA) at −70 °C for nucleic acid preservation.

RNA Isolation and Expression Microarray Analysis
The biopsies selected from the exploratory set were removed from RNAlater and put into a lysis solution (RNAqueous Kit; Ambion, Austin, TX, USA). Each gastric biopsy was homogenized with a tissue homogenizer (Cole-Parmer, Vernon Hills, IL, USA) until its complete lysis. Total RNA was isolated from the lysate using the RNAqueous commercial kit (Ambion, Life Technologies, Carlsbad, CA, USA) according to the manufacturer's instructions. The RNA quality was measured with a microvolume spectrophotometer (ND-1000; NanoDrop Technologies, Wilmington, DE, USA). RNA integrity was assessed with 28S:18S ribosomal RNA (rRNA) ratios to calculate the RNA Integrity Number (RIN) using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). The microarray assays were carried out using the Human Gene 1.0 ST Array (Affymetrix, Santa Clara, CA, USA) according to the manufacturer's instructions.

Microarray Assays
For microarray experiments, target complementary DNA (cDNA) from each biopsy was prepared according to the WT Expression Sense Target Kit (Ambion, Austin, TX, USA). Briefly, one µg of RNA was converted into first-strand cDNA. Next, second-strand cDNA synthesis was performed, followed by in vitro transcription to generate cRNA. The cRNA products were used as templates for a second-cycle cDNA synthesis in which dUTP was incorporated into the new strand. Purified sense-strand cDNA (with incorporated dUTP) was fragmented and labeled using the Affymetrix GeneChip WT Terminal Labeling Kit (Affymetrix, Santa Clara, CA, USA). The cDNA was fragmented using Uracil-DNA Glycosylase (UDG) and Apurinic/Apyrimidinic Endonuclease 1 (APE1). The fragments (40-70 mers) were then labeled by a terminal addition reaction of biotin-labeled deoxynucleotides using terminal deoxynucleotidyl transferase (TdT). Finally, using the GeneChip Hybridization, Wash, and Stain Kit, each fragmented and labeled cDNA sample was hybridized onto an Affymetrix Human Gene 1.0 ST array (Affymetrix, Santa Clara, CA, USA). After hybridization, the 13 arrays were washed, stained for biotinylated cDNA, and scanned per the manufacturer's recommendations to obtain CEL files for each microarray.

Microarray Analysis and Selection of Differentially Expressed Genes
Samples were classified into three main groups: (1) samples from patients with Follicular Gastritis [Control group (Ctrl)], (2) samples from patients with Chronic Atrophic Gastritis (CAG), and (3) samples from patients with Gastric Cancer (GC). All possible pairwise comparisons among the three groups generated three comparisons of interest: CAG vs. Ctrl, GC vs. Ctrl, and GC vs. CAG. We performed a low-level data analysis in which the raw microarray data were background-corrected using the robust multi-array average (RMA) method [12] and normalized using the quantile normalization approach [13,14], both executed in R v. 3.1.3 (http://www.cran.r-project.org, accessed 18 March 2018). Differential expression was determined by fitting a linear model on each gene using the Limma package with an empirical Bayesian approach [15,16]. Correction for multiple hypotheses was applied by controlling the false discovery rate (FDR). Genes were selected as differentially expressed based on |logFCh| ≥ 0.85 and statistical significance, considering a B-statistic ≥ 1 with associated FDR-adjusted p-values ≤ 0.01. The expression matrix created with this procedure was employed for the selection of differentially expressed genes, the enrichment analyses, and the evaluation of the functional properties of the genes.
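The selection rule described above can be illustrated with a minimal Python sketch. It is not the R/Limma pipeline used in the study: the names `GeneStat`, `bh_adjust` and `select_degs` are hypothetical, the gene labels in the usage lines are placeholders, and the Benjamini-Hochberg procedure is an assumption, since the text states only that the FDR was controlled without naming the method.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    """Per-gene summary for one contrast (hypothetical container)."""
    gene: str
    log_fc: float   # log fold change for the contrast
    b_stat: float   # B-statistic (log-odds of differential expression)
    adj_p: float    # FDR-adjusted p-value


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(stats, min_abs_log_fc=0.85, min_b=1.0, max_adj_p=0.01):
    """Split genes passing all three thresholds into over- and subexpressed."""
    up, down = [], []
    for s in stats:
        passes = (abs(s.log_fc) >= min_abs_log_fc
                  and s.b_stat >= min_b
                  and s.adj_p <= max_adj_p)
        if passes:
            (up if s.log_fc > 0 else down).append(s.gene)
    return up, down


# Usage with placeholder values (not data from the study).
example = [
    GeneStat("GENE_A", 1.2, 3.0, 0.001),
    GeneStat("GENE_B", -0.9, 1.5, 0.01),
    GeneStat("GENE_C", 0.5, 4.0, 0.001),
]
overexpressed, subexpressed = select_degs(example)
```

A gene is retained only when all three criteria hold at once, so a large fold change with a weak B-statistic is excluded.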

Enriched Gene Ontology Terms
Three tools were utilized to evaluate the functional properties and pathway analysis of the differentially expressed genes. We used gene annotation enrichment analysis within the set of significant genes, employing the Database for Annotation, Visualization and Integrated Discovery (DAVID) bioinformatics tool v6.7 (http://david.abcc.ncefcrf.gov/, accessed 7 October 2018) of the National Institute of Allergy and Infectious Diseases (NIAID), NIH [17,18]. Enrichment analysis was also conducted using Ingenuity Pathway Analysis (IPA) [19,20].

Amplification of Gene Expression by qPCR
Total RNA from each sample of the exploratory set was reverse transcribed using a SuperScript III Reverse Transcriptase Kit (Invitrogen, Carlsbad, CA, USA). The resulting complementary DNA (cDNA) was used in a real-time polymerase chain reaction (qPCR) to validate the genes of interest. We employed TaqMan probes and primers for claudin-1 (CLDN1), claudin-7 (CLDN7), matrix metalloproteinase-9 (MMP9), MYC (C-MYC), and olfactomedin-4 (OLFM4) from Applied Biosystems (Carlsbad, CA, USA) (assay IDs: Hs00221623_m1, Hs00600772_m1, Hs00234579_m1, Hs00905030_m1, and Hs00197437_m1, respectively). The expression assays for each gene were conducted in triplicate on 96-well optical plates employing the ABI Prism 7000 sequence detection system and ABI Prism® 7000 SDS 1.2.3 with RQ (Applied Biosystems). The typical protocol utilized was the following: initial denaturing for 2 min at 95 °C, 40 cycles at 95 °C for 10 min, 95 °C for 15 s, and 60 °C for 1 min, and then a final extension at 72 °C for 7 min. The actin beta gene (ACTB, assay ID: Hs99999903_m1) was used as an internal control to account for sample-to-sample variability. Each gene of interest and the housekeeping gene were tested in triplicate. Relative gene expression was determined from the obtained cycle threshold (Ct) values using the 2^(−ΔΔCt) method.
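As an illustration of the comparative Ct calculation, the following Python sketch computes relative expression from triplicate Ct values. The function name `relative_expression` and the Ct values in the usage lines are hypothetical; averaging the triplicates before subtraction is one common convention and is assumed here.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Each argument is a list of replicate Ct values. The reference gene
    (ACTB in the text) normalizes each sample; the control group serves
    as the calibrator.
    """
    # dCt: target Ct minus reference Ct within each group.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: difference between the sample and the calibrator.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Usage with placeholder Ct values (not data from the study).
fold = relative_expression(
    ct_target_sample=[24.0, 24.0, 24.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[26.0, 26.0, 26.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
```

With these placeholder values the target crosses the threshold two cycles earlier in the sample than in the calibrator after normalization, which corresponds to a four-fold relative increase.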

Validation of Gene Assays
To validate the results obtained in the gene expression microarray and qPCR assays, the archives of the Pathology Department of INCMNSZ were reviewed to obtain pathological records and formalin-fixed paraffin-embedded (FFPE) gastric biopsy samples from the past 10 years. We studied a total of 82 gastric tissues: 18 gastric biopsies without histological alterations, 20 of chronic follicular gastritis, 20 of chronic atrophic gastritis and 24 of gastric cancer. In addition, liver explant and right colon tissues were included as positive controls. A gastric biopsy from each study group was used as a negative control, in which the primary antibody was omitted. Glass slides stained with hematoxylin and eosin were checked for histological types according to the Lauren classification [3]. Tumor samples were also obtained from patients undergoing gastric surgical resection. All samples were verified by a second pathologist. Serial 3-µm slices were taken from each FFPE block of gastric tissue to perform an immunohistochemical analysis of CLDN1 and MMP9.

Immunohistochemistry of CLDN1 and MMP9
A descriptive analysis of the claudins and metalloproteinases in each study group was conducted. A positive signal indicating the expression of CLDN1 and MMP9 outside and inside the cells (in the plasma membrane in the case of CLDN1, and in the extracellular matrix or in the nucleus in the case of MMP9) was analyzed, and the distribution of the signal along the gastric tissue was evaluated. In brief, the FFPE samples from the validation set were dewaxed and rehydrated. Anti-CLDN1 antibody (DBS, Pleasanton, CA, USA) diluted at 1:100 was used for CLDN1 detection. Anti-MMP9 [MMP9/2025R] (Abcam, Cambridge, UK) diluted at 1:500 was used for MMP9 detection. The Mouse/Rabbit ImmunoDetector HRP/DAB Detection System (Bio SB, Santa Barbara, CA, USA) was employed to conduct the immunohistochemistry (IHC). Tissue sections were briefly counterstained with hematoxylin and observed under the Olympus BX61VS microscope; images were acquired with Olympus OlyVIA 2.9 (Build 13771) software (Tokyo, Japan, Model BX-UCB). The negative control for the IHC was established by omitting the primary antibody against CLDN1 and MMP9.

Statistical Analysis
Clinical characteristics of patients were presented as mean ± standard error (SE) using statistical software (GraphPad Prism version 5.00 for Windows, GraphPad Software, San Diego, CA, USA). Statistical significance was determined by p-value (p ≤ 0.05). To confirm whether the genes were present in all gastric pathologies that proceed to gastric cancer, we used the Kruskal-Wallis test.
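For readers unfamiliar with the Kruskal-Wallis test, the sketch below shows how the H statistic can be computed for three groups using only the Python standard library. The function name `kruskal_wallis_three_groups` and the input values are hypothetical. The p-value uses the chi-square approximation with two degrees of freedom, which has the closed form exp(−H/2); this approximation is rough for very small groups, and the function is not defined when every observation is identical.

```python
import math


def kruskal_wallis_three_groups(a, b, c):
    """Kruskal-Wallis H with tie correction for exactly three groups.

    Returns (H, p), where p comes from the chi-square approximation
    with 2 degrees of freedom: p = exp(-H / 2).
    """
    groups = [a, b, c]
    values = sorted(v for g in groups for v in g)
    n = len(values)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[values[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)           # tie correction
    p = math.exp(-h / 2)
    return h, p


# Usage with placeholder expression values (not data from the study).
h_stat, p_value = kruskal_wallis_three_groups(
    [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]
)
significant = p_value <= 0.05
```

The final comparison mirrors the significance threshold (p ≤ 0.05) stated above.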

Clinical Characteristics of Enrolled Patients
A total of 13 gastric biopsies were histopathologically diagnosed. There were seven control patients with follicular gastritis (aged 45 ± 3.6 years), three with chronic atrophic gastritis (aged 67 ± 1.76 years) and three with gastric cancer (aged 59 ± 10.07 years). The mean age of patients was 48 ± 3.6 years for all groups. Table 1 presents the clinical features of the patients in this study.

Differential Expression of Genes between the Study Groups
Differential gene expression between the study groups was assessed with the microarray assays and visualized with volcano plots. In chronic atrophic gastritis compared with the control group (CAG vs. Ctrl), there were 42 subexpressed genes and 187 overexpressed genes (Supplementary Figure S1A). In gastric cancer compared with the control group (GC vs. Ctrl), there were 231 subexpressed genes and 310 overexpressed genes (Supplementary Figure S1B). Finally, we observed 39 subexpressed genes and 8 overexpressed genes in gastric cancer in comparison with chronic atrophic gastritis (GC vs. CAG) (Supplementary Figure S1C). Supplementary Table S1 lists the differentially expressed genes from the comparisons of the different study groups (CAG vs. Ctrl, GC vs. Ctrl, and GC vs. CAG).

Differential Expression of Genes between Chronic Atrophic Gastritis and Gastric Cancer
When we compared the overexpressed genes in CAG and GC, we found that 51 of these genes were overexpressed in both comparisons. Similarly, when we compared the subexpressed genes in both pathologies (CAG vs. GC), we found that 21 of these genes were shared by both comparisons (Supplementary Figure S2A,B). When comparing the gene list of CAG vs. Ctrl with the gene list of GC vs. CAG, we found five genes that were overexpressed in CAG and subexpressed in GC (TM4SF5, CIDEC, REB3B, APOBEC1 and GP2) (Supplementary Figure S2C). Interestingly, when comparing the genes overexpressed in GC with the genes overexpressed in the contrast GC vs. CAG, we found four genes in common. When comparing the subexpressed genes in GC vs. Ctrl with the subexpressed genes in GC vs. CAG, there were 21 genes shared by both comparisons (Supplementary Figure S2D,E). The heat maps depict a clear separation between the gene expression of the Ctrl group and that of the CAG and GC groups. Additionally, the expression profiles of GC vs. CAG clearly demonstrate different gene expression between the two clinical entities (Supplementary Figure S3).

Differential Expression of Genes between Chronic Atrophic Gastritis and the Control
The comparison of CAG vs. Ctrl identified a total of 229 genes with a significant difference in transcript expression (187 overexpressed and 42 subexpressed), which are likely involved in the progression to gastric cancer. In the Gene Set Enrichment Analysis (GSEA) of chronic atrophic gastritis compared with the control group (CAG vs. Ctrl), 727 of 995 genes were found to be overexpressed in the CAG group. The gene set from the M phase of the mitotic cell cycle exhibited significant enrichment in the CAG group, with a false discovery rate of <25% (FDR q-value < 0.25). The remaining biological functions, such as the condensed chromosome, mitosis, spindle, Golgi transport vesicles, and digestion, obtained a nominal p-value < 0.05 (p-value NOM), but this value was not adjusted for the size of each gene set. Note that the most overexpressed pathways in CAG are part of cell-cycle arrest (Supplementary Table S2). With the GO enrichment analysis of the 229 genes identified, we were able to categorize the functionally relevant gene clusters. As such, the 229 transcripts were associated with drug metabolism, caffeine metabolism, arginine and proline metabolism, and the peroxisome proliferator-activated receptor (PPAR) signaling pathway (Table 2). All these findings suggest that some of the most important cellular functions and processes are elevated in CAG. To understand the functional relationship of differentially expressed genes in CAG as a function of biological processes, pathways, and networks, IPA core analyses were performed to obtain a high score network (score > 15). These scores, derived from p-values, indicate the probability that their appearance in the network is due to background noise.
The analyses revealed that CAG is a disease in which genes are significantly related to injuries and abnormalities of the body, cardiovascular diseases, psychological disorders, and gastrointestinal diseases (Supplementary Figure S4A). Ten overexpressed genes were identified (OLFM4, CLCA1, SI, TMPRSS15, CDH17, TM4SF20, DMBT1, REG4, ANPEP, and FABP2), of which OLFM4 showed a significant increase in its expression, while 10 other genes were found to be subexpressed (KCNJ16, ATP4B, ATP4A, CPA2, AQP4, IDL, CLIC6, CCKAR, GPR155 and MFSD4) (Figure 1A). On the other hand, in the network with the highest score (score = 24), which covers cell movement, tissue growth, and development and function of the digestive system in CAG vs. Ctrl, CLDN7 shows marked overexpression (Figure 2). These results suggest that the alteration of OLFM4 and CLDN7 may be associated with the development of CAG.

Significant Molecular Differences between Gastric Cancer and the Control Group
When we compared GC vs. Ctrl, we found 541 differentially transcribed genes, of which 310 were overexpressed and 231 were subexpressed in GC. Using GSEA, we found 746 out of 995 cell-cycle related genes, which demonstrated significant enrichment in the GC group with a false discovery rate of <25% (FDR q-value < 0.25). Additionally, important biological functions such as the response to DNA damage stimulus, metalloendopeptidase activity, mitosis, the extracellular matrix, and extracellular matrix proteins were also found. It is noteworthy that in gastric cancer, the upregulated genes form part of cell cycle regulation and cell migration structures (Supplementary Table S3). The GO enrichment analysis showed 541 transcripts that are relevant for the progression of gastric cancer. As illustrated in Table 3, the molecular differences between GC and Ctrl are related to arrhythmogenic right ventricular cardiomyopathy, arginine and proline metabolism, nitrogen metabolism, and tight junctions.

Network analysis with IPA of the biological interaction of genes with a significant difference in their expression showed that more than 400 molecules were identified in GC and grouped into organismal abnormalities, injuries, gastrointestinal diseases, endocrine system disorders, and reproductive system disease (Supplementary Figure S4B). We identified 10 overexpressed genes (CDH17, OLFM4, MUC17, SI, TM4SF20, MUC12, CLDN7, ANPEP, CST1 and CLDN1) and 10 that were subexpressed (IDL, GKN2, GKN1, PGA5, PGC, GIF, ATP4B, VSIG1, TFF2, and ATP4A) (Figure 1B). All these genes belonged to the network with the highest score (score = 38) and were grouped into organ development and function of the renal, urinary and endocrine systems (Supplementary Figure S5). These results indicate that an increase in the expression of the CLDN1 and CLDN7 genes, together with the interaction network of their products, is associated with the progression of GC.
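The overlap between contrasts can be checked with simple set operations. The Python sketch below uses only the ten most overexpressed genes reported in the text for CAG vs. Ctrl and for GC vs. Ctrl, so it does not reproduce the counts obtained from the full lists in Supplementary Table S1; the function name `compare_gene_lists` is hypothetical.

```python
def compare_gene_lists(first, second):
    """Return shared and contrast-specific genes as sorted lists."""
    first, second = set(first), set(second)
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Top overexpressed genes as reported in the text for each contrast.
cag_vs_ctrl_up = ["OLFM4", "CLCA1", "SI", "TMPRSS15", "CDH17",
                  "TM4SF20", "DMBT1", "REG4", "ANPEP", "FABP2"]
gc_vs_ctrl_up = ["CDH17", "OLFM4", "MUC17", "SI", "TM4SF20",
                 "MUC12", "CLDN7", "ANPEP", "CST1", "CLDN1"]

overlap = compare_gene_lists(cag_vs_ctrl_up, gc_vs_ctrl_up)
for label, genes in overlap.items():
    print(label, genes)
```

Within these two top-ten lists, five genes (ANPEP, CDH17, OLFM4, SI and TM4SF20) appear in both contrasts, while CLDN1 and CLDN7 appear only in the GC vs. Ctrl list.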

Biological Difference between Gastric Cancer and Chronic Atrophic Gastritis
The gene expression analyses of GC vs. CAG revealed important differences between these two stages. There were 47 differentially expressed transcripts in gastric cancer, 8 of which were overexpressed while the remaining 39 were subexpressed. The GSEA in GC demonstrated enrichment in five biological components: extracellular matrix proteins, extracellular matrix parts, skeletal development, collagen, and the basement membrane. Of these, only the extracellular matrix presented significant enrichment, with a false discovery rate of <25% (FDR q-value < 0.25). It is important to note that in GC, overexpressed biological functions are part of cell cycle regulation and cell migration structures (Supplementary Tables S3 and S4). The GO enrichment analysis showed 47 transcripts with relative prominence in GC. Table 4 presents the molecular differences between GC and CAG, which include linoleic acid, fructose, and mannose metabolism. This is consistent with the fact that a high level of linoleic acid in the diet of mice increases peritoneal metastasis and the invasion of gastric carcinoma cells [21]. In addition, a metabolic pathway analysis revealed significant perturbations of fructose and mannose metabolism in human colorectal cancer [22]. Network analysis with IPA of the biological interaction between differentially expressed gene products in GC vs. CAG revealed that cancer is a disease with a significant set of genes linked to injuries and abnormalities of the body, diseases of the reproductive system, respiratory disease, and disorders of the endocrine system (Supplementary Figure S4C). In this analysis, 8 genes were overexpressed (CST1, MXRA5, PCDHB9, SNX10, MMP9, C5AR1, SOD2 and TNFSF4) and 10 were subexpressed (GKN2, GKN1, PGC, VSIG1, DPCR1, TFF2, MUC6, TFF1, UPK1B, and CXCL17) (Figure 1C). In the network with the highest score (score = 19), which covers gene expression, cell growth and proliferation, and organismal injury and abnormalities in GC vs. CAG, MMP9 shows marked overexpression (Figure 3). These results indicated that alterations in MMP9 expression could be related to the progression of gastric cancer.

Changes in Gene Expression in Chronic Atrophic Gastritis Tissue May Be Involved in Carcinogenesis
We performed real-time PCR of five genes of biological interest associated with tight junctions, the cell cycle, extracellular matrix breakdown, intestinal inflammation, and tumorigenesis. The genes analyzed were CLDN1, CLDN7, C-MYC, MMP9, and OLFM4. The results showed that the expression of CLDN1, CLDN7, and OLFM4 was low in CAG and GC, while the expression of C-MYC and MMP9 was elevated in both cases. These results differ from those obtained from the microarrays (Figure 4). Table 5 lists the genes analyzed in microarrays and RT-qPCR. For the microarrays, genes were selected using |logFCh| ≥ 0.85 and a B-statistic ≥ 1. The qPCR results were normalized to the PCR product of ACTB using the 2^(−ΔΔCt) comparative method. The data obtained show the relative overexpression of C-MYC and MMP9 in CAG and GC in relation to the housekeeping gene ACTB.

Given this, we examined the protein expression of CLDN1 and MMP9 in histopathological lesions of the stomach, as well as in tissues without lesions. To achieve this, we performed a qualitative analysis of these proteins to determine their distribution within the cell and along the gastric tissue, which has not been previously performed. Although the differences we found were between GC and CAG, we also determined their expression in tissue without histological lesions. Data regarding age, gender, histopathological results, and pathological location can be found in Table 6. In tissues with CAG, 19/20 (95%), and GC, 24/24 (100%), we detected a positive signal for CLDN1, mainly in the plasma membrane of the gastric tissue, regardless of the location (antrum or body). Another finding was that CLDN1 expression increases in the biopsy of the ulcerated gastric antrum in well and poorly differentiated gastric cancer (Figure 5 and Supplementary Table S5). An interesting finding was that CLDN1 expression begins in non-metaplastic chronic atrophic gastritis, and this expression increases progressively in all the histological stages that eventually lead to the development of GC (Figure 3 and Supplementary Table S5).
For MMP9, the signal was found mostly in the extracellular space and in polymorphonuclear cells. In the tissues corresponding to CAG, 16/20 (80%), we found an evident positive signal in epithelial and inflammatory cells (Figure 5 and Supplementary Table S5). In CAG, 16/20 (80%), we found an increase in the expression of MMP9 in the extracellular space, and this expression increased in GC, 24/24 (100%) (Figure 5 and Supplementary Table S5).

Discussion
In this study, we identified 770 genes differentially expressed in CAG vs. GC associated with H. pylori infection by analyzing microarray assays. GSEA, DAVID, and IPA identified several bio-functions of genes (Supplementary Figure S4) and the interaction networks for genes expressed differentially in each study group (Figures 3 and 4, and Supplementary Figure S5). Our data also suggest that multiple signaling pathways (NF-κB, RAS and C-MYC) and genes (CLDN1, CLDN7, MMP9, OLFM4 and C-MYC) could participate in the development of gastric cancer.
Among 541 genes found only in gastric cancer (GC vs. Ctrl) and 229 genes found in chronic atrophic gastritis (CAG vs. Ctrl), the genes CDH17, OLFM4, MUC17, SI, TM4SF20, MUC12, CLDN7, ANPEP, CST1 and CLDN1 were found to be overexpressed in GC. This finding agrees with other studies which suggest that the overexpression of C-MYC [23,24], CLDN1, and CLDN7 [25-28] is associated with the progression of GC. In the case of chronic atrophic gastritis (CAG vs. Ctrl), we found that the genes OLFM4, CLCA1, SI, TMPRSS15, CDH17, TM4SF20, DMBT1, REG4, ANPEP and FABP2 were overexpressed, suggesting that the alteration of OLFM4 and CLDN7 may be associated with the development of CAG. When we performed the third contrast (GC vs. CAG), we found eight overexpressed genes (CST1, MXRA5, PCDHB9, SNX10, MMP9, C5AR1, SOD2 and TNFSF4). This result indicates that the significant increase in MMP9 expression is likely related to the progression of GC. Interestingly, when we confirmed these data by qPCR of the genes induced in CAG (CLDN7 and OLFM4) and GC (CLDN1, MMP9 and C-MYC), we found that the genes mainly expressed in CAG and GC were C-MYC and MMP9 (Figure 4).
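To make the relationship between the three contrasts easier to follow, the overexpressed genes named above can be compared with simple set operations. The Python sketch below uses only the gene symbols listed in this paragraph; the variable names are illustrative assumptions, and the lists cover only the genes named here, not the full sets of 541 and 229 genes.

```python
# Overexpressed genes named in the text for each contrast.
gc_vs_ctrl = {"CDH17", "OLFM4", "MUC17", "SI", "TM4SF20",
              "MUC12", "CLDN7", "ANPEP", "CST1", "CLDN1"}
cag_vs_ctrl = {"OLFM4", "CLCA1", "SI", "TMPRSS15", "CDH17",
               "TM4SF20", "DMBT1", "REG4", "ANPEP", "FABP2"}
gc_vs_cag = {"CST1", "MXRA5", "PCDHB9", "SNX10",
             "MMP9", "C5AR1", "SOD2", "TNFSF4"}

# Genes listed for both lesions, and genes listed for one lesion only.
shared = sorted(gc_vs_ctrl & cag_vs_ctrl)
gc_only = sorted(gc_vs_ctrl - cag_vs_ctrl)
cag_only = sorted(cag_vs_ctrl - gc_vs_ctrl)

# Genes listed for GC vs. Ctrl that also appear in the GC vs. CAG contrast.
gc_and_progression = sorted(gc_vs_ctrl & gc_vs_cag)

# The total quoted in the Discussion is the sum of the two gene counts.
total_genes = 541 + 229

print("Shared by GC and CAG lists:", shared)
print("Listed for GC only:", gc_only)
print("Listed for CAG only:", cag_only)
print("In GC list and GC vs. CAG list:", gc_and_progression)
print("Total differentially expressed genes:", total_genes)
```

Within these named lists, CLDN1 and CLDN7 appear only in the GC list, whereas OLFM4 appears in both.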
It is well known that the claudin family plays a crucial role in the structure and function of tight junctions in normal epithelial cells. The expression profile found in GC vs. Ctrl exhibits an increase in the claudin family (Figure 1B). Although there are several reports on CLDN1 expression in gastric cancer, there is still no agreement on the relationship between this expression and clinicopathological parameters [25]. CLDN1 induces epithelial-mesenchymal transition (EMT) through activation of the c-Abl-ERK signaling pathway. In contrast, its overexpression in intestinal-type gastric cancer is associated with an increase in lymph node metastases and tumor-node-metastasis (TNM) stage [26]. Other reports propose that the overexpression of CLDN1 is related to the carcinogenesis of invasive and metastatic gastric cancer [25,27]. In our study, CLDN1 was one of the upregulated genes that we found in H. pylori-exposed gastric epithelial cells, as previously observed [29], indicating a probable relationship between chronic exposure to H. pylori and CLDN1 upregulation in gastric mucosa. We also found that CLDN7 was upregulated in the tumor samples and that this gene could also be involved in gastric carcinogenesis [28,30-32], suggesting that this bacterium could regulate, in some way, the expression of CLDN1 and CLDN7.
On the other hand, CLDN1 is considered a marker of epithelial-to-mesenchymal transition (EMT) [33], which intervenes in cellular processes such as migration, invasion, and matrix metalloproteinase (MMP) activation [34]. Previous reports indicated that CLDN1 induces and activates MMP2, which enhances cell invasion and metastasis [35,36].
The matrix metalloproteinase family is important because it is involved in degrading components of the extracellular matrix (ECM). It also participates in many physiological and pathological events, including stomach diseases such as gastritis and gastric cancer [37-39] (Figure 3). The MMP family also regulates the immune response, angiogenesis, invasion, cell growth, survival, and the EMT [39]. Various studies have reported high levels of MMP9 in humans with gastrointestinal cancer through immunoassays and the observation of enzymes by electrophoretic techniques. In a TGF-β-signaling-deficient colon cancer model, tumor cells are capable of secreting CCL9, which induces the recruitment of CCR1+ myeloid cells that produce MMP2 and MMP9 and facilitate the invasive growth of tumors [40]. IL-1 induces the expression of genes (VEGF, IL-6, IL-8, TGF-β and MMP) involved in metastasis and inflammation [41]. It has also been reported that IL-8 is able to increase the expression of MMPs, promoting the development of metastases [42]. Another study showed that some human primary tumors are capable of recruiting neutrophils that secrete MMP9 and favor angiogenesis and intravasation of cancer cells [43]. Notably, post-surgical infections can activate neutrophils to produce neutrophil extracellular traps (NETs) containing high levels of active MMP9 [44]. Tumor-associated macrophages are the main producers of proteases such as cathepsin and MMP, which can degrade the ECM, generating a tumor microenvironment and promoting the development of metastases [45]. For example, under hypoxic conditions, prostate cancer cells are capable of releasing exosomes loaded with proinflammatory cytokines such as TNF and IL-6 and the proteinases MMP2 and MMP9, which enhance the invasiveness and metastasis of cancer cells [46]. In agreement with these reports, our study shows that the expression of MMP9 messenger RNA (mRNA) is significantly increased in GC vs. Ctrl; the same occurs with C-MYC but to a lesser extent (Figure 4).
Through an in-depth analysis of MMP9 expression in follicular gastritis, chronic atrophic gastritis, and gastric cancer, we found MMP9 expressed in polymorphonuclear cells that participate in the inflammation of lesions in gastritis. This signature could potentially be associated with H. pylori infection. Interestingly, both MMP9 and CLDN1 began to be expressed in chronic atrophic gastritis and increased in gastric cancer (Figure 5, Table 6, and Supplementary Table S5). In this way, CAG could be associated with a greater progression to malignant lesions.
Our results suggest that H. pylori can use several mechanisms to interact with the gastric mucosa, which could eventually lead to the development of atrophy or gastric cancer. For example, the CagA virulence factor and the peptidoglycan of H. pylori induce signaling cascades of NF-κB, RAS, MEK, ERK, and C-MYC, resulting in the increased transcription of the CLDN1, CLDN7, OLFM4, C-MYC, and MMP9 genes, which promote cell proliferation, differentiation, survival, and eventually gastric carcinogenesis. In Figure 6A, we present interactions between genes and the signaling cascade that is activated to favor the overexpression of CLDN7 and OLFM4. Through the CagA virulence factor, H. pylori can interact with NF-κB and generate greater CLDN7 expression; chronic inflammation may also contribute to the overexpression of the claudin family. The peptidoglycan of H. pylori is recognized by NOD1, activates NF-κB, and increases the expression of OLFM4 in chronic atrophic gastritis. In the gastric cancer scaffold (Figure 6B), we observed that CagA could interact with NF-κB, RAS, and C-MYC to favor the expression of genes such as CLDN1, CLDN7, C-MYC, and MMP9. The augmented expression of CLDN1 and C-MYC could increase cell proliferation, whereas the increased expression of MMP9 causes deterioration of the ECM that promotes cell migration and invasion in intestinal gastric cancer.

Conclusions
In this study, an initial exploratory set of biopsies from patients with H. pylori infection and a diagnosis of CAG or GC was analyzed by gene expression microarray; corroboration of these data by qPCR, plus a validation assay in FFPE samples by immunohistochemistry, revealed several groups of genes in patients with CAG and GC. These genes could be related to several pathways, such as cell movement, the development of tissue, and the development and function of the digestive system (Supplementary Figure S4A). Gastric cancer could also be related to the development and function of the renal/urinary and endocrine systems (Supplementary Figure S4B). In contrast, CAG was specifically related to cell growth and proliferation and abnormalities of the body injury pathway. Therefore, H. pylori infection can alter cellular gene expression processes, increase the inflammatory immune response that activates the NF-κB and C-MYC signaling pathways, and ultimately trigger gastric carcinogenesis. In conclusion, exhaustive gene expression analysis in CAG and GC identified the CLDN1 and MMP9 proteins as promising biomarkers of the early stages of gastric cancer development.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/pr10020196/s1, Figure S1: Volcano plot showing the differential gene expression and the statistical significance, Figure S2: The five types of diagrams obtained from the differential expression of genes between chronic atrophic gastritis and gastric cancer, Figure S3: Heat map showing the hierarchical clustering of microarray data, Figure S4: Graph of the main diseases and biological functions where the differentially expressed genes in each group are grouped, Figure S5: Network of functional organ development, renal and urological system development and function, and endocrine system development and function related to gastric cancer vs. control group. Color-coded: red, genes overexpressed; green, genes subexpressed; the intensity of the color indicates the degree of over- or subexpression. Encoding form: rectangle, nuclear receptor ligand-dependent; oval, transcription regulator; rhombus, enzyme; circle, other. Table S1: List of genes with differential expression of the comparison between chronic atrophic gastritis and the control, Table S2: Gene set enrichment analysis in chronic atrophic gastritis compared with control group, Table S3: Gene set enrichment analysis in gastric cancer compared to the control group, Table S4: Gene set enrichment analysis in gastric cancer vs. chronic atrophic gastritis, Table S5: Immunohistochemistry analysis in paraffin embedded tissues.