Performance of Salivary Extracellular RNA Biomarker Panels for Gastric Cancer Differs between Distinct Populations

Simple Summary
Gastric cancer (GC) is the fourth most common cancer worldwide and particularly affects Asian populations. Currently, there are no screening programs for GC in the United States. Because saliva is a highly desirable body fluid for developing biomarkers for cancer screening, early detection, and monitoring, we previously reported that salivary extracellular RNAs could be developed to detect gastric cancer in a Korean cohort; here, we validate them in a U.S. cohort. Our study emphasizes the importance of population-specific biomarker development and validation and, specifically, the value of noninvasive salivary biomarkers for population-based screening in at-risk populations.

Abstract
Gastric cancer (GC) has the fifth highest incidence among cancers and is the fourth leading cause of cancer-related death. GC is considerably more common in certain ethnic groups, such as the Korean population. GC found at an early stage is more treatable and has a higher survival rate than GC found at a late stage. However, a diagnosis of GC is often delayed because of the lack of early symptoms and of available screening programs in the United States. Extracellular RNA (exRNA) is an emerging paradigm; exRNAs have the potential to serve as biomarkers in panels aimed at early detection of cancer. We previously reported the successful use of a panel of salivary exRNA for detecting GC in a high-prevalence Korean cohort, and that genetic changes reflected cancer-associated salivary exRNA changes. The current study is a case-control study of salivary exRNA biomarkers for detecting GC in an ethnically distinct U.S. cohort. A model constructed for the U.S. cohort combined demographic characteristics and salivary miRNA and mRNA biomarkers for GC and yielded an area under the receiver operating characteristic (ROC) curve (AUC) of 0.78. However, the constituents of this model differed from those of the model constructed for the Korean cohort, emphasizing the importance of population-specific biomarker development and validation.


Introduction
Gastric cancer (GC) is an aggressive type of cancer that remains a healthcare burden worldwide [1]. For 2022, the American Cancer Society has estimated that there will be about 26,380 new cases of gastric cancer in the USA and about 11,090 deaths (https://seer.cancer.gov/statfacts/html/stomach.html (accessed on 10 May 2022)). Gastric cancer accounts for ~1.5% of all new cancer cases in the USA each year. The incidence of GC in the United States has declined; however, in Western countries, approximately half of patients present with locally advanced or metastatic GC at diagnosis, and an additional 40% to 60% of those patients who undergo resection of gastric adenocarcinoma relapse after surgery [2]. Thus, early detection of this type of cancer is the main goal for reducing mortality, as the 5-year survival rate of cases detected early can exceed 95% [3].
Many studies from East Asian countries have shown that screening methods, especially endoscopic screening for detecting early-stage GC, have resulted in a reduction in mortality. However, population-based screening programs do not exist in the USA, because of the low incidence of GC overall [4].
Although upper gastroesophageal endoscopy with targeted and random biopsies remains the gold standard for the detection of GC, other screening tools have been implemented in high-risk countries, such as tests for pepsinogens (PGs), including PG 1 and PG 2, gastrin-17, and Helicobacter pylori (H. pylori) IgG antibodies. Blood-based tumor markers such as carcinoembryonic antigen (CEA) and carbohydrate antigen 19-9 (CA19-9) have also been used for the detection of GC, but have low sensitivity and specificity for early-stage disease [5,6].
Screening programs for GC vary among countries, depending on prevalence and cost-effectiveness [6]. We previously reported the use of extracellular RNA (exRNA) biomarkers in saliva as a diagnostic tool for screening and/or risk assessment for GC [7]. In a study of a Korean cohort, we identified 30 mRNA and 12 miRNA biomarkers that were associated with the expression pattern and presence of GC. A configured biomarker panel consisted of three mRNAs (SPINK7, PPL, and SEMA4B) and two miRNAs (miR-140-5p and miR-301a-3p) that were all significantly downregulated in the GC group, and yielded an area under the receiver operating characteristic (ROC) curve (AUC) of 0.81 (95% CI 0.72-0.89). When combined with demographic factors, the AUC of the biomarker panel reached 0.87 (95% CI 0.80-0.93) in differentiating subjects with GC from those without cancer. Since a lower expression of these salivary markers was indicative of GC, a comprehensive cut-off validation study would be necessary to develop these markers for the screening of GC in the general population.
It is known that the pathogenesis of GC depends on multiple etiological factors, and ethnicity could be one of the determining factors [8]. However, it is also possible that there are common biological alterations that contribute to the pathogenesis of disease, and these genetic changes may be reflected in cancer-associated salivary exRNA alterations. The miRNAs and exRNAs are endogenous, small non-coding RNA molecules that posttranscriptionally modulate gene expression [9]. Because these molecules are stable in different body fluids including saliva, their analysis can provide an important, noninvasive diagnostic and prognostic tool for GC screening. However, the expression of biomarkers may differ based on the population investigated. A key requirement of biomarker validation for clinical and regulatory purposes is that of intended use. A biomarker or panel of biomarkers that differentiates disease from normal in one population may not perform similarly in a different population. While we have previously demonstrated that a particular panel of salivary biomarkers may differentiate subjects with GC from those without cancer in an Asian population with a high prevalence of gastric cancer, it is unclear whether this same panel of markers would perform similarly in a low-prevalence U.S. population. The current study is a case-control study of salivary exRNA biomarkers in a U.S. cohort.

Saliva Collection and Processing
Unstimulated whole saliva was collected from 50 newly diagnosed treatment-naive patients with histologically proven GC (stages I-IV) and 50 control subjects without GC, based on recent endoscopic results at the University of Texas MD Anderson Cancer Center, USA, using standard operating procedures (SOPs) developed for our prior study of a Korean cohort [7,10]. Subjects were asked to avoid oral hygiene measures, eating, drinking, and gum chewing for at least 1 h prior to saliva collection. The subjects rinsed with tap water (10 mL) for 30 s about 10 min prior to saliva collection and expectorated. Clinical samples were collected in sterile tubes, with each collection lasting 5-10 min (at least 5 mL of saliva), and kept on ice through the entire process. All samples were processed approximately 1 h after collection. First, samples were centrifuged in a refrigerated centrifuge at 2400× g for 15 min at 4 °C, and the supernatant was processed immediately for the concurrent stabilization of proteins and RNA by the inclusion of a protease inhibitor cocktail (aprotinin, 3-phenylmethylsulfonyl fluoride (3-PMSF) (Sigma-Aldrich, St. Louis, MO, USA), sodium orthovanadate (Na3VO4) (Sigma-Aldrich, St. Louis, MO, USA)) and RNase inhibitor (Invitrogen SUPERase·In RNase Inhibitor, Thermo Fisher Scientific, Austin, TX, USA) based on our saliva standard operating procedure (SOP) [11]. These samples were aliquoted into smaller cryo-vials, labeled, and frozen at −80 °C.
This study, including the patient consent process, was approved by the Institutional Review Board for Human Studies at the University of Texas MD Anderson Cancer Center (IRB number PA17-0583). The control group consisted of subjects undergoing upper endoscopy for dyspepsia or gastroesophageal reflux-like symptoms and documented to have no neoplasia. Patient-level clinical demographics were obtained (age, gender, ethnicity, smoking history, staging, and diagnosis). The study was performed from 19 October 2017 to 13 June 2019.

RNA Isolation from Saliva Samples
Total RNA from 50 blinded GC subjects and 50 non-GC control subjects was isolated using a Qiagen miRNeasy Micro kit (Qiagen, Germantown, MD, USA). A 250 µL sample of cell-free saliva was used to isolate total RNA with a modified protocol previously used successfully in the lab for isolating salivary RNA [10]. The final RNA was eluted in 14 µL of water.

Validation of miRNA GC Markers
The biomarker panel used in this study contained two miRNAs (miR-140-5p and miR-301a-3p); U6 snRNA and miR-197 were used as the reference genes. TaqMan miRNA assays (Thermo Fisher Scientific, Austin, TX, USA) for these four small RNA genes were ordered from Applied Biosystems (Foster City, CA, USA). The protocol was similar to that recommended by the manufacturer for creating custom reverse transcription (RT) and preamplification primer pools using TaqMan MicroRNA Assays (Thermo Fisher Scientific, Austin, TX, USA). Total RNA (3 ng) was converted to cDNA using a TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, USA). After RT, the product was preamplified using SsoAdvanced PreAmp Supermix (Bio-Rad, Hercules, CA, USA) and the preamplification primer pool. The preamplified product was diluted 2 times prior to miRNA quantification. The qPCR reactions for each candidate miRNA were performed in triplicate on a Roche LightCycler 480 II (Roche, San Francisco, CA, USA). The average threshold cycle (Cq) was examined, and U6 snRNA and miR-197 were used as the reference genes for normalizing the data.

Validation of mRNA GC Markers
Three candidate mRNA biomarkers generated by microarray profiling (PPL, SEMA4B, and SPINK7), together with two reference genes (GAPDH and ACTB), were validated by nested real-time quantitative polymerase chain reaction (RT-qPCR) (RT-PCR followed by a separate SYBR green quantitative polymerase chain reaction (qPCR)) on the new set of samples from MD Anderson Cancer Center (blinded 50 GC and 50 non-GC). The gene accession numbers and primer sequences used for the transcriptomic biomarker validation are shown in Supplementary Materials Table S1. The qPCR primers were designed using the Primer3 software and synthesized by Sigma-Genosys after performing a Primer-BLAST search. The primer sequences were designed to avoid any known single-nucleotide polymorphism region in the target gene. All the amplicons were intron spanning. The RT-qPCR assay followed the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines and was performed in duplicate with each biomarker candidate. The specificity of the PCR product for each gene was confirmed with melting curve analysis and 3% agarose gel analysis.

RT-qPCR Preamplification for Validation of mRNA Candidates
The multiplex RT-PCR preamplification was performed with an Invitrogen SuperScript III Platinum One-Step qRT-PCR System (Thermo Fisher Scientific, Austin, TX, USA) with a pool of outer primers at 100 nM each. The reaction mixture was prepared on ice, and then loaded into the preheated thermocycler. The amplification was performed as follows: 2 min at 60 °C; 30 min at 50 °C; 2 min at 95 °C; and 15 cycles of 15 s at 95 °C, 30 s at 50 °C, 10 s at 60 °C, and 10 s at 72 °C; with a final extension for 10 min at 72 °C and cooling to 4 °C. Immediately after RT-qPCR, 10 µL of the reaction was treated with 4 µL of Exo-SAP-IT (Thermo Fisher Scientific, Austin, TX, USA) for 15 min at 37 °C to remove excess primers and deoxynucleotide triphosphates (dNTPs), and then heated to 80 °C for 15 min to inactivate the enzyme mix. The preamplified complementary DNAs (cDNAs) were then diluted by adding water to 200 µL (20-fold) to enable the qPCR of all targets.

qPCR for Validation of mRNA Candidates
Singleplex qPCR was performed in 10 µL reactions with 2 µL of each preamplified cDNA sample and the inner primers at 200 nM each. The reaction was conducted with a SYBR Green I Master mix on a LightCycler 480 instrument (Roche Diagnostics, Indianapolis, IN, USA). After 10 min of polymerase activation at 95 °C, 40 cycles of 15 s at 95 °C and 60 s at 60 °C were performed, followed by melting curve analysis. Three controls, including one RT control, a no-template control, and a positive control with universal human RNA, were performed with every candidate on each sample.

Statistical Analysis for qPCR
The qPCR analyses were all done in triplicate. For the miRNA analysis, data were analyzed using the RQ Manager software version 1.2 and DataAssist software version 3.0 (Applied Biosystems), and the ∆Cq value was computed using the RNA polymerase III-transcribed U6 small nuclear RNA as the reference gene [7]. For the mRNA analysis, the ∆Cq of each biomarker candidate was calculated by subtracting the Cq value of the housekeeping genes (GAPDH and ACTB) from the raw Cq value in the same sample. ∆Cq values for mRNA and miRNA were compared between groups using the Wilcoxon rank-sum test.
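As a minimal sketch of the ∆Cq normalization described above, the Python snippet below subtracts the reference-gene Cq from the target Cq for a single sample. The function name `delta_cq` and the Cq values are illustrative and are not study data. Averaging the two housekeeping genes before subtraction is an assumption, since the text does not state how GAPDH and ACTB were combined.

```python
from statistics import mean


def delta_cq(target_cq, reference_cqs):
    """Return dCq = Cq(target) - mean Cq(reference genes) for one sample.

    Averaging several reference genes is an assumption made for this sketch.
    """
    return target_cq - mean(reference_cqs)


# Hypothetical Cq values for one saliva sample (not study data).
sample = {"PPL": 27.4, "GAPDH": 22.0, "ACTB": 21.0, "miR-140": 24.1, "U6": 15.4}

# mRNA marker normalized to the two housekeeping genes.
dcq_ppl = delta_cq(sample["PPL"], [sample["GAPDH"], sample["ACTB"]])

# miRNA marker normalized to a single reference (U6).
dcq_mir140 = delta_cq(sample["miR-140"], [sample["U6"]])

print(f"dCq PPL = {dcq_ppl:.2f}, dCq miR-140 = {dcq_mir140:.2f}")
```

Because Cq is inversely related to transcript abundance, a larger ∆Cq corresponds to lower expression of the marker relative to the reference.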

Clinicopathological Characteristics of Patients
The patients' characteristics and study variables are summarized between groups (GC vs. control) using mean (SD) and frequency (%) and compared between groups using the two-sample t-test or chi-square test (Table 1).
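To illustrate the chi-square comparison of a categorical characteristic between groups, the following sketch computes the Pearson statistic for a 2 × 2 table and its p-value with one degree of freedom. The counts are hypothetical, the function name `chi_square_2x2` is ours, and omitting a continuity correction is an assumption, as the text does not specify which variant was applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    table = [[a, b], [c, d]]; rows are groups, columns are categories.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, k in enumerate(col_totals):
            expected = r * k / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts (not study data): rows = GC, control; columns = male, female.
counts = [[30, 20], [15, 35]]
stat, p_value = chi_square_2x2(counts)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```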

miRNA RT-qPCR
Next, we constructed a model with the demographic terms (age, gender, and smoking history) plus the two candidate miRNA markers for GC, computing the dCT by subtracting the CT of the reference gene (U6) from that of each candidate marker (miR-140 and miR-301a). This was the same reference gene (U6) used in our prior study [7]. In this study, the CT values for U6 were 15.37 ± 2.64 in the non-GC control group and 15.53 ± 2.70 in the GC patient group (p = 0.809 by t-test). The absence of a significant difference between GC patients and non-GC controls suggests that U6 is a suitable reference gene for salivary exRNA quantification. To reduce the potential bias from relying on one reference gene, we also tested miR-197 as an additional reference small RNA. The CT values for miR-197 were 16.20 ± 1.82 in the non-GC control group and 16.19 ± 1.85 in the GC patient group (p = 0.796). Next, we compared the AUC of the model with only demographic factors with that of the model using demographic and miRNA data using DeLong's test (Table 2). Analyses were conducted using IBM SPSS V25 (Armonk, NY, USA) and R V 3.6.1 (www.r-project.org (accessed on 20 March 2021), Vienna, Austria), and p-values < 0.05 were considered to be statistically significant. The AUC (95% CI) was 0.75 (0.65-0.84) for the GC group versus the non-GC group based on these two miRNA markers together with demographic factors. Both markers (dCT) were significant in that model (miR-140 (p = 0.003), miR-301a (p = 0.002)). The demographic model alone yielded an AUC of only 0.68, while the combined model (demographic data with miRNA biomarkers) yielded an AUC of 0.75, although this improvement did not reach statistical significance (DeLong p-value = 0.129). Next, logistic regression models for GC status were constructed using demographic factors from our previous publication (age, gender, and smoking history) with the AUC (95% CI) and odds ratios (ORs) estimated (Table 3). Interestingly, the demographic factors in this U.S. cohort showed an AUC of 0.68 (95% CI 0.57-0.78), which was similar to the AUC of 0.69 (95% CI 0.59-0.79) in the Korean cohort that we previously reported [7].
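For readers less familiar with the AUC values reported here, the sketch below shows one standard way to compute an empirical AUC: the fraction of case-control pairs in which the case receives the higher model score, with ties counted as one half. The scores are hypothetical and the function name `empirical_auc` is ours. The confidence intervals and DeLong comparison reported in the text are not reproduced by this snippet.

```python
def empirical_auc(case_scores, control_scores):
    """Empirical AUC: P(case score > control score), ties counted as 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities from a fitted model (not study data).
gc_scores = [0.9, 0.8, 0.4]
non_gc_scores = [0.7, 0.3, 0.2]

auc = empirical_auc(gc_scores, non_gc_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to scores that do not separate the groups at all, and 1.0 to perfect separation, which is why an AUC of 0.68 for demographics alone leaves room for biomarkers to add information.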

mRNA RT-qPCR
We constructed a new model for the U.S. cohort using the same variables in three different ways, as reported in our previous report based on a Korean cohort [7] (Table 4).
Table 4. Demographic characteristics with miRNA and mRNA biomarkers for gastric cancer.
However, when we applied the coefficients as estimated from the prior Korean cohort study [7], the AUC was only 0.52 because of differences in the significance of individual demographic features as well as in the performance of GC miRNA and mRNA biomarkers in the current U.S. cohort as compared with the Korean cohort.
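The distinction between refitting a model and applying previously estimated coefficients can be sketched as follows. In the second case, the intercept and coefficients are frozen and only used to score new subjects; discrimination is then assessed on those scores. All coefficient values, variable names, and the subject below are hypothetical placeholders and are not the estimates from [7].

```python
import math


def predicted_probability(intercept, coefficients, features):
    """Logistic model score: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical frozen coefficients standing in for a previously fitted model.
prior_intercept = -1.0
prior_coefficients = {"age": 0.02, "male": 0.5, "dCT_miR140": 0.3}

# Hypothetical subject from a new cohort (not study data).
subject = {"age": 50, "male": 1, "dCT_miR140": -1.0}

p = predicted_probability(prior_intercept, prior_coefficients, subject)
print(f"Predicted probability of GC = {p:.3f}")
```

When the relationship between predictors and disease differs between cohorts, scores from frozen coefficients can rank cases and controls poorly even if a model refitted on the new cohort discriminates well.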

Demographic Features + 2 miRNA Biomarkers for GC + 3 mRNA Biomarkers for GC
We also assessed the panel performance (demographic characteristics (demo) + miRNAs + mRNAs) in two separate scenarios: controls vs. early-stage GC (I, II) and controls vs. late-stage GC (III, IV). The discrimination (AUC) of the control vs. early-stage GC model was 0.85 (95% CI 0.72-0.99), whereas that of the control vs. late-stage GC model was 0.75 (95% CI 0.64-0.85). Therefore, our panel may perform better in discriminating controls from early-stage GC, although this would need to be confirmed in a follow-up study (Table 3).

Discussion
A growing number of studies have demonstrated the utility of exRNA as a reliable noninvasive approach for diagnosis, therapy, and prognosis of cancers [12]. Extracellular RNAs have been explored as biomarkers in a number of different biofluids and types of cancer, which include esophageal squamous cell carcinoma (ESCC) [13], lung cancer [14], brain cancers [15-18], prostate cancer [19], pancreatic cancer [20], colon cancer [21], and gastric cancer [22]. As of 2020, 45 clinical trials focused on the use of exRNA and exosomes as clinical biomarkers of cancer have been reported in the USA and numerous other countries [12]. These clinical trials have explored exRNAs as clinical biomarkers of various cancer types including lung and prostate cancers. Blood is the primary source of exRNAs that have been tested, but studies have also investigated urine. Saliva, in particular, is being explored as an emerging biofluid that is easy to collect and has been shown to reflect the spectrum of health and disease states found using serum [23,24].
Standards for validation of biomarkers require that they be applied in the population for which they are intended to be used [25]. A biomarker or panel of biomarkers which can differentiate disease from normal in one population may not perform similarly in a different population. We previously reported on a panel of salivary biomarkers which, when combined with specific demographic factors, differentiated subjects with GC from those without cancer in an Asian population with a high prevalence of gastric cancer. Our aim was to evaluate the performance of salivary exRNA biomarkers for GC, which we previously discovered and validated in Korean GC patients [7], in a U.S. population. Previously, 12 mRNA and 6 miRNA candidates were verified with a discovery Korean cohort by RT-qPCR and further validated with an independent Korean cohort (n = 200). The configured biomarker panel consisted of three mRNAs (SPINK7, PPL, and SEMA4B) and two miRNAs (miR-140-5p and miR-301a), which were all significantly downregulated in the GC group, and yielded an AUC of 0.81 (95% CI 0.72-0.89). When combined with demographic factors, the AUC of the biomarker panel reached 0.87 (95% CI 0.80-0.93) [7]. In our prior study, demographic characteristics (including age, gender, and smoking) were all highly significant predictors of case status [7], while, in the current U.S. MD Anderson Cancer Center cohort, only gender was significant (Table 4). However, this U.S. cohort (MD Anderson Cancer Center cohort) had more ethnic diversity including Caucasian (51% of the GC group and 65.3% of the control group), Black non-Hispanic (19.6% of the GC group and 12.2% of the control group), and Hispanic (17.6% of the GC group and 16.3% of the control group) subjects, with the Asian population as the least prevalent group (11.8% of the GC group and 6.1% of the control group).
In our previous study [7], Asians constituted 100% of the GC and control groups for both mRNA and miRNA discovery and validation phases (Figure 2). In addition, the subjects in the current study were older (GC group, 61 years old and control group, 60 years old) and included fewer smokers (present or prior smoking, 43.1% of the GC group and 42.9% of the control group) and fewer males (62.7% of the GC group and 32.7% of the control group) as compared with the Korean cohort (Figure 2) [7].
It is unclear how these factors may account for differences between the two distinct populations with respect to biomarker profiles. Interestingly, the prevalence of H. pylori positivity in the U.S. population with gastric cancer was relatively low as compared with Asian populations with gastric cancer, where H. pylori was a major risk factor. There was no difference, however, in the prevalence of H. pylori between cancer patients and controls in the U.S. cohort, but the number of subjects studied was small. These demographic differences may be important factors to consider for validation of salivary exRNA GC biomarkers in two entirely independent patient cohorts (Korea vs. USA). Our study indicated that, overall, the discriminative performance of the demographic factors in this U.S. cohort (AUC of 0.68) was similar to that in the Korean cohort (AUC of 0.69) [7]. In our previous study with Korean subjects, we found a difference in nearly all the selected mRNAs (ANXA1, CD24, CSTB, ERO1A, KRT4, KRT6A, PPL, RANBP9, S100A10, SEMA4B, and SPINK7) and miRNAs for GC (miR-140-5p, miR-374a, miR-454, miR-15b, miR-28-5p, and miR-301a). They were all statistically significant (FDR-adjusted p-value < 0.05). However, in the U.S. cohort, none of the miRNAs (miR-140 and miR-301) and mRNAs (SPINK7, PPL, and SEMA4B) performed similarly (Supplementary Materials Figure S1). Therefore, any model constructed from the prior Korean cohort could not be generalized to the current U.S. cohort [7]. The demographic characteristics of the Korean cohort used in our previous study [7] (Table 1 for the discovery phase and Table 2 for the validation phase) were compared with the demographic characteristics of the U.S. cohort used in this study (Table 3).
Interestingly, there was a statistically significant difference between a new model with only demographic characteristics (AUC = 0.68) and a new model with demographic characteristics and miRNA/mRNA biomarkers for GC (AUC = 0.78) (Figure 1) (DeLong's test p-value = 0.037), indicating that the model with both miRNAs and mRNAs together with demographic characteristics (Model 3) discriminated significantly better than the model with demographic characteristics alone (Model 1). Our panel also has the potential to perform better in discriminating non-GC controls from early-stage GC (I, II) (AUC = 0.85 (95% CI 0.72-0.99)) than from late-stage GC (III, IV) (AUC = 0.75 (95% CI 0.64-0.85)), although this still would need to be confirmed in a follow-up study. Thus, we were able to validate a panel of salivary exRNA biomarkers with credible clinical performance for the detection of GC in a U.S. population. Our study confirms, again, the potential utility of salivary exRNA biomarkers in screening and risk assessment for GC.

Limitations, Future Studies, and Advantage of the Markers Used in This Study
The abovementioned studies suggest salivary RNAs as potential biomarkers for the diagnosis of GC, but emphasize the need for validation in intended use populations. However, no study of adequate sample size for independent validation has been performed to date. There remains an unmet need to develop a noninvasive biomarker assay for identifying patients with GC from a high-risk population. Thus, the major advantage of the markers used in the current study is their noninvasive nature, which is important for population-based screening in at-risk populations.

Conclusions
We aimed to develop universal biomarkers for GC that could be applicable to all individuals regardless of their ethnic origin. Although we were unable to "validate" the prior model developed based on a Korean cohort [7], we were able to demonstrate that our markers had diagnostic utility above and beyond demographic factors alone. Additional studies are needed to evaluate the diagnostic utility of our models in different ethnic populations, such as a Korean cohort in a U.S. population. More importantly, our study emphasizes the importance of population-specific development and validation of salivary exRNA biomarkers for GC detection.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers14153632/s1, Figure S1: Comparison of miRNA and mRNA biomarker performance for GC between the Korean and the U.S. MD Anderson Cancer Center study groups, Table S1: The list of mRNA biomarker candidates and primer sequences used for validation.