1. Introduction
Molecular biology and epigenetics are currently in the spotlight in the investigation of esophageal and esophagogastric junction oncogenesis [1]. Esophageal cancer (EC) remains the seventh most common cancer worldwide, with an estimated 604,100 new cases in 2020, and ranks sixth in overall mortality, accounting for 544,076 deaths as per GLOBOCAN 2020 [2]. Despite notable advances in earlier diagnosis and multimodal treatment, the prognosis for EC remains poor, with a 5-year survival rate of 19% [3]. Tumor recurrence, metastasis, and resistance to chemoradiotherapy are major contributing factors to poor survival outcomes [4,5].
The development of EC is a multifactorial process that results not only from environmental and genetic factors but also from specific tumor behavior characteristics, which may vary among EC patients. In an effort to better predict disease trajectory over time, scientific bodies such as the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC) have incorporated certain prognostic factors to aid staging, and thereby risk assessment for disease recurrence and metastasis and, ultimately, survival estimates [6].
Histopathologic cell type is the cornerstone of the staging and risk-assessment process in EC. The two most common histologic subtypes, esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC), vary substantially in terms of aetiopathogenesis, genetic susceptibility, clinical features, and prognosis, as well as gender and geographic distribution. Approximately half of EC cases are EAC in Europe, Oceania, and some Western countries, including the United States [7], whereas ESCC remains the dominant type in other areas of the world, particularly in Asia and Africa. A male predominance is observed worldwide in EC and Gastric Cancer (GC), with male-to-female ratios of 6.7:1 for EAC, 3.3:1 for ESCC, and 4:1 for GC [8]. While ESCC is essentially in decline, EAC incidence rates have escalated rapidly over the past decades [9]. In addition to histologic type [10], other prognostic factors of major significance in predicting disease evolution include histologic grade of differentiation (G status) for EAC; tumor location and length for ESCC; infiltration potential in terms of Perineural Invasion (PNI), Lymphovascular Invasion (LVI), and Perivascular Invasion (PVI) status; presence or absence of Signet Ring Cell (SRC); and Intestinal or Diffuse histological subtype in gastric adenocarcinoma, as per the Lauren Classification [11].
Along with these prognostic factors, which are directly related to the final histopathologic characteristics of the primary tumor, preoperative serum tumor markers in the form of Carcinoembryonic Antigen (CEA) and Carbohydrate antigen 19.9 (Ca19.9) have also proven useful in diagnosis, guiding management, and predicting response to treatment and survival in EC [12].
Long non-coding RNAs (lncRNAs) are transcription products longer than 200 nucleotides that do not participate in protein expression but regulate gene expression at epigenetic, transcriptional, and post-transcriptional levels, thus influencing processes such as cell growth, apoptosis, and protein activity regulation [13]. Aberrant expression of lncRNAs is associated with cellular malignant potential [14]. When studied in colorectal adenocarcinoma, lncRNAs were found to be associated with tumor size, histological subtypes, differentiation grade, Dukes staging, lymph node (LN) involvement, distant metastasis, disease-free survival (DFS), and overall survival (OS) [15]. POLR2E was detected among gene expression profile pathways in bladder cancer patients in Egypt [16]. Associations between EC and the most common genetic variants of lncRNAs, namely Single-Nucleotide Polymorphisms (SNPs), have been detected in Asian studies [17,18]. Abnormal expression of HOTAIR in digestive cancers has recently been correlated with histopathological variables such as G status, indicating that HOTAIR may act as a prognostic biomarker to predict survival in various types of cancers such as GC [19] and ESCC [20]. Zhang et al. [21] investigated the clinical role of HOTAIR expression as a prognostic indicator in digestive cancers by conducting a meta-analysis in 2018. When the authors compared poorly versus well differentiated tumors in a total of five studies encompassing 386 patients, G status was found to be positively associated with high HOTAIR expression (OR: 1.65, p = 0.040), suggesting it could serve as a novel prognostic biomarker in patients with digestive cancers.
Nevertheless, the role of lncRNAs and their polymorphisms in esophagogastric cancer susceptibility has only been sporadically studied. Based on previous research, we aimed to explore the potential effects of HOTAIR rs920778, LINC00951 rs11752942, POLR2E rs3787016, and HULC rs7763881 on histopathological and laboratory risk factors, and thus their prognostic significance, in an EC population of European/Greek descent.
2. Materials and Methods
2.1. Study Design
This tertiary referral hospital-based case-control study was designed according to the ‘Strengthening the Reporting of Observational Studies in Epidemiology’ (STROBE) Guidelines (Supplementary Material: Table S1) [22]. The developed research protocol was strictly followed by all participating authors/researchers. Approval was obtained prior to the start of the study from the Institutional Review Board of Laiko General Hospital and the Ethics Committee of the School of Medicine, National and Kapodistrian University of Athens (NKUA), Greece (IRB no: 18.01.2018/24), covering both case and control populations. All performed procedures were consistent with the ethical standards of the Helsinki Declaration 1964 and later versions. All participants provided informed consent prior to enrollment.
The study was conducted over a nine-year period, with the recruitment phase set between 25 March 2014 and 25 September 2018 and the follow-up phase with an end-date of 30 June 2023 for all recruited subjects. Two independent authors (EB, MB) extracted the data retrospectively from our prospectively collected Upper GastroIntestinal (UGI) Cancer database, including data from theatres, surgical/medical records, electronic/paper notes from inpatient and outpatient visits, and investigations performed in both public and private sectors. Discrepancies in data extraction were resolved after consensus with a third independent author (AM).
2.2. Patient Selection: Inclusion-Exclusion Criteria
All consecutive adults who underwent surgery for histologically confirmed malignancy involving the middle-third or lower-third part of the esophagus, or the esophagogastric junction (EGJ) (Siewert I–III) [23], at the Department of UGI Surgery, Laiko General Hospital, School of Medicine, NKUA, Greece, were deemed eligible for inclusion. The clinical and pathological staging, as well as all the included definitions, conform to the AJCC/UICC Guidelines, TNM 8th edition [6]. As such, cancers crossing the EGJ with their epicenter in the proximal 2 cm of the stomach (EGJ-Siewert II) were staged and treated as EC, whereas cancers crossing the EGJ with their epicenter in the proximal 2 to 5 cm of the stomach (EGJ-Siewert III) were staged and treated as GC. All patients were risk-assessed and clinically staged with physical examination, tumor markers, computed tomography, and gastroscopy. At the time of diagnosis, all were evaluated by our dedicated Cancer Multidisciplinary Team, which formulated the appropriate multimodal treatment strategy as per the international National Comprehensive Cancer Network (NCCN) [24] and European Society for Medical Oncology (ESMO) [25] guidelines. Exclusion criteria were: (a) non-adults, (b) cancer located in the cervical esophagus, (c) patients submitted to emergency surgery, and (d) a family history of esophagogastric malignancy.
Our control group comprised community subjects recruited from the Department of Molecular Biology, School of Medicine, NKUA, Greece, with no self-reported history of cancer at any site. Cases and controls were unmatched; all were of European/Greek ancestry and resided in Greece.
2.3. Data Extraction: Primary-Secondary Variables of Interest
Our study endpoints were to identify potential associations of four lncRNA polymorphisms (HOTAIR rs920778, LINC00951 rs11752942, POLR2E rs3787016, and HULC rs7763881) with histopathological (primary endpoint) and laboratory (secondary endpoint) prognostic markers in esophagogastric cancer in a western population such as that of Greece. To this end, our histopathological variables of interest encompassed histologic subtype (EAC or ESCC), grade of differentiation (Gx–3 status), Perineural Invasion (PNI), Lymphovascular Invasion (LVI), and Perivascular Invasion (PVI) status, presence or absence of Signet-Ring-Cell (SRC), and Intestinal or Diffuse subtypes in the EAC subpopulation. Additionally, our laboratory variables of interest encompassed preoperative serum levels of the tumor markers Carcinoembryonic Antigen (CEA) and Carbohydrate antigen 19.9 (Ca19.9) in the whole EC cohort. CEA was defined as positive at >5 ng/mL and Ca19.9 as positive at >37 U/mL.
Further data extracted from our UGI Cancer Database were: (1) demographics such as age, gender, and surgical candidates’ preoperative health as per the American Society of Anesthesiologists’ classification (ASA I–V) [26]; (2) primary tumor location and neoadjuvant chemoradiotherapy, if offered; (3) date/type of surgical operation and lymphadenectomy extent; (4) final histopathological characteristics such as tumor size and location, LN harvest and infiltration, AJCC stage, neoadjuvant treatment effect, and proximal, distal (R1–3), and circumferential resection margins (CRM, mm) as per World Health Organization (WHO) and College of American Pathologists (CAP) recommendations [27]; (5) minor and major complications (%, 90-day), in-hospital mortality (90-day), follow-up length (months), adjuvant treatment where applicable, date/type of recurrence, disease-free survival (DFS, months), and overall survival (OS, months). Complication severity was based on the Clavien–Dindo classification system [28], with minor complications defined as Grade ≤II and major as Grade ≥IIIa. The recurrence date was set as the date of the first investigation documenting the recurrence/metastasis. DFS was defined as the period between the surgery date and the first recurrence date. OS was defined as the period between the operation date and the patient’s death.
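The DFS and OS definitions above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the conversion of days to months (365.25/12 days per month), and the handling of patients without an event (censored at death or at the 30 June 2023 end of follow-up) are assumptions made for illustration only.

```python
from datetime import date
from typing import Optional

CENSORING_DATE = date(2023, 6, 30)  # end of follow-up stated in the study
DAYS_PER_MONTH = 365.25 / 12        # assumption: average calendar month length


def months_between(start: date, end: date) -> float:
    """Elapsed time between two dates, expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def survival_times(surgery: date,
                   recurrence: Optional[date] = None,
                   death: Optional[date] = None,
                   censor: date = CENSORING_DATE) -> dict:
    """Return DFS and OS in months together with event indicators.

    Assumption: a patient without a documented recurrence is censored for DFS
    at the last follow-up (death date if known, otherwise the censoring date).
    """
    last_follow_up = death if death is not None else censor
    dfs_end = recurrence if recurrence is not None else last_follow_up
    return {
        "dfs_months": months_between(surgery, dfs_end),
        "dfs_event": recurrence is not None,
        "os_months": months_between(surgery, last_follow_up),
        "os_event": death is not None,
    }


if __name__ == "__main__":
    # Hypothetical patient: surgery on 1 Jan 2015, recurrence on 1 Jan 2016, alive at censoring
    print(survival_times(date(2015, 1, 1), recurrence=date(2016, 1, 1)))
```

The event indicators are returned alongside the durations because time-to-event methods such as Cox regression and the log-rank test need both.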
2.4. Sample Collection and Preparation for Genetic Analysis
The most senior surgeon (TL) supervised all surgical operations performed during the recruitment period. After completion of each operation, the surgical tissue specimens of all enrolled patients were transferred to the First Department of Pathology, School of Medicine, NKUA, Greece. After gross pathologic examination and marking of the margins, each specimen was formalin fixed and paraffin embedded (FFPE). Standard fixation methods to preserve nucleic acid integrity were used, including fixation in 10% neutral-buffered formalin for 24–72 h. Once paraffin embedded, the tissue samples were sectioned with a microtome and mounted on glass slides for microscopic examination by the pathologists. When necessary, the sections were deparaffinized to allow staining, as with Hematoxylin and Eosin (H&E) staining. All tissue samples were reviewed by two independent pathologists. The most senior pathologist (ACL) assessed all the microscopic slides for each specimen and selected the slide and its corresponding FFPE tissue block with the highest tumor burden in preparation for nucleic acid extraction.
2.5. Genotyping of HOTAIR rs920778, LINC00951 rs11752942, POLR2E rs3787016 and HULC rs7763881
The nominated FFPE tissue blocks with the highest tumor burden were subsequently transferred to the Molecular Biology Laboratory, School of Medicine, NKUA, Greece. The percentage of tumor cells in each sample was at least 50%. One to two punches of 1 mm diameter were sampled from the FFPE blocks. The punches were deparaffinized, homogenized, and digested with proteinase K. Next, genomic DNA/RNA extraction was performed using a commercial kit for RNA extraction from FFPE samples (NucleoZOL, Macherey-Nagel, Düren, Germany). LncRNA genotypes were identified through “polymerase chain reaction-restriction fragment length polymorphism” (PCR-RFLP) or allele-specific PCR (AS-PCR), depending on the SNP.
The following oligonucleotide primers and enzymes were used for PCR-RFLP: For HOTAIR rs920778 (C>T): Forward-5′-TTACAGCTTAAATGTCTGAATGTTCC-3′ and Reverse-5′-GCCTCTGGATCTGAGAAAGAAA-3′ with restriction endonuclease MspI. For POLR2E rs3787016 (T>C): Forward-5′-CATCAACATCACGCAGCACG-3′ and Reverse-5′-CCCTGTCCTCCAAGCACTCAT-3′ with restriction endonuclease NlaIII. The following oligonucleotide primers were used for AS-PCR: For LINC00951 rs11752942 (A>G): Forward-A: 5′-GGGGCAAGAAGGTCAATA-3′, Forward-G: 5′-GGGGCAAGAAGGTCAATG-3′ and Reverse: 5′-GGGAATCTGCTGGGCT-3′. For HULC rs7763881: Forward-A: 5′-TGTAGTTCCAGTTTGTCTGAA-3′, Forward-C: 5′-TGTAGTTCCAGTTTGTCTGAC-3′ and Reverse: 5′-TGAACAAGTTGGTTGATCTTTAGC-3′.
The most senior molecular biologist (MG) designed and supervised the experiments as per the published methodology in [29,30].
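As an illustration of the AS-PCR design listed in Section 2.5, the following Python sketch checks that each pair of allele-specific forward primers is identical except for the 3′-terminal base, which is the position that discriminates the two alleles. The primer sequences are copied from the text. The function names and the genotype-calling rule (amplification in one reaction, the other, or both) are assumptions describing general AS-PCR logic, not the authors' laboratory protocol.

```python
# Allele-specific forward primers as listed in Section 2.5 (5'->3')
AS_PCR_FORWARD_PRIMERS = {
    "LINC00951 rs11752942": ("GGGGCAAGAAGGTCAATA", "GGGGCAAGAAGGTCAATG"),
    "HULC rs7763881": ("TGTAGTTCCAGTTTGTCTGAA", "TGTAGTTCCAGTTTGTCTGAC"),
}


def discriminating_bases(primer_1: str, primer_2: str) -> tuple:
    """Return the two 3'-terminal bases of an allele-specific primer pair.

    Raises ValueError unless the primers differ only at their last base.
    """
    if len(primer_1) != len(primer_2) or primer_1[:-1] != primer_2[:-1]:
        raise ValueError("primers must be identical except for the 3' base")
    if primer_1[-1] == primer_2[-1]:
        raise ValueError("primers do not discriminate between alleles")
    return primer_1[-1], primer_2[-1]


def call_genotype(amplified_1: bool, amplified_2: bool, bases: tuple) -> str:
    """Hypothetical genotype call from the two allele-specific reactions."""
    first, second = bases
    if amplified_1 and amplified_2:
        return first + second   # both primers extend: heterozygote
    if amplified_1:
        return first + first    # only the first primer extends
    if amplified_2:
        return second + second  # only the second primer extends
    raise ValueError("no amplification in either reaction (failed assay)")


if __name__ == "__main__":
    for snp, (p1, p2) in AS_PCR_FORWARD_PRIMERS.items():
        print(snp, discriminating_bases(p1, p2))
```

Genotypes are labeled here by the bases on the primer strand; whether that strand matches the reference allele notation of each SNP is not asserted by this sketch.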
2.6. Statistical Analysis
Descriptive statistics were calculated for all recorded parameters. Continuous variables are reported as means with standard deviations or medians with ranges, and categorical variables as proportions. We assessed the relationship between lncRNA gene polymorphisms and EC and EAC susceptibility by determining the genotype and allele frequencies for all variables of interest in both cases and controls. Genotype frequencies were compared using Fisher’s exact test with Yates’ continuity correction. Odds ratios (ORs) and 95% confidence intervals (95% CI) were calculated using the approximation of Woolf. To summarize the ORs of the four polymorphisms, we applied five genetic models: allele contrast, homozygous, heterozygous, dominant, and recessive (AA, homozygotes for the common allele; AB, heterozygotes; BB, homozygotes for the rare allele). Survival analysis was performed using Cox proportional hazards models for both continuous and categorical variables and log-rank tests for categorical variables. All p-values were two-tailed, and p < 0.05 was considered statistically significant. The censoring date was 30 June 2023. We observed no significant deviation from the numbers expected under Hardy–Weinberg equilibrium in any of the participating cases. Statistical analysis was performed using R, version 4.0.4 (R Project for Statistical Computing).
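To make the five genetic models and the Woolf approximation concrete, the sketch below computes an OR with a Woolf (logit) 95% CI for each model from genotype counts. It is written in Python for illustration, whereas the study itself used R. The function names and the genotype counts in the example are hypothetical and are not study data.

```python
import math


def woolf_or(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Odds ratio with Woolf (logit) confidence interval for a 2x2 table.

    a, b = exposed / reference counts in cases
    c, d = exposed / reference counts in controls
    """
    if min(a, b, c, d) == 0:
        raise ValueError("Woolf's approximation is undefined with a zero cell")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def genetic_model_ors(cases: dict, controls: dict) -> dict:
    """ORs under five genetic models; inputs hold counts for 'AA', 'AB', 'BB'.

    AA = common-allele homozygotes, AB = heterozygotes, BB = rare-allele homozygotes.
    """
    def alleles(g):
        # (rare-allele count, common-allele count) across all genotypes
        return 2 * g["BB"] + g["AB"], 2 * g["AA"] + g["AB"]

    splits = {
        "allele contrast": alleles,                               # B vs A
        "homozygous": lambda g: (g["BB"], g["AA"]),               # BB vs AA
        "heterozygous": lambda g: (g["AB"], g["AA"]),             # AB vs AA
        "dominant": lambda g: (g["AB"] + g["BB"], g["AA"]),       # AB+BB vs AA
        "recessive": lambda g: (g["BB"], g["AA"] + g["AB"]),      # BB vs AA+AB
    }
    return {name: woolf_or(*split(cases), *split(controls))
            for name, split in splits.items()}


if __name__ == "__main__":
    # Hypothetical genotype counts, for illustration only
    cases = {"AA": 30, "AB": 50, "BB": 20}
    controls = {"AA": 60, "AB": 50, "BB": 10}
    for model, (or_, lo, hi) in genetic_model_ors(cases, controls).items():
        print(f"{model}: OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The Woolf interval is symmetric on the log-odds scale and breaks down when any cell is zero, which is why the sketch raises an error in that situation instead of applying a correction the text does not describe.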
4. Discussion
Several risk-assessment models have been designed in an effort to decipher esophageal cancer heterogeneity and contain its unpredictability to ultimately guide a targeted curative treatment and accurately predict outcomes. The Tumor-Node-Metastasis (TNM) classification system is the most widely accepted risk-assessment modality to classify esophageal malignancy and thereby assist in prognostic cancer staging. The most used TNM classification system in the East is the ‘Japanese Classification of Esophageal Cancer’ by the Japanese Esophageal Society (JES) [31], whereas in the West it is the ‘TNM Cancer Staging Manual’ by the AJCC/UICC.
In 2021, Ozawa et al. [32] demonstrated that for all patients included in their study (of note, 93% of whom had ESCC), the AJCC 8th edition staging system tended to reflect survival more precisely than the JES 11th edition, particularly for lower thoracic esophageal tumors. While the main difference between these two prognostic systems is in the definition of the LN stage [33], there are also a few notable differences in how these systems utilize prognostic factors such as G and LVI status and incorporate them in their risk-assessment prognostic models. Even when these parameters are taken into consideration, the disease volatility still cannot be fully explained or controlled, suggesting that the genetic footprint may also contribute to malignant transformation of the esophageal epithelium; as such, its role warrants reinvestigation [34].
Emerging literature suggests that lncRNAs and SNPs occurring in their functional regions influence the pathophysiology of esophageal oncogenesis [13,35]. Yet, the results obtained so far have been controversial, inconclusive, or limited by small sample sizes [36]. Moreover, most of the published research investigating the association of lncRNA SNPs with esophageal carcinogenesis has been restricted to Eastern ethnicities, where the ESCC subtype predominates [37]. Based on current epidemiological evidence, we hypothesized that similar lncRNA SNPs may also play a role in EAC aetiopathogenesis in a western subpopulation such as that of Greece. To test this hypothesis, we explored the frequency of four lncRNA SNPs in surgically treated EC patients and healthy controls of European/Greek ancestry. We further sought to evaluate the underlying molecular basis of histopathological and laboratory prognostic cancer risk markers by ascertaining the SNP frequencies in these subgroups through subset statistical analysis.
HOX transcript antisense RNA (HOTAIR) is a 2.2-kilobase lncRNA located on chromosome 12q13.13, transcribed from the homeobox C gene (HOXC) locus [38]. Accumulating research is drawing attention to the correlation between HOTAIR SNPs and the risk for various cancer types, but the results obtained so far have been equivocal [39]. In 2019, in Tian’s review, comprising 107 meta-analyses and 6 genome-wide association studies, the association of the HOTAIR rs920778 T allele with ESCC risk was rated as supported by strong evidence, yet all included studies were performed on a single ethnic group (Asian) [40]. Conversely, in 2020, Minn et al. [41] concluded that HOTAIR rs920778 did not contribute to cancer incidence, either overall or by cancer type, in a Japanese population. Taking into account previous evidence, we performed a case-control study analyzing the distribution of HOTAIR rs920778 genotype frequencies in both EC patients and healthy controls, which yielded no significant overrepresentation in our EC population in terms of the EAC versus ESCC, PNI, LVI, and PVI prognostic variables, as opposed to studies implicating HOTAIR in LVI in cancers such as cervical cancer [42,43]. In line with Zhang et al. [21], HOTAIR was significantly associated with G status in the whole EC cohort, while in EAC the T allele frequency differed significantly in the Gx–G1 subgroup versus controls and in the G2–G3 versus Gx–G1 subgroups. The latter ORs were ambiguous, possibly due to the small sample size of the Gx–G1 subgroup, and therefore this finding needs to be interpreted with caution. While we found no association with Intestinal or Diffuse types in our cohort, as revealed by Petkevicius et al. in Lithuanian gastric cancer subjects in 2022 [44], the T allele was significantly overrepresented in the SRC-positive patients when compared with the cancer-free controls. Additionally, while we demonstrated no correlation between the HOTAIR SNP and CEA, we identified an increased frequency of both the CT and T genetic variants in Ca19.9-positive EC patients when compared with both the healthy controls and the Ca19.9-negative patients, indicating that they may share a common genetic pathway with increased susceptibility to esophagogastric cancer or a more dismal prognosis.
LINC00951 is a lncRNA located on chromosome 6p21.2, informally known as lincRNA-uc003opf.1. A variant genotype of rs11752942 in the lincRNA-uc003opf.1 exon has been reported to be associated with cancer risk. The rs11752942 A>G (G/A) polymorphism may affect cell proliferation and tumor growth, thereby promoting susceptibility to ESCC, as per Wu et al.’s [45] genotyping results among 52 studied SNPs. LINC00951 rs11752942 was also shown to be related to the incidence of head and neck cancers in adults [46] as well as of neuroblastoma in children [47] in Asia. Taking these into consideration, we conducted a case-control study to determine a possible association between this polymorphism and EC/EAC risk in a Greek population. As opposed to studies of Asian background, our analysis investigating the molecular effects of the LINC00951 polymorphism on both the histopathological and laboratory prognostic markers of interest did not uncover statistically significant evidence of an association between rs11752942 and cancer susceptibility in any of the genetic models AG/AA, GG/AA, and G/A alleles. However, after logarithmic transformation of CEA and Ca19.9, the GG variant was found significantly less frequently in patients with elevated Ca19.9, implying that it may have a protective effect not only on esophagogastric cancer prognosis but also on the molecular behavior of other Ca19.9-producing malignancies [48].
SNP rs3787016 (A>G or its complementary T>C, C/T), localized in the fourth intron of the RNA polymerase II subunit E (POLR2E) lncRNA gene, has been implicated in cancer susceptibility either by predisposing to GC [49] and breast and cervical cancer [50], or by protecting against ESCC [51] in Chinese populations. As no study to date has explored the molecular impact of POLR2E rs3787016 on histopathological and laboratory prognostic factors in EC/EAC in the west, we conducted a case-control study in a population of Greek/European ancestry. Our analysis by histological subtype showed that CC and C allele carriers were significantly more frequent among ESCC and less frequent among EAC patients, suggesting that the variant may be a risk factor for the former and a protective factor for the latter. These genetic variants were also significantly less frequent in the PNI-, LVI-, and PVI-positive EAC subsets compared with the healthy controls, as well as in the LVI-positive whole EC cohort. Furthermore, in our subgroup analysis assessing the molecular basis of the SRC and Diffuse/Intestinal prognostic cancer factors, the C allele was also underrepresented in the positive groups. The CT variant was found significantly more frequently among the patients with elevated CEA, implying that it may affect not only esophagogastric cancer prognosis but also the molecular behavior of other CEA-producing malignancies, in a genetic pattern similar to that of KIF26B non-coding RNA in colon cancer [52].
While Kang et al. [51] demonstrated that HULC rs7763881 was a protective prognostic factor against ESCC among younger male patients, Hong et al. [53] suggested an increased GC susceptibility; both studies were conducted in Chinese populations. The hepatocellular carcinoma up-regulated lncRNA (HULC) gene is located on chromosome 6p24.3, with two exons and a length of 1638 bp. Given the literature’s contradictory results, we conducted our case-control study to explore its prognostic significance in the EC/EAC genetic footprint in a western ethnicity. In contrast to the previous studies, our analysis, investigating the molecular effects of the HULC polymorphism on both the histopathological and laboratory prognostic markers of interest in terms of EAC/ESCC subtypes, PNI, LVI, PVI, G status, SRC, and Intestinal versus Diffuse subtypes, as well as CEA and Ca19.9, revealed no association in any of the genetic models AC/AA, CC/AA, and C/A alleles. This could be explained by the predominance of the EAC subtype in our cohort, or it may reflect a truly different genetic footprint that needs to be confirmed by additional future studies in the west.
Certain limitations apply to this research article. Since it was a hospital-based case-control study with the large majority of EC cases and healthy controls coming from the Attica Region, inherent selection bias may have occurred. The statistical power of this case-control study may also be limited by the sample size, particularly with respect to the statistical analyses of the subgroups positive or negative for the prognostic factors of interest, where the smaller sizes may have affected the reliability of the data. We sought to overcome these small-study effects by comparing all the SNP distributions of the subgroups with our control sample (n = 121), as well as by combining statistical tests to further corroborate our statistically significant results or the trends observed throughout the report. Finally, although this case-control study was retrospective by definition, all data were extracted from our prospectively collected UGI cancer database following a predetermined research protocol to ensure appropriate methodology. Additional strengths of the study were the length of follow-up and the high case ascertainment, which enabled us to correlate SNPs with oncological outcomes such as tumor progression, metastasis, and overall survival.
5. Conclusions
In conclusion, the frequency of HULC rs7763881 did not differ in any of the EC prognostic subgroups compared with the healthy community subjects. The LINC00951 rs11752942 GG variant was significantly underrepresented in the subgroup of patients with elevated Ca19.9, indicating that it may serve as a prognostic marker with protective potential not only for esophagogastric cancer but also for other Ca19.9-secreting malignancies. The HOTAIR rs920778 TT and T genotypes were significantly associated with prognostic factors such as G differentiation grade and SRC status, whereas the CT and T genotypes were associated with the subgroup of patients with elevated Ca19.9, suggesting that HOTAIR may serve as a potential therapeutic suppression target against esophagogastric cancer, in addition to estimating prognosis in Ca19.9-secreting malignancies. Regarding POLR2E rs3787016, the CC and C genotypes were significantly correlated with histological subtypes such as ESCC, EAC, SRC, and Diffuse, as well as with prognostic variables in the form of PNI, LVI, and PVI, whereas the CT variant was associated with CEA. This indicates that in the future it may be used to evaluate esophagogastric cancer predisposition and to predict response to treatment and prognosis in CEA-secreting malignancies.
Overall, the present study demonstrates that polymorphisms of the lncRNAs LINC00951, HOTAIR, and POLR2E may exert a genetic influence on, and as such may explain a fraction of, the molecular basis of EC and EAC. Implementation of these genetic models as part of the clinical and pathological risk-assessment process may add to the efficiency and efficacy of the currently utilized prognostic models. Prospective multicenter studies with larger sample sizes are required to validate these findings.