Genetic Variants of HOTAIR Associated with Colorectal Cancer: A Case-Control Study in the Saudi Population

Genetic polymorphism in the long noncoding RNA (lncRNA) HOTAIR is linked with the risk of and susceptibility to various cancers in humans. The mechanism involved in the development of colorectal cancer (CRC) is not fully understood, but single nucleotide polymorphisms (SNPs) can be used to predict its risk and prognosis. In the present case-control study, we investigated the relationship between HOTAIR polymorphisms (rs12826786, rs920778, and rs1899663) and CRC risk in the Saudi population by genotyping 144 CRC cases and 144 age- and sex-matched controls with a TaqMan genotyping assay. We found a significant (p < 0.05) association between SNP rs920778 (G > A) and CRC risk, whereas SNPs rs12826786 (C > T) and rs1899663 (C > A) showed a protective role. The homozygous mutant “AA” genotype at rs920778 (G > A) showed a significant correlation with female sex and colon tumor site. The homozygous TT genotype at rs12826786 (C > T) showed a significant protective association in males, as did the homozygous AA genotype at rs1899663 (C > A) for the colon tumor site. These results indicate that HOTAIR can be a powerful biomarker for predicting the risk of colorectal cancer in the Saudi population. This is the first report of an association between HOTAIR gene polymorphisms and the risk of CRC in the Saudi population.


Introduction
Cancer, in general, is the second leading cause of death in the world, and colorectal cancer (CRC) ranks third after lung and breast cancer. In Saudi Arabia, colorectal cancer is the second most common cancer after breast cancer in females and the most common cancer in males [1]. The mechanism involved in the development of CRC is very complicated, and both genetic and environmental aspects are considered risk factors [2]. Although the exact cause is still not known, its occurrence is associated with nonmodifiable risk factors such as age, gender, and genetics, and with modifiable factors including environment and lifestyle [3][4][5]. In the human genome, single nucleotide polymorphisms (SNPs) are considered the major genetic variants used to predict cancer risk and prognosis [6,7]. Almost 10 million SNPs are reported with a frequency of 1 in 300 nucleotides in the genome. SNPs located in the genes that regulate metabolism, immunity, or the cell cycle are usually associated with genetic susceptibility to cancer [8]. Understanding the relationship between SNPs and cancer susceptibility points toward the molecular pathogenesis of various cancers. SNPs are considered prospective diagnostic and therapeutic biomarkers in cancer [8]. In a gene, SNPs can be located in promoter regions, exons, introns, or even the 5′ and 3′ UTRs, so they can alter gene expression [9][10][11][12][13]. SNPs present in long noncoding RNAs (lncRNAs) are reported to be associated with cancer risk as they may change the structure and expression levels of the lncRNA [14].
Generally, lncRNAs are RNA molecules of 200 nucleotides or more that do not encode any protein and mostly regulate protein expression. These molecules exist throughout the genome and interact with DNA, RNA, and protein to regulate protein expression through epigenetic, transcriptional, and post-transcriptional mechanisms [14][15][16]. Because lncRNAs are key regulators of gene expression, their dysregulation influences the prognosis, metastasis, and recurrence of different types of cancer. Recently, HOX transcript antisense intergenic RNA (HOTAIR), one of the most important regulatory lncRNAs in humans, has shown an association with cancer metastasis, chemotherapy responses, and the survival rate of patients [17]. Additionally, genetic polymorphism in HOTAIR is linked with the risk of and susceptibility to human cancer [18]. Many studies have reported the abnormal expression of HOTAIR in cancer tissue. The overexpression of HOTAIR activates gene-silencing pathways mediated by histone modification. In cancer tissue, HOTAIR acts as a molecular decoy for microRNAs (miRNAs) and RNA-binding proteins (RBPs) and directly regulates the target mRNA [19]. HOTAIR suppresses miR-148a by acting as a competing endogenous RNA in esophageal and epithelial cancer to promote the expression of Snail2 and to enhance cell invasion and metastasis via the epithelial-to-mesenchymal transition [20,21]. miRNA-34a is downregulated by HOTAIR in colon cancer, and HOTAIR itself is upregulated in esophageal squamous cell carcinoma [22][23][24][25][26].
Studies have found a worse prognosis in patients with upregulated HOTAIR in primary tumors or blood than in patients with low HOTAIR expression, and thus scholars have proposed it as a potential biomarker [19,27]. In CRC cells, HOTAIR is found to reduce the expression levels of E-cadherin, which results in increased levels of vimentin and matrix metalloproteinase 9 (MMP-9), both involved in invasion and metastasis [28]. Several studies have reported an association between HOTAIR genetic polymorphisms and CRC risk [19,29,30]. Several SNPs of HOTAIR act as potential cancer susceptibility loci, but no studies on the association of these SNPs in the Saudi population have been reported. The present study examined three HOTAIR polymorphisms (rs920778 G > A, rs12826786 C > T, and rs1899663 C > A) to evaluate the association between HOTAIR variants and CRC prevalence in the Saudi population. These SNPs were selected on the basis of recent reports linking them with increased cancer risk. A few molecular epidemiological studies have linked the HOTAIR rs920778 polymorphism with the risk of breast, cervical, and lung cancer; rs1899663 with lung, breast, and gastric cancer; and rs12826786 with lymphoma [31][32][33][34][35].

Subject Recruitment
In the present case-control study of 288 individuals, 144 patients with confirmed CRC and an equal number of age- and sex-matched controls were enrolled from King Khalid University Hospital (KKUH) in Riyadh, Saudi Arabia. The study was approved by the ethical committee of KKUH. Informed consent was obtained from all the subjects.

Sample Collection
A total of 4 mL of peripheral blood was collected in ethylenediaminetetraacetic acid (EDTA) tubes (Hebei Xinle Sci & Tech Co., Shijiazhuang, China) from all the participants, and genomic DNA was isolated by using a QIAamp® Mini Kit. The purity and concentration of the extracted nucleic acids were assessed from the OD ratio (A260/280 nm).

SNP Selection and Genotyping
Three SNPs (rs920778, rs1899663, and rs12826786) in the oncogene lncRNA HOTAIR on Chr. 12 were examined. Genotyping was performed using TaqMan® Genotyping and was analyzed using Quantstudio-7 (Applied Biosystems®, Life Technologies™, Carlsbad, CA, USA). The TaqMan® genotyping reaction mix (10.2 µL/reaction) contained 8.2 µL of TaqMan genotyping master mix and 2 µL of a 20 ng/µL genomic DNA sample in 96-well plates according to the manufacturer's instructions. The reactions were performed using the optimal thermocycler conditions for each target SNP: 30 s at 60 °C (preread stage), 10 min at 95 °C (hold stage), then 40 PCR cycles of 15 s at 95 °C (denaturation) and 1 min at 60 °C (annealing), followed by a postread stage of 30 s at 60 °C. The experiments were performed in triplicate as a quality control measure to verify the genotyping procedure.

Statistical Analyses
The data were analyzed using IBM SPSS Statistics for Windows, Version 23.0 (IBM Corp., Armonk, NY, USA) and Microsoft Excel®. All reported p-values were two-tailed, and a p-value less than 0.05 was considered statistically significant. The differences in the demographic variables and genotypes of the HOTAIR polymorphic variants (rs920778, rs1899663, and rs12826786) between the CRC cases and healthy controls were evaluated using the chi-squared (χ2) test. The Hardy-Weinberg equilibrium (HWE) was tested with a χ2 test by comparing the expected genotype frequencies of these HOTAIR SNPs in the Saudi population with the observed genotype frequencies. The odds ratio (OR) was used to estimate the strength of the association between HOTAIR genetic variation (SNPs) and the risk of CRC development. The 95% confidence interval (95% CI) gives the range within which the true OR is expected to lie, and an association was considered significant if the 95% CI did not include 1. The OR and 95% CI were calculated using the IHG web tool (https://ihg.gsf.de/cgi-bin/hw/hwa1.pl, accessed on 6 February 2023). The RNAsnp Web server (https://rth.dk/resources/rnasnp/, accessed on 6 February 2023) was used to predict the secondary structure of the HOTAIR variants with each SNP. A linkage disequilibrium (LD) analysis was used to determine the degree of nonrandom association between the HOTAIR SNPs. The LD analysis was carried out using Haploview.
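As an illustration of the HWE test described above, the following minimal Python sketch compares observed genotype counts with the counts expected under Hardy-Weinberg proportions. It is not the software used in this study; the function name `hwe_chi_square` and the genotype counts are hypothetical, and the p-value assumes one degree of freedom for a biallelic SNP.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt):
    """Pearson chi-square test of Hardy-Weinberg equilibrium (one biallelic SNP).

    Arguments are observed genotype counts; both alleles must be present,
    otherwise an expected count is zero and the statistic is undefined.
    """
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference-allele frequency
    q = 1.0 - p                            # alternate-allele frequency
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts for 144 controls (not taken from the study tables).
    chi2, p_value = hwe_chi_square(60, 64, 20)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

A p-value above 0.05 in the control group would be read as no detectable departure from HWE.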

Results
The clinical data of all the samples are summarized in Table 1. The CRC patients had a median age of 57 years, and the samples were divided into two groups: above 57 years old (n = 76, 52.7%) and below 57 years old (n = 68, 47.3%), with 86 males (59.7%) and 58 females (40.3%). In total, 91 patients (63.2%) were diagnosed with colon cancer, while 53 patients (36.8%) had rectal cancer. The CRC stages in this study comprised early stages (I-II) and late stages (III-IV); 61 (54%) of the CRC patients had early-stage disease, while 51 (46%) had late-stage disease. Table 2 illustrates the relationships between the genotype and CRC susceptibility, allele frequencies, and the significance of the genotype and allele distribution of the SNPs. Among the three HOTAIR SNPs studied, rs12826786 (C > T) and rs1899663 (C > A) demonstrated a statistically significant protective association (decreased odds ratio) in the Saudi CRC patients, while the third SNP, rs920778 (G > A), demonstrated a statistically significant risk association (increased odds ratio) in the Saudi CRC patients. The mutant-homozygous genotype "AA" of rs920778 showed a significant risk correlation (OR (95% CI): 2.057 (1.063-3.981); χ2 = 4.64; p = 0.03131), and the minor allele "A" demonstrated a significant risk correlation (OR (95% CI): 1.438 (1.035-1.998); χ2 = 4.71; p = 0.02994). The additive genotype "GA + AA" showed a risk association with CRC but was not significant (OR (95% CI): 1.557 (0.931-2.606); χ2 = 2.86; p = 0.09078). The mutant-homozygous genotype "TT" for rs12826786 showed a significant protective correlation (OR (95% CI): 0.276 (0.074-1.020); χ2 = 4.18; p = 0.040) in the CRC patients, as did the additive genotype "CA + AA" of SNP rs1899663 (C > A) (OR (95% CI): 0.305 (0.131-0.710); χ2 = 8.21; p = 0.00417). In the present study, the median age of the CRC and control subjects was 57.
To estimate the association of HOTAIR SNPs with age, the CRC and control samples were divided into two groups (≤ 57 years old and > 57 years old). The genotype frequencies of both groups are shown in Table 3. The SNP rs1899663 (C > A) showed a significant protective association in the CRC patients younger than 57 years old. The heterozygous variant "CA" at rs1899663 showed a significant protective association in CRC patients younger than 57 years old (OR: 0.231; χ2 = 5.9; CI: 0.066-0.801; p = 0.0151), as did the additive genotype "CA + AA" (OR: 0.28; χ2 = 4.8; CI: 0.085-0.926; p = 0.0289). The HOTAIR SNP rs920778 (G > A) was found to have a major risk association in CRC patients aged 57 years old. In CRC patients aged 57 years old, the homozygous "AA" mutant frequency was 3.3-fold higher than in healthy people (OR: 3.3; χ2 = 4.13; CI: 1.015-10.733; p = 0.043). Table 4 shows the correlation of HOTAIR SNPs with gender. The mutant-homozygous genotype "TT" at rs12826786 showed a protective association in male CRC patients (OR: 0.23; χ2 = 3.81; CI: 0.047-1.127; p = 0.051). On the other hand, a statistically significant risk association was observed in females with rs920778 (G > A), as shown in Table 4. The frequency of the homozygous mutant "AA" was 3.282-fold higher in female CRC patients than in healthy subjects (OR: 3.282; χ2 = 4.91; CI: 1.129-9.536; p = 0.02669). The frequency of the rs920778 minor allele "A" was 1.768-fold higher in female CRC patients than in controls (OR: 1.768; χ2 = 4.8; CI: 1.060-2.949; p = 0.0284). Furthermore, the correlation of HOTAIR SNPs with tumor location was studied. Samples were classified into two groups according to the location of the tumor, either in the colon or the rectum. Remarkably, rs920778 showed a significant association with tumors located in the colon, while there was no correlation between any SNPs and tumors localized in the rectum (Table 5).
The homozygous genotype "AA" at rs920778 in the patients with colon cancer showed a significant 2.3-fold higher risk compared to healthy individuals (OR: 2.332; χ2 = 4.75; CI: 1.082-5.027; p = 0.02926). The additive "GA + AA" genotype showed a significant association in the CRC patients compared to healthy individuals (OR: 1.895; χ2 = 4.24; CI: 1.027-3.497; p = 0.03940). The frequency of the minor allele "A" was also significantly higher in colon cancer patients than in the control group (OR: 1.519; χ2 = 4.84; CI: 1.046-2.206; p = 0.02785) (Table 5). Based on the results shown in Table 5, no association was observed between the HOTAIR SNPs examined (rs920778, rs1899663, and rs12826786) and rectal cancer. The correlation between HOTAIR SNPs and the staging of colorectal tumors was also studied. The stages of CRC were grouped into early stages (I-II) and late stages (III-IV), as shown in Table 6. The CRC patients with early-stage tumors showed a significantly higher risk (2.6-fold) with the homozygous variant genotype "AA" at rs920778 when compared to healthy people (OR: 2.615; χ2 = 4.73; CI: 1.085-6.304; p = 0.029). The additive "GA + AA" genotype showed a significant association in CRC patients compared to healthy individuals (OR: 2.042; χ2 = 3.86; CI: 0.994-4.195; p = 0.0493). The frequency of the minor "A" allele also showed a significantly higher association in the CRC patients compared to healthy controls (OR: 1.604; χ2 = 4.76; CI: 1.047-2.455; p = 0.0292), while the SNPs rs1899663 and rs12826786 did not show any significant correlation with the CRC patients. The homozygous variant genotype "AA" at rs1899663 showed a significant protective association in late-stage CRC tumors compared to healthy individuals (OR: 0.344; χ2 = 4.11; CI: 0.119-0.993; p = 0.0426). However, none of the other HOTAIR SNPs studied (rs920778 and rs1899663) showed a significant association in late-stage CRC tumors.
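The OR, 95% CI, χ2, and p-values reported above all derive from 2 × 2 tables of allele or genotype counts. The sketch below shows one common way to compute them in Python; it assumes a Woolf (log-scale) confidence interval and an uncorrected Pearson χ2 with one degree of freedom, which may differ from the exact procedure of the IHG web tool. The function names and the allele counts are hypothetical and are not taken from Tables 2-6.

```python
import math


def chi2_p_value_1df(chi2):
    """Two-sided p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def odds_ratio_2x2(case_with, case_without, ctrl_with, ctrl_without):
    """Odds ratio, Woolf 95% CI, Pearson chi-square and p-value for a 2x2 table.

    All four cells must be non-zero for the log-scale interval to be defined.
    """
    a, b, c, d = case_with, case_without, ctrl_with, ctrl_without
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, ci_low, ci_high, chi2, chi2_p_value_1df(chi2)


if __name__ == "__main__":
    # Hypothetical allele counts (288 alleles per group): minor vs. major allele.
    or_, lo, hi, chi2, p = odds_ratio_2x2(120, 168, 96, 192)
    print(f"OR = {or_:.3f} (95% CI {lo:.3f}-{hi:.3f}); chi2 = {chi2:.2f}; p = {p:.4f}")
```

Under this convention, an interval that excludes 1 corresponds to a significant association at the 5% level; because the interval and the χ2 test rest on different approximations, the two can disagree in borderline cases.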
A linkage disequilibrium (LD) analysis was performed to identify the LD between the SNPs. The LD blocks showed a very low association among the analyzed SNPs (Figure 1). The red color indicates a higher D' value. The selected SNPs showed higher D' values in the controls. All three HOTAIR SNPs studied showed an r2 = 1 value in association with other SNPs. The r2 values indicate that these loci are coinherited in LD.
Secondary structures of HOTAIR and base-pair probabilities were predicted using the RNAsnp Web server, as shown in Figure 2. RNAsnp predicted that a mutation at rs920778 would change the RNA secondary structure of the HOTAIR lncRNA. For the rs920778 G > A allele substitution, RNAsnp predicted a shift in the minimum free energy (MFE) from −111.10 to −110.80 kcal/mol. The base-pair probabilities of the rs920778 wild type G allele and variant A allele were also different. For the rs12826786 C > A allele substitution, RNAsnp predicted a shift in the MFE from −157.70 to −156.20 kcal/mol. The base-pair probabilities of the rs12826786 wild type C allele and variant A allele were also different. For the rs1899663 C > T allele substitution, RNAsnp predicted a shift in the MFE from −120.90 to −121.80 kcal/mol. The base-pair probabilities of the rs1899663 wild type C allele and variant T allele were not significantly different (p-value = 0.0978).

Discussion
HOTAIR has gained widespread recognition as a functional lncRNA involved in a number of malignancies. It is situated on chromosome 12 within the Homeobox C gene cluster [36]. Research has shown that HOTAIR interacts with epigenetic regulators such as the lysine-specific demethylase 1A (LSD1) and polycomb repressive complex 2 (PRC2) complexes to regulate the epigenetic silencing of several cancer-related genes, including the HOXD gene [37,38]. Researchers mainly focus on the deregulation of HOTAIR in many forms of cancer due to its remarkable effect on epigenetic regulation at the genome-wide level. In the present study, we analyzed the influence of genetic variations in HOTAIR on the risk of developing CRC in the Saudi population. Increased HOTAIR expression is reported in many different types of cancer tissue samples, with higher levels in the metastatic stage. Many in vivo and in vitro studies have shown that the upregulation of HOTAIR expression is accompanied by enhanced tumor invasion and metastasis [39][40][41]. In CRC patients, many studies have revealed a higher expression of HOTAIR in CRC tissue than in corresponding noncancerous tissue [42][43][44]. The SNPs rs920778 and rs12826786 are reported to correlate with HOTAIR upregulation [31][32][33]. The AA genotype of the rs920778 SNP (G > A), situated in the intronic enhancer region, can increase the expression of HOTAIR [45]. Here, we report a strong association of rs920778 G > A with increased CRC risk in the Saudi population; the homozygous mutant genotype "AA", a potential biomarker, showed a significant association with the risk of CRC, particularly with gender, age, and location. Some previous studies have reported the association of the rs920778 SNP with increased cancer risk in Turkish, Indian, and Chinese populations [32,34,45]. Recently, an association of HOTAIR rs920778 polymorphisms with higher survival rates was reported in CRC patients from South Korea [19] and in breast cancer patients in southeast Iran [46].
The AA genotype of the HOTAIR rs1899663 SNP (C > A), situated in an intronic region, can alter the binding affinity of several transcription factors, including paired box 4, spermatogenic leucine zipper 1, and zinc finger protein 281, resulting in increased expression levels of HOTAIR [47]. We observed a significant protective association of the additive genotype "CA + AA" of HOTAIR rs1899663 C > A with CRC susceptibility in Saudi CRC patients, and the minor allele "A" showed a decreased OR in patients who had tumors located in the colon. The homozygous variant genotype "AA" was associated with late-stage tumors (III-IV), and the genotype "CA + AA" showed a significant protective association in younger patients while it had a nonsignificant risk association in patients older than 57 years of age, which suggests that rs1899663 had a protective role in younger Saudi patients. Earlier, in the South Korean population, the TT genotype was reported to increase mortality in CRC patients who displayed tumors in the colon region only [19]. Wang et al. [33] reported that rs1899663 (G > T) increased lung cancer risk in China, and Hassanzarei et al. [46] reported a negative association of rs1899663 (G > T) with breast cancer in Iran. Additionally, the rs1899663 polymorphism is linked with breast cancer risk in Indian and Chinese populations and prostate cancer susceptibility in the Iranian population [32,34,47].
The genotype TT at rs12826786 can be considered a protective mutation against susceptibility to CRC in the Saudi population as it was associated with a reduced risk of CRC in male patients and patients with advanced tumors. Our results are consistent with findings in the Iranian population [39], where an rs12826786 (C > T) polymorphism showed a protective association with breast cancer. Kashani et al. [35] did not find any association between HOTAIR rs12826786 and lymphoma risk, while Hassanzarei et al. [46] reported a decreased risk of breast cancer with rs12826786 polymorphisms.
An LD analysis was performed for these three SNPs (rs920778 G > A, rs12826786 C > T, and rs1899663 C > A) to determine the potential for the nonrandom association of alleles in the Saudi population. The LD pattern among the SNPs was measured using D' and the correlation coefficient r2. Our results showed LD blocks among the three SNPs selected; the r2 values indicate that these loci are coinherited and that there was a tendency for alleles to be transmitted together. However, there was no clear association between these alleles in the Saudi population. The RNAsnp prediction analysis indicated that the RNA secondary structures of the HOTAIR genotypes were slightly changed, suggesting that these SNPs may participate in colorectal cancer via the alteration of the HOTAIR lncRNA secondary structure. This may contribute to the reduced efficiency of their function.
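Because D' and r2 are distinct LD measures, a short worked example may help. The following Python sketch computes both for a pair of biallelic loci, assuming the haplotype frequency is already known; Haploview estimates haplotype frequencies from unphased genotypes, a step this sketch does not reproduce. The function name `ld_stats` and the frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    # D is normalized by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies: allele A (20%) occurs only on haplotypes carrying B (50%).
    d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.5)
    print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

In this hypothetical case D' reaches its maximum while r2 stays well below 1, because r2 also depends on how closely the allele frequencies at the two loci match.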

Conclusions
This study is the first to analyze the association between HOTAIR gene polymorphisms and the prevalence of CRC in the Saudi population. Our results suggested a strong association of rs920778 with increased CRC risk, while rs12826786 and rs1899663 showed a protective role in this population. HOTAIR polymorphism can represent a useful biomarker for the early diagnosis of CRC. Our sample size was small; thus, our findings must be confirmed in a larger number of samples. In the present study, we collected information about the patients' age, gender, tumor stage, and tumor location; the main limitation of this study is that familial CRC history was not considered. We hope that this work will help improve the understanding of cancer mechanisms and stimulate the discovery of therapeutic targets that are less harmful than chemotherapy.