1. Introduction
Colorectal cancer (CRC) is a significant public health concern worldwide. It is the third most commonly diagnosed malignancy and the second leading cause of cancer-related deaths globally, with significant geographic variation in incidence and mortality rates [1]. Developed countries in Europe, North America, and Australia report particularly high rates, reflecting the combined impact of lifestyle factors, screening programs, and healthcare systems. In Europe, Hungary has one of the highest incidence and mortality rates. According to the Global Cancer Observatory, the age-standardized incidence rate (ASR) for 2022 was approximately 62.4 and 30.4 per 100,000 for men and women, respectively, and the mortality rate reached 20.2 per 100,000 overall, which places Hungary among the top countries globally [1]. For the same year, a total of 11,020 new cases were diagnosed according to the Hungarian National Cancer Registry [2]. The high rates may reflect late-stage diagnosis, which can occur due to inadequate screening programs and public awareness. These data underline the urgent need for improved strategies for prevention and early detection in the Hungarian population. Recently, increasing focus has been placed on the contribution of genetic and epigenetic factors, including long non-coding RNAs (lncRNAs), to CRC pathogenesis. Long non-coding RNAs, a class of RNA molecules longer than 200 nucleotides that do not code for proteins, have emerged in the past few years as key regulators of gene expression at multiple levels, including chromatin remodeling [3,4], transcription, and post-transcriptional modifications [5,6,7]. Dysregulation of lncRNAs has been linked to tumor initiation, progression, metastasis, and chemoresistance in several studies [8,9,10,11,12,13]. Importantly, lncRNA genes are often affected by single-nucleotide polymorphisms (SNPs), which may have an impact on their expression or function, thereby modifying individual susceptibility to cancer [14,15,16,17,18,19]. Among the lncRNAs connected to CRC, several have been intensively studied [20,21,22]. Colon cancer-associated transcript 1 (CCAT1) and colon cancer-associated transcript 2 (CCAT2), both located on chromosome 8q24.21, promote tumorigenesis by enhancing MYC expression and activating WNT/β-catenin signaling, and both show elevated expression in tumor tissues [23,24]. H19 is an imprinted lncRNA with increased expression reported in CRC in both serum samples [25] and tumor tissue [26,27], contributing to invasion, metastasis, and poor prognosis. HOX transcript antisense RNA (HOTAIR), transcribed from the HOXC locus, drives epigenetic silencing of tumor suppressor genes [28] and is associated with advanced disease and poor survival [29,30,31,32], based on tumor tissue experiments. Conversely, papillary thyroid carcinoma susceptibility candidate 3 (PTCSC3) is well established as a tumor-suppressive lncRNA that acts partly through the inhibition of oncogenic signaling pathways; it is consistently downregulated in thyroid carcinoma and has also been reported in several digestive malignancies [33,34,35,36,37]. However, its potential role in CRC remains less defined, with current evidence largely limited to associative observations rather than mechanistic validation.
These molecules also hold considerable promise as diagnostic and prognostic biomarkers, reflecting their altered expression in serum and their high stability in tissue models. Notably, several lncRNAs can be detected in circulation; for example, CCAT1 and CCAT2 levels in serum extracellular vesicles or exosomes closely parallel their abundance in tumor tissue, and their elevated concentrations in both compartments are associated with diagnostic and prognostic significance [38,39].
Given the high CRC burden in Hungary and the growing evidence of lncRNA-related genetic susceptibility, this study aimed to investigate particular SNPs within these lncRNA genes in a Hungarian case–control cohort. These SNPs were chosen a priori based on previously published reports (Table 1) to explore whether there are differences in genotype frequencies between Hungarian patients with colorectal lesions and non-cancer controls. Based on previous evidence implicating these loci in colorectal carcinogenesis, we hypothesized that the selected SNPs would be associated with the risk of colorectal neoplasms in our study population.
2. Materials and Methods
2.1. Patients and Samples
We recruited Hungarian patients from the Department of Surgery of the University of Pécs Medical School starting in September 2022. Informed consent was obtained from all subjects. A total of 4 mL of whole blood was collected in ethylenediaminetetraacetic acid (EDTA) tubes (Greiner Bio-One International GmbH, Kremsmünster, Austria) and stored at −20 °C until DNA isolation. Patients with confirmed colon or rectal cancer or lesions were enrolled in the case group, while individuals without cancer were enrolled as controls. The exclusion criterion for cases was the clinically proven presence of any neoplasm other than colorectal lesions; for controls, it was the presence of any cancer. Both groups consisted of Hungarian individuals aged 28 to 93 years, with no restriction on sex (male-to-female ratios were 1:1.24 in cases and 1:1.65 in controls). The study was approved by the Scientific and Research Ethics Committee of the Medical Research Council (ETT-TUKEB) under the registration number 39065-5/2022/EÜIG and was conducted according to the Declaration of Helsinki. By April 2025, 93 blood samples had been collected and selected for DNA isolation and inclusion in this study.
2.2. DNA Isolation
DNA was isolated from 200 µL of whole blood using the High Pure PCR Template Preparation Kit (Roche Diagnostics GmbH, Mannheim, Germany), with the working solutions, preparation steps, and centrifugation steps following the manufacturer’s protocol. The purity of the isolated DNA was checked by spectrophotometry with a MaestroGen Nano spectrophotometer (MaestroGen Inc., Hsinchu City, Taiwan) using the A260/A280 absorbance ratio.
2.3. Genotyping with qPCR
We performed qPCR-based genotyping with sequence-specific TaqMan® assays (Thermo Fisher Scientific, Waltham, MA, USA) on a QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA). The assay IDs were as follows: CCAT1 rs6708563: C_189159697_10, CCAT2 rs6983267: C__29086771_20, H19 rs2839698: C___2603701_10, HOTAIR rs7958904: C___2104252_20, HOTAIR rs12826786: C__31185830_10, PTCSC3 rs944289: C___1444137_10. TaqPath™ ProAmp™ Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) was used. The thermal profile for all runs was as follows: hold stage: 10:00 min at 95 °C; PCR stage: 00:15 min at 95 °C, then 01:00 min at 60 °C; post-read stage: 01:00 min at 60 °C. A total of 43 cycles were used. Automatic genotype calling was performed by the QuantStudio 12K Flex System software, with a successful call rate higher than 95% for all assays. Samples with failed calls were excluded from further analyses. All PCR runs were performed in accordance with the given standards and operational procedures, including two negative controls per run, which remained ‘undetermined’ after PCR.
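The call-rate criterion can be illustrated with a minimal sketch. It assumes that each assay's calls are exported as a list with one entry per sample and that a failed ("undetermined") call is encoded as `None`; this encoding, the function names, and the example genotype counts are hypothetical and do not come from the instrument software or from the study data.

```python
def call_rate(calls):
    """Fraction of samples with a successful genotype call.

    `calls` holds one entry per sample; None marks a failed call
    (an encoding assumed for this sketch only).
    """
    if not calls:
        raise ValueError("no samples supplied")
    called = sum(1 for c in calls if c is not None)
    return called / len(calls)


def passing_assays(calls_by_assay, threshold=0.95):
    """Return assays whose call rate is strictly above `threshold`."""
    rates = {assay: call_rate(calls) for assay, calls in calls_by_assay.items()}
    return {assay: round(rate, 4) for assay, rate in rates.items() if rate > threshold}


# Hypothetical calls for 93 samples per assay; NOT the study's genotypes.
example = {
    "rs6983267": ["GG"] * 30 + ["GT"] * 45 + ["TT"] * 16 + [None] * 2,
    "rs2839698": ["GG"] * 40 + ["GA"] * 38 + ["AA"] * 10 + [None] * 5,
}

if __name__ == "__main__":
    print(passing_assays(example))
```

With 93 samples, a call rate above 95% tolerates at most four failed calls per assay, so the second hypothetical assay (five failures) would not pass.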
2.4. Statistical Analyses
All statistical analyses were performed using IBM SPSS Statistics version 27 (IBM Corporation, Armonk, NY, USA). Descriptive statistics were calculated for demographic and clinical variables, including means, standard deviations, and frequency distributions. Group differences in age and sex were assessed using χ2 tests. SNPs were analyzed under additive, dominant, and recessive genetic models. Genotype distributions were tested for deviation from Hardy–Weinberg equilibrium (HWE) using χ2 tests. Allelic and genotype associations with disease status were examined using χ2 tests, Fisher’s exact test, and binary logistic regression. Subgroup analyses were conducted by stratifying cases according to tumor location (colon vs. rectum) and sex. To evaluate linear trends across genotype categories, linear-by-linear association tests were used where appropriate. Odds ratios (ORs) and 95% confidence intervals (CIs) were reported for effect estimates. A p-value < 0.05 was considered nominally significant. Multiple testing correction was applied using the Bonferroni method. To evaluate the statistical power of our study, we performed a post hoc power analysis based on the observed average minor allele frequencies (MAF = 0.38) and the total sample size, using χ2-based effect sizes (Cohen’s w).
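As an illustration of two of the calculations named above, the following sketch implements a χ2 goodness-of-fit test for HWE and an odds ratio with a 95% confidence interval under the dominant and recessive models. It is not the SPSS procedure used in the study: the Woolf (log-based) interval, the function names, and the genotype counts are assumptions made for the example, and the additive model (typically fitted by logistic regression with genotypes coded 0/1/2) is not shown.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square test for Hardy-Weinberg equilibrium (1 df).

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi2_sf_1df(stat)


def collapse(counts, model):
    """Collapse (AA, AB, BB) counts into (exposed, unexposed) for allele B."""
    n_aa, n_ab, n_bb = counts
    if model == "dominant":   # AB + BB versus AA
        return n_ab + n_bb, n_aa
    if model == "recessive":  # BB versus AA + AB
        return n_bb, n_aa + n_ab
    raise ValueError(f"unsupported model: {model}")


def odds_ratio(cases, controls, model):
    """Odds ratio with a Woolf 95% CI; requires all four cells to be non-zero."""
    a, b = collapse(cases, model)     # exposed / unexposed cases
    c, d = collapse(controls, model)  # exposed / unexposed controls
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - 1.96 * se)
    upper = math.exp(math.log(or_) + 1.96 * se)
    return or_, lower, upper


# Hypothetical genotype counts (AA, AB, BB); NOT the study's data.
cases = (15, 20, 10)
controls = (20, 22, 6)

if __name__ == "__main__":
    print("HWE in controls (statistic, p):", hwe_chi2(*controls))
    print("Dominant model (OR, lower, upper):", odds_ratio(cases, controls, "dominant"))
```

HWE is conventionally checked in the control group; sparse cells (for example in the recessive model with few minor-allele homozygotes) would call for Fisher’s exact test instead of the χ2 approximation.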
4. Discussion
In this exploratory genetic association study, we investigated six lncRNA-associated SNPs in relation to colorectal lesion susceptibility, including tumor subtype specificity and sex-dependent effects. Despite the limited sample size, several nominally significant associations emerged, primarily implicating variants within HOTAIR, H19, and PTCSC3 lncRNA genes; however, after Bonferroni correction, none of them remained significant. Although the overall analyses did not reveal any association in genotype or allele frequencies between cases and controls, the genotype frequency comparison revealed suggestive effects for HOTAIR rs7958904 and rs12826786 variants. However, logistic regression identified more specific patterns. Several additional findings demonstrated indicative associations. In our study, CCAT1 rs6708563 and HOTAIR rs7958904 showed a tendency toward increased risk and protective effects, respectively. Similarly, interaction models suggested a possible modifying role of sex for HOTAIR rs12826786 and PTCSC3 rs944289. As these results did not reach nominal significance, these observations should be interpreted more cautiously.
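The effect of the Bonferroni correction on the nominal p-values quoted later in this section can be sketched as follows. The number of tests is an assumption: the sketch uses six, one per SNP, whereas the actual number of comparisons corrected for is not stated here and may be larger once genetic models and subgroups are counted.

```python
def bonferroni(p_values, n_tests, alpha=0.05):
    """Apply a Bonferroni correction for `n_tests` comparisons.

    Returns, per label, a tuple of the adjusted p-value (capped at 1.0)
    and whether the raw p-value is below alpha / n_tests.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    threshold = alpha / n_tests
    return {
        label: (min(1.0, p * n_tests), p < threshold)
        for label, p in p_values.items()
    }


# Nominal p-values as quoted in this Discussion.
nominal = {
    "HOTAIR rs12826786, additive": 0.022,
    "HOTAIR rs12826786, dominant": 0.033,
    "HOTAIR rs7958904, dominant (females)": 0.043,
    "H19 rs2839698, rectal cancer": 0.029,
}

N_TESTS = 6  # assumption: one test per SNP

if __name__ == "__main__":
    for label, (p_adj, significant) in bonferroni(nominal, N_TESTS).items():
        print(f"{label}: adjusted p = {p_adj:.3f}, significant = {significant}")
```

Under this assumption the corrected threshold is 0.05/6 ≈ 0.0083, which none of the quoted nominal p-values reaches; a larger number of tests would only make the threshold stricter.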
This study offers preliminary insights but has some limitations. First, the relatively small sample size (n = 91) reduces statistical power and precludes more detailed subgroup analyses, which may increase the risk of Type II errors for non-significant SNPs. Second, the case group includes both colorectal cancer and benign colon tumors, which introduces clinical heterogeneity and may dilute potential associations specific to malignant disease. Third, the lack of replication in an independent cohort and the potential population-specific effects may limit the generalizability of our findings. These factors should be considered when interpreting the results and highlight the need for validation in larger, diverse populations.
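To give a sense of how sample size constrains power, the sketch below computes the power of a χ2 test with one degree of freedom for a given Cohen’s w. It is a simplified illustration rather than the power analysis performed in this study: it assumes a single 1-df test at α = 0.05, uses Cohen’s conventional benchmarks (w = 0.1, 0.3, 0.5) instead of the observed effect sizes, and the function name is hypothetical. For 1 df the noncentral χ2 distribution reduces to a shifted squared normal, so only the standard library is needed.

```python
from statistics import NormalDist


def chi2_power_1df(w, n, alpha=0.05):
    """Power of a 1-df chi-square test for effect size w (Cohen's w) and n observations.

    With 1 df, the test statistic is (Z + sqrt(lambda))**2, where Z is standard
    normal and lambda = n * w**2 is the noncentrality parameter.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # square root of the chi-square critical value
    shift = w * n ** 0.5             # square root of the noncentrality parameter
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


if __name__ == "__main__":
    for w in (0.1, 0.3, 0.5):  # Cohen's small, medium, and large benchmarks
        print(f"w = {w}: power = {chi2_power_1df(w, n=91):.3f}")
```

Under these assumptions, a sample of 91 gives adequate power only for medium or larger effects, while small effects (w = 0.1) would be missed in most replications, which is the scenario in which Type II errors become likely.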
Previous studies of the HOTAIR rs12826786 polymorphism show heterogeneous results across cancer types and populations. In gastric cardia adenocarcinoma, Guo et al. linked the T allele to higher risk, advanced TNM stage, and increased HOTAIR expression predicting poor survival in Chinese patients [41]. A meta-analysis by Li et al. of Turkish, Iranian, and Chinese cohorts also associated the T allele and TT genotype with increased risk [42], whereas no effect was seen in gastric cancer in a Turkish population [43]. In contrast, in a Saudi CRC cohort, Alzeer et al. reported a significant protective association of the TT genotype, particularly in male patients and in tumors located in the colon [44]. Our Hungarian data are consistent with this observation, with the T allele showing a possible protective association with CRC in additive and dominant models (p = 0.022 and p = 0.033, respectively).
The impact of the HOTAIR rs7958904 polymorphism on cancer risk appears to be strongly dependent on cancer type. In Chinese cohorts, several studies reported that the C allele or CC genotype increased susceptibility to cervical and breast cancer [45,46]. Akther et al. found similar associations in Bangladeshi women with cervical cancer [47]. In contrast, Wu et al. showed that the C allele decreased epithelial ovarian cancer risk [48], and Zhou et al. identified a protective effect against osteosarcoma in a Chinese cohort [49]. For gastric cancer, the C allele was linked to a 1.5-fold increased risk in an Iranian cohort [50]. Results in CRC are inconsistent: Kim et al. reported that the CC genotype predicted poorer prognosis and higher cancer-related mortality in a Korean cohort [51], whereas Xue et al. demonstrated a protective association of the C allele, particularly in older individuals, women, and non-smokers [52]. A meta-analysis of predominantly Asian populations showed an overall protective effect in gastric and colorectal cancer [53]. Consistent with these latter findings, our data revealed a possible protective effect of the rs7958904 C allele in the dominant model (p = 0.043) in females, suggesting that rs7958904 may reduce colorectal lesion susceptibility in a sex-dependent manner.
Earlier studies have demonstrated that the H19 rs2839698 A allele increases susceptibility to colorectal and other digestive cancers. In a large Chinese cohort, Li et al. reported higher CRC risk for A allele carriers (OR = 1.20, 95% CI: 1.05–1.36), especially in colon tumors, well-differentiated histology, and advanced Dukes’ stage [54]. Meta-analyses confirmed digestive cancer risk associations under several genetic models [55,56,57]. Wu et al. found similar links for hepatocellular carcinoma in Chinese patients [58], whereas Verhaegh et al. reported a protective effect in bladder cancer [59]. In our Hungarian cohort, rs2839698 was not associated with overall colorectal lesion risk; instead, the G allele was more frequent in colon tumors and conferred a potential reduced risk of rectal cancer (p = 0.029, OR = 0.18), suggesting that rs2839698 may influence tumor subsite rather than general colorectal lesion susceptibility.
Genome-wide association studies identified rs944289 as a susceptibility locus for papillary thyroid carcinoma (PTC). Jendrzejewski et al. showed that this variant modulates the expression of PTCSC3, a thyroid-specific tumor suppressor strongly downregulated in tumors, especially in T allele carriers [60], and meta-analyses confirmed its association with PTC risk [61,62,63,64]. Although most research has focused on thyroid cancer, rs944289 has also been examined in digestive system cancers. In a Chinese cohort, Cao et al. found no overall association with esophagogastric junction adenocarcinoma (EGJA), but stratified analyses showed a protective CT genotype in individuals under 60 years and increased risk for the TT genotype among smokers [65]. In another Chinese study, Wang et al. reported a decreased CRC risk for the TT genotype overall and in subgroups, yet tumor site analyses linked the TT genotype to increased rectal and colon cancer risk [37], indicating context- and population-dependent effects. In contrast, our data showed that the C allele (the minor allele in European populations) was associated with lower rectal cancer risk but was more frequent among colon cancer cases, suggesting a possible one-sided effect favoring colon tumor development.
Previous work supports a functional role for HOTAIR rs12826786: in glioma tissue, the CT genotype shows higher HOTAIR expression than TT, indicating a cis-eQTL-like effect [66]. This variant is not a significant eQTL in GTEx v8 normal tissues [67], suggesting a tissue-specific regulatory action. For rs7958904, rs2839698, and rs944289, we found no published CRC-specific eQTLs. However, rs7958904 may alter HOTAIR’s secondary structure (in silico CRC-related study [52]), H19 rs2839698 is linked to higher risk in digestive system cancers [68], and PTCSC3 rs944289 reduces promoter activity and PTCSC3 expression in thyroid carcinoma [60] (GWAS Catalog: GCST000335). These findings imply tissue- or context-specific regulation, and the absence of CRC-specific eQTL data likely reflects limited datasets rather than lack of effect, supporting the biological plausibility of our CRC association.
It should be noted that allele frequencies, including which allele is considered the minor allele, may differ substantially between populations. Such differences could partly account for the discrepancies observed between our findings in a European cohort and those reported in Asian populations. Collectively, these observations suggest that lncRNA-associated polymorphisms may exert cancer type- and sex-specific effects rather than universal effects across patients with colorectal lesions and controls. While most studies examined these SNPs in larger and mainly Asian cohorts, our findings extend this work to an additional Central European population and specifically to colorectal lesions. The results discussed here imply that the same lncRNA variants could contribute to tumorigenesis differently depending on tissue context, sex, and population-specific genotype or allele frequencies. In this context, the results of the post hoc power analysis provide an important perspective, as the absence of statistically significant associations for some SNPs may reflect the study’s limited power to detect weaker effects rather than a true lack of biological relevance. Despite the small sample size, this cohort represents a less frequently studied and genetically homogeneous group of Hungarian patients. As this was an exploratory phase, we plan a validation phase with a larger patient cohort based on these results.