The Relationship between Long Noncoding RNA H19 Polymorphism and the Epidermal Growth Factor Receptor Phenotypes on the Clinicopathological Characteristics of Lung Adenocarcinoma

The aim of the current study was to investigate potential associations between long noncoding RNA (LncRNA) H19 single nucleotide polymorphisms (SNPs) and epidermal growth factor receptor (EGFR) phenotypes with respect to the clinicopathological characteristics of lung adenocarcinoma (LADC). Five LncRNA H19 SNP loci (rs217727, rs2107425, rs2839698, rs3024270, and rs3741219) were genotyped using TaqMan allelic discrimination in 223 LADC patients with the wild-type EGFR phenotype and 323 LADC patients with EGFR mutations. Patients with EGFR mutation showed a higher distribution frequency of the rs217727 CT heterozygote (p = 0.030), and female patients with EGFR mutation showed a higher distribution frequency of the rs217727 CT heterozygote (p < 0.001) and the rs2107425 CT heterozygote (p = 0.002). In addition, among patients with wild-type EGFR, the presence of the LncRNA H19 SNP rs217727 T allele (CT + TT) was associated with higher tumor T status (stage III or IV, p = 0.037) and poorer cell differentiation (poor differentiation, p = 0.012) compared with the rs217727 CC genotype. Furthermore, among patients with wild-type EGFR, a markedly higher tumor T status was found in those carrying the LncRNA H19 SNP rs2107425 T allele (CT + TT) (stage III or IV, p = 0.007) than in those with the rs2107425 CC genotype. Our findings suggest that LncRNA H19 SNP rs217727 is related to EGFR mutation in LADC patients, and that LncRNA H19 SNPs rs217727 and rs2107425 are associated with more advanced tumor status in LADC patients with wild-type EGFR.


Introduction
Lung adenocarcinoma (LADC) is the most common type of lung cancer worldwide in both genders [1]. In previous epidemiological studies, LADC accounted for more than 40 percent of cases in a Chinese population [2], and it has consistently outnumbered squamous cell carcinoma among women in Europe, North America and Oceania [3]. Groups known to be susceptible to LADC include older individuals, women, never-smokers and carriers of certain genetic mutations [4,5]. In addition to conventional treatments for LADC, newer approaches such as targeted therapy and immunotherapy are under investigation in several experimental studies to address the problem of drug resistance [6].
Several molecular biomarkers can influence the clinical presentation of LADC and may offer alternative management strategies in the future [7,8]. It is well established that different mutations of the epidermal growth factor receptor (EGFR), a receptor belonging to the ERBB family, significantly affect the stage and treatment outcome of LADC [8,9]. The main EGFR mutation types are the L858R mutation and the Exon 19 in-frame deletion, which are associated with a better prognosis than wild-type EGFR in LADC [10][11][12]. In addition to EGFR, single nucleotide polymorphisms (SNPs) of other genes also correlate with different clinical characteristics and treatment outcomes of LADC [13,14]. In previous studies, both endothelial nitric oxide synthase and carbonic anhydrase 9 were shown to alter the tumor grade of LADC, but with different effects [15,16]. Moreover, the concurrent presence of certain Aurora kinase A SNPs and EGFR mutation was related to an earlier tumor stage of LADC [17]. Consequently, there may be other genes whose SNPs, together with EGFR mutation, contribute to the different clinical characteristics and severity of LADC.
Long noncoding RNAs (LncRNAs) are a family of transcripts that do not code for protein but can regulate cellular functions such as chromatin remodeling, cell cycle progression, and neoplasm formation [18,19]. Among the LncRNAs, LncRNA H19 has been associated with several types of cancer [20][21][22]. In lung cancer, a high level of LncRNA H19 expression leads to gefitinib resistance in non-small cell lung cancer [23], and LncRNA H19 is associated with the proliferation of lung cancer cells [24]. Since some LncRNAs can affect the susceptibility to and clinical characteristics of EGFR mutation-related LADC [25,26], SNPs of LncRNA H19 may act synergistically with somatic mutations such as those of EGFR on the clinical course of LADC; however, the current evidence on this issue is inadequate.
The purpose of the current study was therefore to evaluate the distribution of several LncRNA H19 SNPs in different EGFR phenotypes. The synergistic effects of LncRNA H19 SNPs and EGFR phenotypes on the clinicopathological characteristics of LADC were also investigated.

Ethics Declaration
The current study adhered to the Declaration of Helsinki of 1964 and its later amendments. In addition, the current study was approved by the Institutional Review Board of Chung Shan Medical University Hospital (Project code: CS1-20144). Written informed consent was obtained from all participants in the current study and can be provided upon reasonable request.

Subject Selection
A prospective case-control study was conducted at Chung Shan Medical University Hospital. Subjects who were diagnosed with LADC and followed up for more than one year at Chung Shan Medical University Hospital were included and constituted the study group. After the inclusion process, a total of 546 patients with LADC were enrolled in the current study. The medical records of the participants were reviewed, and demographic data including the age, gender and cigarette smoking status of each subject were recorded. For tumor grading, the Tumor, Node, Metastasis (TNM) status and the tumor stage were determined by two pulmonologists/oncologists, and the three grades of cell differentiation (well differentiated, moderately differentiated, and poorly differentiated) were categorized according to the American Joint Committee on Cancer manual. For the analysis of genetic variants of both LncRNA H19 and EGFR, venous blood was drawn from all patients at Chung Shan Medical University Hospital and collected in ethylenediaminetetraacetic acid-containing tubes. The venous blood samples were then immediately centrifuged and stored in a laboratory freezer at approximately −80 °C for subsequent analyses. In addition to the venous blood sample, a frozen LADC specimen was collected from each patient after incisional biopsy. If the DNA required for either LncRNA H19 or EGFR analysis was degraded before the laboratory procedure, the individual who provided that blood sample and the associated frozen specimen was excluded from the current study.

Genomic DNA Extraction and EGFR Sequencing
DNA extraction and EGFR sequencing were performed as described previously [17]. Briefly, tumor tissue from the frozen specimen was used to extract DNA with the QIAamp DNA Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions. After the genomic DNA was obtained, EGFR categories, including the wild-type and mutation phenotypes, were amplified and detected by real-time polymerase chain reaction (PCR). The DNA sequencing reaction was then conducted on the ABI PRISM 3130XL System (Applied Biosystems, Foster City, CA, USA). Two EGFR categories were defined by the above procedures: EGFR wild-type and EGFR mutation type. Specifically, the EGFR mutations analyzed in the current study were the L858R mutation and the Exon 19 in-frame deletion.

The Genotyping of LncRNA H19 SNPs via Real-Time PCR
Five SNPs of LncRNA H19, namely rs217727 (C/T), rs2107425 (C/T), rs2839698 (C/T), rs3024270 (C/G), and rs3741219 (A/G), were selected on the basis of earlier related studies [21,22]. Genotyping followed the same method as previous research [17]. DNA was first extracted from the leukocytes of each patient's venous blood with QIAamp DNA kits (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions. Allelic discrimination of the five LncRNA H19 SNPs was then performed on the ABI StepOne Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) with the TaqMan assay, and the real-time PCR results were analyzed with SDS version 3.0 software (Applied Biosystems). The minor allele frequencies of all five LncRNA H19 SNPs were greater than five percent.
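The minor allele frequency criterion can be illustrated with a minimal Python sketch. It assumes that genotype counts for a biallelic SNP are available as three integers; the function names and the example counts below are hypothetical and are not data from the current study.

```python
def minor_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Return the minor allele frequency of a biallelic SNP from genotype counts.

    Each subject carries two alleles, so the total allele count is twice
    the number of genotyped subjects.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("at least one genotyped subject is required")
    # Heterozygotes contribute one alternative allele, homozygotes two.
    alt_freq = (n_het + 2 * n_hom_alt) / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf_threshold(maf: float, threshold: float = 0.05) -> bool:
    """Check the 'more than five percent' criterion used for SNP selection."""
    return maf > threshold


if __name__ == "__main__":
    # Hypothetical counts (CC, CT, TT) for illustration only.
    maf = minor_allele_frequency(250, 230, 66)
    print(f"MAF = {maf:.4f}, passes threshold: {passes_maf_threshold(maf)}")
```

Taking the minimum of the two allele frequencies keeps the function independent of which allele is labeled as the reference.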

Statistical Analysis
SAS version 9.4 (SAS Institute Inc., Cary, NC, USA) was used for all statistical analyses in the current study. Descriptive analysis was used to present the demographic data, tumor stage including the TNM status, and tumor cell differentiation in the EGFR wild-type and EGFR mutation groups. The Mann-Whitney U test and Fisher's exact test were then used to compare continuous and categorical variables, respectively, between the EGFR wild-type and EGFR mutation groups. After that, multiple logistic regression was applied to estimate adjusted odds ratios (AORs) with corresponding 95% confidence intervals (CIs) for the distribution of the five LncRNA H19 SNPs between the EGFR wild-type and EGFR mutation groups after adjusting for age, gender and cigarette consumption. A p value of less than 0.05 was regarded as statistically significant.
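As a rough illustration of the categorical comparison described above, the following Python sketch computes a two-sided Fisher's exact p value and an unadjusted odds ratio with a Wald-type 95% CI for a 2 × 2 table. It is not the SAS procedure used in the study: the odds ratio here is crude rather than adjusted for age, gender and smoking (which would require a logistic regression model), and the function names and example table are assumptions made for illustration.

```python
import math


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = 0.0
    for x in range(lowest, highest + 1):
        p = prob(x)
        # Small relative tolerance guards against floating-point ties.
        if p <= p_observed * (1 + 1e-7):
            total += p
    return min(1.0, total)


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Unadjusted odds ratio with a Wald-type CI (default 95%).

    Requires all four cells to be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical 2x2 table, for illustration only:
    # rows = variant carriers / non-carriers, columns = mutation / wild-type.
    table = (3, 1, 1, 3)
    print("Fisher p =", round(fisher_exact_two_sided(*table), 4))
    print("OR (95% CI) =", odds_ratio_wald_ci(*table))
```

An adjusted odds ratio would instead come from a logistic regression with the SNP genotype and the covariates as predictors, exponentiating the genotype coefficient and its confidence limits.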

Demographics of the Study Population
A total of 546 participants with LADC were recruited in the current study, comprising 223 EGFR wild-type patients and 323 patients with EGFR mutation. The mean age was 64.60 ± 12.22 in the EGFR wild-type group and 65.33 ± 12.62 in the EGFR mutation group, with no significant difference between the two groups (p = 0.501). However, the EGFR mutation group contained both more female individuals (p < 0.001) and more never-smokers (p < 0.001). Regarding the characteristics of LADC, the tumor cells in the EGFR mutation group showed a higher rate of well differentiation (p < 0.001). The other clinical characteristics of the two groups are shown in Table 1.

Distribution Frequency of LncRNA H19 SNP in Different EGFR Phenotypes
Regarding the genotype distribution of LncRNA H19 SNPs between the EGFR phenotypes, the rs217727 CT + TT genotypes showed a higher distribution frequency in the EGFR mutation group (AOR: 1.46, 95% CI: 1.01-2.13, p = 0.047), which was mainly due to the higher distribution frequency of the rs217727 CT heterozygote in the EGFR mutation group (AOR: 1.56, 95% CI: 1.05-2.34, p = 0.030). However, the distribution frequencies of the other LncRNA H19 SNPs were similar between the two groups. In the subgroup analysis by gender (Table 3), the distribution frequency of LncRNA H19 SNPs was not affected by EGFR phenotype in the male population.
The AORs with 95% CIs were estimated by multiple logistic regression models after controlling for age and cigarette smoking status. N: number; AOR: adjusted odds ratio; CI: confidence interval. * p < 0.05, a p < 0.001, b p = 0.001, c p = 0.002, d p = 0.005, e p = 0.035.
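The reported AORs, confidence intervals and p values can be checked for internal consistency with a short Python sketch. It assumes that the intervals are Wald-type intervals that are symmetric on the log-odds scale, which is the usual output of logistic regression software but is not stated explicitly in the text; the function name is hypothetical.

```python
import math


def p_value_from_or_ci(odds_ratio: float, lower: float, upper: float,
                       z_crit: float = 1.959964) -> float:
    """Approximate the two-sided p value implied by an odds ratio and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE on the log-odds scale.
    """
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = abs(math.log(odds_ratio)) / se_log_or
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Values taken from the text for rs217727 (CT + TT, then CT alone).
    print(round(p_value_from_or_ci(1.46, 1.01, 2.13), 3))
    print(round(p_value_from_or_ci(1.56, 1.05, 2.34), 3))
```

Under this assumption the back-calculated p values agree closely with the reported p = 0.047 and p = 0.030, apart from rounding of the published interval limits.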

Correlation of Distribution between LncRNA H19 and EGFR Phenotypes to the Clinicopathological Characteristics in LADC
Concerning differences in clinicopathological characteristics among patients with specific combinations of LncRNA H19 SNP and EGFR phenotype, the presence of the LncRNA H19 SNP rs217727 T allele (CT + TT) in patients with wild-type EGFR was correlated with advanced tumor T status (stage III or IV, p = 0.037) and poorer cell differentiation (poor differentiation, p = 0.012) compared with EGFR wild-type individuals with the rs217727 CC genotype. In addition, among all patients, those with the LncRNA H19 SNP rs2107425 T allele (CT + TT) had a more advanced tumor T status (stage III or IV, p = 0.025), and this association was more prominent in patients with both wild-type EGFR and rs2107425 CT + TT (stage III or IV, p = 0.007) compared with EGFR wild-type LADC individuals with the rs2107425 CC genotype. The other combinations of LncRNA H19 SNP and EGFR phenotype did not significantly alter the clinicopathologic characteristics of LADC, as shown in Tables 4 and 5.
Table 4. Clinicopathologic characteristics of lung adenocarcinoma patients with EGFR mutation, stratified by polymorphic genotypes of H19 rs217727.

Discussion
In the current study, a higher distribution frequency of LncRNA H19 SNP rs217727 was found in the LADC population with EGFR mutation. In addition, female LADC patients with the EGFR mutation phenotype showed a different distribution of LncRNA H19 SNPs rs217727, rs2107425 and rs2839698 compared with the EGFR wild-type population. Moreover, in EGFR wild-type LADC patients, LncRNA H19 SNP rs217727 or rs2107425 was correlated with more advanced tumors, especially a higher tumor T status.
Various factors, including SNPs, influence the disease severity and progression of LADC [10,12,13,14,27,28]. The presence of KRAS mutation leads to a poor prognosis [29], and BRAF-inactivating mutations can result in the development of LADC. EGFR is also a genetic prognostic factor for LADC that has been investigated in detail, and the EGFR phenotype has different effects on the clinical course of LADC [10,11]. In a previous study, the EGFR L858R mutation was associated with a higher tumor disappearance rate of LADC than wild-type EGFR [30]. Additionally, the EGFR Exon 19 in-frame deletion was associated with LADC tumors of intermediate to low pathologic grade [31]. Moreover, EGFR mutation can increase treatment effectiveness in LADC: the EGFR Exon 19 in-frame deletion increased both overall survival and progression-free survival in LADC patients receiving targeted therapy and chemotherapy [32], and several EGFR mutations were associated with significantly longer overall survival in LADC [33]. In the present study, a higher distribution frequency of LncRNA H19 SNP rs217727 was found in the LADC population with EGFR mutation, a finding that might inform future clinical management. On the other hand, LncRNA H19 is an imprinted gene located on chromosome 11p15.5 and expressed from the maternal allele [20]. Regarding its primary function, LncRNA H19 is expressed in heart, skeletal muscle and breast and can modulate cell proliferation, cell differentiation and the oncogenic effects involved in the development of neoplasms [19,34,35]. An experimental study showed that LncRNA H19 was up-regulated in cancer cell lines compared with non-cancerous cell lines, which was associated with more invasive behavior of tumor cells [20].
In the clinical setting, several neoplasms including hepatocellular carcinoma, urothelial cell carcinoma and non-small cell lung cancer have been found to be associated with LncRNA H19 SNPs [18,36,37]. For lung cancer, LncRNA H19 SNP rs217727 has been associated with a higher incidence of lung squamous cell carcinoma and LADC [38].
In previous research, somatic receptor mutations in a cancer and cancer-associated SNPs were seldom analyzed together. However, recent research has begun to reveal interactions between genetic variations such as SNPs and somatic mutations [39], with some articles showing that cancer-related SNPs, including those of TERT and HMGB1, are correlated with the likelihood of EGFR mutation and with the clinical characteristics of EGFR-mutated LADC [16,25,40]. As a result, we speculate that EGFR mutation arises within the context of the individual's whole genome and can be influenced by other cancer-related genes and their polymorphisms, whether congenital or acquired progressively during life. On the other hand, LncRNA H19 SNPs are associated with the clinical course of certain types of adenocarcinoma, including breast adenocarcinoma and gastric adenocarcinoma [41,42], and the effect of LncRNA H19 SNP rs217727 on susceptibility to LADC is also significant [38]. Moreover, a relationship between LncRNAs and the EGFR pathway has been demonstrated in several cancers [43,44]. Consequently, we hypothesize that LncRNA H19 SNPs may lead to alterations in LADC-related genes, including those of the EGFR pathway, and that the co-existence of LncRNA H19 SNPs and EGFR mutation may modify the tumorigenesis pathway of LADC, which is at least partially supported by the results of the current study.
The different distribution frequency of LncRNA H19 SNPs by EGFR mutation status found in the current study has seldom been reported elsewhere. A previous review article showed that the frequency of LncRNA H19 SNP rs217727 was significantly higher in oral squamous cell carcinoma and lung cancer [45]. In addition, the T/T genotype of LncRNA H19 SNP rs217727 carried a higher risk of lung cancer development than the C/C genotype [38]. This evidence suggests that LncRNA H19 SNP rs217727 is associated with different cancers, and our findings are consistent with the previous literature. In the gender subgroup analysis of the current study, the female EGFR mutation population showed a higher distribution frequency of LncRNA H19 SNPs rs217727 and rs2107425 but a lower distribution frequency of LncRNA H19 SNP rs2839698. The difference in LncRNA H19 SNP distribution between the two gender subgroups may be due to expression quantitative trait loci, which lead to gender-dependent expression of LncRNA inter chr4 3011 in mice [46]. A possible gender difference in the function of the EGFR pathway, which has been found in the mouse kidney [47], may also contribute to such a gender difference, as has been reported for the expression of LncRNA PACER in periodontitis [48]. Since LncRNA H19 SNP rs217727 was related to lung cancer development [45] and LncRNA H19 SNP rs2107425 was correlated with shorter metastasis-free survival [49], a similar pattern in the current study appears reasonable. On the other hand, LncRNA H19 SNP rs2839698 was related to a lower risk of bladder cancer [50], so it may serve as a protective factor in at least these two malignancies. Unlike the female population, the male population did not show any tendency in the distribution frequency of LncRNA H19 SNPs, and the exact reason for this difference needs further investigation.
Concerning the clinicopathological characteristics of LADC, PD-1, ANGPT1 and HDAC9 SNPs have been shown to influence tumor progression significantly [51][52][53]. In the current study, a correlation of LncRNA H19 SNPs rs217727 and rs2107425 with tumor progression in EGFR wild-type LADC was observed, which is a preliminary finding in this field. Specifically, LncRNA H19 SNP rs217727 was related to a higher tumor T status and poor cell differentiation of LADC, while LncRNA H19 SNP rs2107425 was associated with a higher tumor T status only. In a preceding study, poorly differentiated LADC tumor cells were found in patients with a poor prognosis [54]. As a result, LncRNA H19 SNP rs217727 might be a more important prognostic factor than LncRNA H19 SNP rs2107425 for EGFR wild-type LADC, because it is associated with both risk features. Moreover, LncRNA H19 SNP rs217727 was also related to a different distribution frequency of EGFR mutation in LADC, so it may be the most important of the studied SNPs for LADC when the synergistic effect with EGFR phenotype is considered. However, the association of LncRNA H19 SNP rs2107425 with tumor T status was significant even in the whole study population regardless of EGFR phenotype, which implies that the influence of LncRNA H19 SNP rs2107425 on the LADC disease course cannot be overlooked, especially in those with wild-type EGFR.
Regarding the demographic data of the two EGFR phenotypes, the distributions of gender, smoking status and tumor cell differentiation differed significantly between the two groups. According to previous research, LADC with EGFR mutation often develops in women and never-smokers [5], and the results of the current study further support this observation. The higher rate of EGFR mutation in women may result from exposure to carcinogens such as female steroid hormones, since the rate of EGFR mutation rises with age in the female population [55]. The higher rate of EGFR mutation in never-smokers has frequently been observed in the East Asian population [5], although the exact etiology needs further validation. The percentage of poorly differentiated cells in LADC with EGFR mutation was significantly lower than that in LADC with wild-type EGFR in the current study, which corresponds to a previous study conducted in the East Asian population [56]. The pathophysiology of this finding still needs elucidation, and we speculate that both the L858R and Exon 19 in-frame deletion EGFR mutations may contribute to functional loss in LADC and generally more benign clinical characteristics, since higher sensitivity to tyrosine kinase inhibitors and better overall survival were found in patients with such mutations [11,57].
There are still some limitations in the current study. First, the small study population may diminish the statistical power. Second, lifestyle and daily activity were not recorded in the medical documents, so certain risk factors could not be considered in the multiple logistic regressions. Third, owing to the lack of survival data, the association between the H19 polymorphisms and follow-up clinical outcomes could not be analyzed. Moreover, other EGFR phenotypes such as del2236-2250 and T790M were not included in the analysis of the current study. Still, since the incidence of these two EGFR phenotypes is relatively low, their omission is unlikely to have affected the results of the current study to a large extent.

Conclusions
In conclusion, the distribution frequency of the LncRNA H19 SNP rs217727 CT genotype is higher in patients with LADC and EGFR mutation. Moreover, the frequencies of both the LncRNA H19 SNP rs217727 CT and rs2107425 CT genotypes are higher in female LADC subjects with EGFR mutation. Furthermore, LncRNA H19 SNPs rs217727 and rs2107425, especially rs217727, are correlated with advanced tumor status in LADC individuals with wild-type EGFR. Further large prospective studies to evaluate the effect of LncRNA H19 SNP rs217727 or rs2107425 on the disease progression, therapeutic effectiveness and prognosis of EGFR wild-type LADC are warranted.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study and can be provided upon reasonable request.
Data Availability Statement: Not applicable.

Conflicts of Interest:
The authors declare no conflict of interest.