CD44 Gene Polymorphisms as a Risk Factor for Susceptibility and Their Effect on the Clinicopathological Characteristics of Lung Adenocarcinoma in Male Patients.

Lung adenocarcinoma is a subtype of lung cancer with high morbidity and mortality. CD44 is instrumental in many physiological and tumor pathological processes. Certain single nucleotide polymorphisms (SNPs) contribute to protein dysfunction and influence cancer susceptibility. In the current study, we investigated the relationship between CD44 polymorphisms and the susceptibility to lung adenocarcinoma with or without epidermal growth factor receptor (EGFR) gene mutations. This study included 279 patients with lung adenocarcinoma. In total, six CD44 SNPs (rs1425802, rs11821102, rs10836347, rs13347, rs187115, and rs713330) were genotyped using real-time polymerase chain reaction. We found no significant differences in the genotype distribution of CD44 polymorphisms between the EGFR wild-type and EGFR mutation type groups of patients with lung adenocarcinoma. We observed a strong association between the CD44 rs11821102 G/A polymorphism and the EGFR L858R mutation (odds ratio (OR) = 3.846, 95% confidence interval (CI) = 1.018–14.538; p = 0.037) compared with the EGFR wild-type group. In the subgroup of male patients with lung adenocarcinoma harboring the EGFR wild-type, both CD44 rs713330 T/C (OR = 4.317, 95% CI = 1.029–18.115; p = 0.035) and rs10836347 C/T polymorphisms (OR = 9.391, 95% CI = 1.061–83.136; p = 0.019) exhibited significant associations with tumor size and invasion. Data from the present study suggest that CD44 SNPs may help to predict cancer susceptibility and tumor growth in male patients with lung adenocarcinoma.


Introduction
Lung cancer is the leading cause of cancer-related death worldwide, and its morbidity and mortality are increasing [1]. Each year, non-small cell lung cancer (NSCLC) accounts for approximately 85% of newly diagnosed lung cancer cases, and adenocarcinoma, a subtype of NSCLC, accounts for approximately 50% of all lung cancers. Genetic, behavioral, and environmental risk factors have all been shown to be involved in the development of lung cancer. For example, studies of lung adenocarcinoma have revealed that epidermal growth factor receptor (EGFR) gene mutations occur more frequently in Asian populations than in Caucasians [2]. Among the EGFR mutations, two hotspot mutations, namely L858R and the exon 19 deletion, account for the vast majority of those found in patients with lung cancer [3]. Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation [4]. SNPs reportedly contribute to protein dysfunction and affect disease susceptibility [5]. Therefore, identification of gene polymorphisms provides useful clinical information for the prediction and diagnosis of certain diseases, including lung cancer.
Human CD44 is located on chromosomal locus 11p13 and contains 20 exons, including 10 constant exons and 10 variant exons [6,7]. Through alternative splicing of mRNA and post-translational modifications, various CD44 protein isoforms are generated, including the standard form and spliced variant isoforms [8]. Each of the CD44 isoforms has multiple biological functions in normal and tumor cells. As a transmembrane glycoprotein, CD44 participates in many physiological processes, such as cell adhesion, migration, and inflammation. Additionally, CD44 proteins are involved in tumor pathological processes, including cell proliferation, angiogenesis, invasion, and metastasis [9]. Overexpression of CD44 proteins may be a poor prognostic indicator for patients with NSCLC [10]. In fact, both CD44 mRNA [11] and protein [12] are highly expressed in the tumor tissues of patients with NSCLC. Moreover, the expression of CD44 isoforms is reportedly highly correlated with the production of various tumor subtypes and has been used as a marker for cancer stem cells in many cancers [13]. However, whether CD44 genetic polymorphisms affect the subtype and content of these CD44 isoforms in various cancers warrants further research.
Many studies have indicated that CD44 polymorphisms are a risk factor for susceptibility to different cancers [14][15][16][17][18][19][20]. However, the role of CD44 polymorphisms in the clinicopathological characteristics of lung adenocarcinoma is still unclear. In the present study, we aim to clarify the relationship between CD44 polymorphisms and susceptibility to lung adenocarcinoma with or without EGFR mutations. The associations among CD44 SNPs, EGFR mutations, and the clinicopathological characteristics of lung adenocarcinoma are also evaluated.

Subjects and Specimen Collection
This study included 279 patients with lung adenocarcinoma enrolled at Cheng-Ching General Hospital, Taiwan, between 2012 and 2015. Tumor specimens were collected for EGFR gene sequencing, and whole-blood specimens were collected in EDTA-containing tubes for CD44 genotyping. The clinical data of the enrolled patients were obtained from their medical records. Tumors were staged according to the tumor/node/metastasis (TNM) staging system of the American Joint Committee on Cancer (AJCC) at the time of diagnosis, and lifestyle variables (such as cigarette smoking) were collected using questionnaires. The study protocol was approved by the Institutional Review Board of Cheng-Ching General Hospital (No. HP120009; 22 September 2012). Signed informed consent was obtained from each participant before the initiation of the study.

Selection of CD44 Polymorphisms
In this study, 6 SNPs in the CD44 gene region were selected from International HapMap Project data. One SNP (rs1425802) in the promoter region and 3 SNPs (rs11821102, rs10836347, and rs13347) in the 3' untranslated region (3' UTR) of CD44 were selected. In addition, 2 SNPs (rs187115 and rs713330) were selected because they have been associated with several cancers in Chinese Han populations [15][16][17][18].

Genomic DNA Extraction and Real-Time Polymerase Chain Reaction
Genomic DNA was extracted from tumor specimens and whole-blood specimens of patients with lung adenocarcinoma using the QIAamp DNA Tissue kit and the QIAamp DNA Blood Mini Kit (Qiagen, Valencia, CA, USA), respectively, according to the manufacturer's protocols. The extracted DNA was aliquoted, stored at −20 °C, and used as the template for the following experiments. Exons 18-21 of EGFR were amplified using polymerase chain reaction (PCR) and then subjected to DNA sequencing as described previously [21]. The alleles of the 6 CD44 SNPs (rs1425802, rs11821102, rs10836347, rs13347, rs187115, and rs713330) were identified using the TaqMan SNP genotyping assay and the ABI StepOnePlus Real-Time PCR system (Applied Biosystems, Foster City, CA, USA).

Statistical Analysis
The Mann-Whitney U test and Fisher's exact test were used to compare the distributions of clinical characteristics and genotype frequencies between lung adenocarcinoma patients harboring the EGFR wild-type and those harboring the EGFR mutation type. The odds ratios (ORs) with corresponding 95% confidence intervals (CIs) for the association between genotype frequencies and EGFR type were estimated using multiple logistic regression models after controlling for other variables. A p-value less than 0.05 was considered statistically significant. All data were analyzed using SAS (version 9.1, 2005; SAS Institute Inc., Cary, NC, USA).
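As a rough illustration of the unadjusted part of this analysis, the sketch below computes a two-sided Fisher's exact test and a crude odds ratio with a Wald 95% CI for a 2×2 table. It is not the SAS analysis used in the study and applies no covariate adjustment. The function names are hypothetical, and the counts are reconstructed from the sex distribution reported in the Results (66 men and 44 women in the EGFR wild-type group; 60 men and 109 women in the EGFR mutation type group).

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # Sum all tables that are no more probable than the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


def crude_odds_ratio(a, b, c, d, z=1.959964):
    """Unadjusted odds ratio with a Wald confidence interval (default 95%)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Rows: EGFR wild-type, EGFR mutation type; columns: male, female.
# Counts reconstructed from numbers reported in the text (assumption).
table = (66, 44, 60, 109)
p_value = fisher_exact_two_sided(*table)
odds_ratio, ci_low, ci_high = crude_odds_ratio(*table)
print(f"p = {p_value:.2g}, OR = {odds_ratio:.3f}, 95% CI = {ci_low:.3f}-{ci_high:.3f}")
```

The adjusted odds ratios reported in this study come from multiple logistic regression, so crude values from a sketch like this are not expected to match them exactly.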

Results
To explore the effects of CD44 polymorphisms on lung adenocarcinoma risk, a total of 279 whole-blood specimens were collected from lung adenocarcinoma patients. EGFR sequencing data were used to divide these specimens into the EGFR wild-type (n = 110, 39.4%) and EGFR mutation type (n = 169, 60.6%) groups. The clinical characteristics of the enrolled patients are presented in Table 1. Sex (p < 0.001), cigarette smoking status (p < 0.001), and cell differentiation (p = 0.001) differed significantly between patients with the EGFR wild-type and those with the EGFR mutation type (Table 1). In comparison with the EGFR wild-type group, the EGFR mutation type group had a higher proportion of female patients (n = 109, 64.5%), nonsmokers (n = 131, 77.5%), well-differentiated tumors (n = 21, 12.4%), and moderately differentiated tumors (n = 138, 81.7%; Table 1).
The genotype distributions and associations between CD44 polymorphisms and lung adenocarcinoma are listed in Table 2. In the recruited lung adenocarcinoma patients, the most frequent genotypes for rs1425802, rs187115, rs713330, rs11821102, rs10836347, and rs13347 were heterozygous A/G, homozygous A/A, homozygous T/T, homozygous G/G, homozygous C/C, and heterozygous C/T, respectively. To minimize confounding, adjusted odds ratios (AORs) with 95% confidence intervals (CIs) were estimated using multiple logistic regression models after controlling for other variables. No significant differences were observed in the genotype distributions of CD44 polymorphisms between the EGFR wild-type and EGFR mutation type groups of patients with lung adenocarcinoma (Table 2). In addition, there were no significant associations between CD44 polymorphisms and cancer susceptibility in terms of sex (Table A1), smoking status (Table A2), and cell differentiation (Table A3).
Because the sex distribution of EGFR status differs among patients with lung adenocarcinoma in Taiwan, we investigated the association between CD44 polymorphisms and two EGFR hotspot mutations (L858R and exon 19 in-frame deletion) in male patients. In comparison with the EGFR wild-type group, an association was observed between the CD44 rs11821102 G/A polymorphism and the EGFR L858R mutation (AOR = 3.991, 95% CI = 1.002–16.499; p < 0.05) but not the EGFR exon 19 in-frame deletion (Table 3). No significant association was found between the other five CD44 polymorphisms and the two EGFR hotspot mutations in male patients (Table 3). In addition, no significant association was noted between CD44 polymorphisms and the two EGFR hotspot mutations in female patients (data not shown). These data suggest that men carrying at least one A allele in CD44 rs11821102 had a higher risk of developing lung adenocarcinoma harboring the EGFR L858R mutation than those with the homozygous GG genotype.
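Comparing carriers of at least one A allele with GG homozygotes corresponds to a dominant genetic model. A minimal sketch of that coding is shown below; the function name and the genotype list are hypothetical and are not study data.

```python
from collections import Counter


def dominant_code(genotype, variant_allele):
    """Return 1 if the genotype carries at least one variant allele, else 0."""
    alleles = genotype.replace("/", "").upper()
    if len(alleles) != 2:
        raise ValueError(f"expected a diploid genotype, got {genotype!r}")
    return int(variant_allele.upper() in alleles)


# Hypothetical rs11821102 genotype calls (illustrative only, not study data).
calls = ["GG", "GA", "GG", "AA", "G/A", "GG"]
coded = [dominant_code(g, "A") for g in calls]
counts = Counter(coded)  # 0 = GG reference group, 1 = GA + AA carriers
print(counts)
```

The resulting 0/1 variable is the kind of predictor that would enter a logistic regression model alongside the adjustment covariates.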
Subsequently, we investigated the relationships between CD44 polymorphisms and the clinicopathological characteristics of male patients with lung adenocarcinoma (n = 126). Using the AJCC TNM staging system for lung cancer [22], we observed that the CD44 rs713330 T/C polymorphism was highly associated with the "T" classification (AOR = 4.250, 95% CI = 1.529–11.814; p = 0.006) in male patients with lung adenocarcinoma (Table 4). To further examine the association between CD44 polymorphisms and clinicopathological characteristics by EGFR status, these male patients were stratified into EGFR wild-type (n = 66) and EGFR mutation type (n = 60) groups. In the EGFR wild-type group, both CD44 rs713330 T/C (AOR = 5.398, 95% CI = 1.203–24.223; p = 0.028) and rs10836347 C/T polymorphisms (AOR = 9.136, 95% CI = 1.028–81.200; p = 0.047) exhibited significant associations with the AJCC "T" classification (Table 4). These findings indicated that the CD44 rs713330 T/C polymorphism might be associated with tumor size and invasion in male patients with lung adenocarcinoma, particularly those with wild-type EGFR.
The adjusted odds ratios (AORs) with 95% confidence intervals (CIs) were estimated by multiple logistic regression models after controlling for age and gender. Abbreviations: SNP, single nucleotide polymorphism; AOR, adjusted odds ratio; CI, confidence interval.
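Because a Wald interval is symmetric on the log-odds scale, a reported AOR and its 95% CI can serve as a rough consistency check on the reported p-value. The sketch below assumes the intervals are Wald intervals from the logistic models (the text does not state this) and uses a hypothetical helper name. It is applied to the values reported for rs713330 and the "T" classification in all male patients.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald p-value from an OR and its 95% CI."""
    # Standard error of log(OR) recovered from the width of the log-scale CI.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported in the text for rs713330 T/C in all male patients.
p_approx = p_from_or_ci(4.250, 1.529, 11.814)
print(f"approximate p = {p_approx:.4f}")
```

Under these assumptions the back-calculated value lands close to the reported p = 0.006. A large discrepancy for any other reported triple would suggest a different interval method or a transcription error.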

Discussion
In the present study, we demonstrated the clinical relevance of CD44 polymorphisms in male patients with lung adenocarcinoma. We found that male patients carrying the CD44 rs11821102 variant genotypes (GA + AA) had a higher risk of lung adenocarcinoma harboring the EGFR L858R mutation. Furthermore, the CD44 rs713330 T/C polymorphism exhibited a significant association with the AJCC "T" stage classification. The classification of the "T" stage depends on the size of the primary tumor and its invasion of adjacent tissues [22]. Therefore, the CD44 rs713330 polymorphism may be associated with the primary tumor size and invasion of lung adenocarcinoma in male patients, particularly those with wild-type EGFR.
EGFR mutations occur predominantly in patients with lung adenocarcinoma and are more frequent in female patients and nonsmokers [2,23–25]. Most of the patients with lung adenocarcinoma recruited in the present study had an EGFR mutation. In the collected samples, EGFR mutations were more common in female patients than in male patients, in never-smokers than in smokers, and in well and moderately differentiated lung tumors. In addition, the prevalence of EGFR mutations is higher in Asian populations than in Caucasians [2]. Indeed, the sex distribution of EGFR status differs among patients with lung adenocarcinoma in Taiwan. Because our study included only Taiwanese patients, further studies are needed to determine whether CD44 gene polymorphisms are a risk factor for susceptibility to lung adenocarcinoma in male patients, particularly those with wild-type EGFR, in other populations. We found a strong association between the CD44 rs11821102 G/A polymorphism and the EGFR L858R mutation in male patients, but not in female patients. Thus, the association between CD44 polymorphisms and susceptibility to lung adenocarcinoma showed sexual dimorphism. Sex differences have been observed in most human diseases, including lung cancer, colorectal cancer, and malignant melanoma [21,26,27]. Previous studies have shown that polymorphisms in CYP1A1 and GSTM1 [28] and EGFR [21] contribute to the increased risk of lung cancer in women. Our data and these previous studies suggest that genetic and hormonal differences between men and women could affect gene expression patterns, leading to different cancer progression and risks in specific subgroups.
Human CD44 encodes many protein isoforms, which have been reported to participate in many biological processes [9,13]. Here we demonstrated that male individuals carrying at least one A allele in CD44 rs11821102 had a higher risk of lung adenocarcinoma harboring the EGFR L858R mutation than those with the homozygous GG genotype. Rs11821102 is located in the noncoding 3' UTR of CD44, a region in which variants can affect microRNA binding and thereby interfere with mRNA expression [29]. The data suggest that the CD44 rs11821102 G/A polymorphism might alter the binding of specific microRNAs, thereby changing specific CD44 mRNA subtypes in male patients with lung adenocarcinoma harboring the EGFR L858R mutation. Future research could identify which microRNAs are affected by the CD44 rs11821102 G/A polymorphism and investigate the underlying mechanisms.
Studies have revealed that overexpression of the CD44 variant exon 6 is associated with tumor differentiation, clinical TNM stage, and lymph node metastasis in patients with NSCLC [10,30]. The expression of CD44 proteins in tumor tissues and serum samples from patients with NSCLC has been found to be significantly associated with clinicopathological factors including T stage, N stage, and pathological stage [12,31]. We observed that the CD44 rs713330 T/C polymorphism was highly associated with primary tumor size and invasion (AJCC "T" classification) in male patients with lung adenocarcinoma. Rs713330, located in an intron of CD44, is in linkage disequilibrium with the nonsynonymous rs9666607 G/A polymorphism, which may change arginine to lysine at residue 417 [15]. The present data suggest that the CD44 rs713330 T/C polymorphism may alter CD44 protein subtype expression and thereby be associated with tumor size and invasion in male patients with lung adenocarcinoma, particularly those with wild-type EGFR. Furthermore, in the subgroup of male patients harboring the EGFR wild-type, the rs10836347 C/T polymorphism was also associated with tumor size and invasion. CD44 polymorphisms may cause changes in the subtype and content of the mRNA and/or protein of CD44. Therefore, whether CD44 genetic polymorphisms, such as rs713330 and rs10836347, directly affect the clinicopathological characteristics of lung adenocarcinoma merits further attention.
Several SNPs of CD44 have been reported to correlate significantly with the risk of various cancers in Asians [20,32]. The rs10836347 C/T polymorphism is associated with a higher risk of hepatocellular carcinoma in individuals who consume alcohol [33]. To our knowledge, this is the first report indicating that rs10836347 C/T and rs713330 T/C polymorphisms may increase lung adenocarcinoma risk and cause tumor growth, respectively, in male patients. Liu et al. reported that the G allele of rs187115 significantly increases the risk of NSCLC and bone metastasis compared with the AA genotype [34]. However, we did not observe an association between rs187115 and lung adenocarcinoma risk, which may be attributed to our lack of healthy controls and the limited sample size.
The current study has certain limitations. Because overall survival and disease-free survival data were lacking, the relationship between CD44 polymorphisms and follow-up clinical outcomes could not be analyzed. Moreover, further research in larger cohorts, other populations, and international multicenter studies is needed to investigate and confirm the association between these CD44 SNPs and the clinicopathological characteristics of patients with lung adenocarcinoma.

Conclusions
The present results suggest that CD44 rs11821102 G/A polymorphism is a potential candidate target for the prediction of lung adenocarcinoma with the EGFR L858R mutation in male patients. Our data enhance understanding of the involvement of CD44 SNPs in cancer susceptibility and clinicopathological characteristics in male patients with lung adenocarcinoma. Data from the present study suggest that CD44 SNPs may be a useful prediction marker for male patients with lung adenocarcinoma.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A
Table A1. The associations between the polymorphisms of CD44 and the sex status in lung adenocarcinoma. [Table body not available; recoverable column headers: SNP, Genotypes, All.]
The AORs with 95% CIs were estimated by multiple logistic regression models after controlling for age and gender. Abbreviations: SNP, single nucleotide polymorphism; AOR, adjusted odds ratio; CI, confidence interval.