TMPRSS4: A Novel Tumor Prognostic Indicator for the Stratification of Stage IA Tumors and a Liquid Biopsy Biomarker for NSCLC Patients

Abstract: Relapse rates in surgically resected non-small-cell lung cancer (NSCLC) patients are between 30% and 45% within five years of diagnosis, which highlights the clinical need to identify patients at high risk of recurrence. The eighth TNM staging system recently refined the classification of NSCLC patients and their associated prognosis, but molecular biomarkers could improve prediction of the heterogeneous outcomes found within each stage. Here, using two independent cohorts (MDA and CIMA-CUN) and the eighth TNM classification, we show that TMPRSS4 protein expression is an independent prognostic factor in NSCLC, particularly for patients at stage I (relapse-free survival (RFS) HR, 2.42 (95% CI, 1.47-3.99), p < 0.001; overall survival (OS) HR, 1.99 (95% CI, 1.25-3.16), p = 0.004). In stage IA, high levels of this protein remained associated with worse prognosis (p = 0.002 for RFS and p = 0.001 for OS). As TMPRSS4 expression is epigenetically regulated, its methylation status could be measured in circulating tumor DNA from liquid biopsies to monitor patients. We developed a digital droplet PCR (ddPCR) method to quantify absolute copy numbers of methylated and unmethylated CpGs within the TMPRSS4 and SHOX2 (as control) promoters in plasma and bronchoalveolar lavage (BAL) samples. In case-control studies, we demonstrated that TMPRSS4 hypomethylation can be used as a diagnostic tool in early stages, with an AUROC of 0.72 (p = 0.008; 91% specificity and 52% sensitivity) for BAL and 0.73 (p = 0.015; 65% specificity and 90% sensitivity) for plasma. In conclusion, TMPRSS4 protein expression can be used to identify very early-stage NSCLC patients at high risk of relapse/death. Moreover, analysis of TMPRSS4 methylation status by ddPCR in blood and BAL is feasible and could serve as a non-invasive biomarker to monitor surgically resected patients.

Introduction
Management of lung cancer, the leading type of cancer worldwide, is still challenging, and mortality rates have not substantially decreased in recent years [1]. Lung cancer is often diagnosed at advanced stages, when curative options are limited. When diagnosed in early stages, patients with lung cancer can be successfully treated with surgery. Nonetheless, recurrence rates after complete surgical resection are between 30% and 45% within five years of diagnosis [2]. Therefore, identifying factors that predict which patients are at risk of recurrence after surgery is necessary for better management of the disease. Currently, the tumor-node-metastasis (TNM) staging system is the only routine method to estimate prognosis in non-small-cell lung cancer (NSCLC) patients. However, in spite of the new classification (eighth edition [3]), survival prediction is not fully accurate, as different clinical outcomes are observed in patients within the same TNM stage. It has been suggested that molecular biomarkers could help identify patients with poor prognosis based on biological malignant features [4], although such biomarkers have yet to be incorporated into clinical practice.
Transmembrane protease serine 4 (TMPRSS4) is a member of the type II transmembrane serine protease (TTSP) family of genes, located in the long arm of chromosome 11 (11q23.3). TMPRSS4 is highly upregulated in solid tumors (including NSCLC), where it plays a role in facilitating the growth and metastatic spread of cancer cells [5,6]. Upregulation and association with poor prognosis were described for several cancer types, including NSCLC [7], pancreas [8], breast [9,10], and esophageal cancer [11]. The high expression of TMPRSS4 in tumors is a consequence of aberrant hypomethylation, which is also associated with poor prognosis in NSCLC patients [7]. TMPRSS4 provides cancer stem cell (CSC) properties to lung tumor cells and makes them resistant to chemotherapy [5]. We previously showed in animal models that abrogation of TMPRSS4 using short hairpin RNA (shRNA) strategies impedes tumor homing and growth [5], suggesting that targeting this protein in NSCLC may result in a strong therapeutic effect. Two TMPRSS4-specific compounds were recently shown to inhibit protein activity and tumor growth in prostate cancer models [12]. Therefore, TMPRSS4 is an emerging candidate biomarker and therapeutic target in NSCLC patients.
The fact that TMPRSS4 expression is epigenetically regulated by DNA methylation suggests that methylation status could be used as a biomarker in liquid biopsy through analysis of circulating tumor (ct)DNA. Liquid biopsy-based assays are used for diagnosis, prognostication, and monitoring of lung cancer [13]. One important advantage of methylation-based biomarkers is that DNA methylation is a highly stable covalent modification that occurs early during tumor progression and can be detected in fluids by PCR methods. Although the fraction of ctDNA obtained from fluids is often low (<1.0%), highly efficient amplification methods can accurately quantify methylation changes. Digital droplet PCR (ddPCR) is an ultrasensitive technology in which the PCR reaction is partitioned into thousands of individual reactions, allowing absolute quantification of the number of copies of an abnormal target DNA. This technique is used to quantify extremely low numbers of DNA copies in liquid biopsies. In lung cancer, ddPCR is used mainly to detect actionable mutations (such as EGFR) in cell-free DNA from plasma samples [14]. However, studies using ddPCR to quantify epigenetic alterations in fluids from NSCLC patients are lacking.
In this study, we addressed whether protein expression of the type II transmembrane serine protease TMPRSS4 could be used as a prognostic indicator in NSCLC tumors, following the eighth TNM classification. In addition, we assessed whether TMPRSS4 promoter methylation status could serve as a biomarker in plasma and bronchoalveolar lavage (BAL) samples to differentiate between NSCLC patients and healthy controls. We also evaluated the methylation status of short stature homeobox 2 (SHOX2), as aberrant hypermethylation of this gene has been shown in several studies to be a reliable diagnostic biomarker in plasma [15-17] and bronchial aspirates [18] and is currently in clinical development.
Using a large cohort of patients and a validation cohort, we show here that TMPRSS4 protein expression is an independent prognostic factor in NSCLC, mainly in stage IA. In addition, we developed a robust ddPCR method to quantify CpG methylation levels within the TMPRSS4 and SHOX2 promoters in blood and BAL that can differentiate between NSCLC patients and tumor-free individuals.
All cell lines were periodically tested with the MycoAlert Mycoplasma Detection Kit (Lonza) to ensure that only mycoplasma-free cells were used.

Cohort of Patients for Immunohistochemical Analysis of TMPRSS4
Samples from primary lung cancer were collected from surgical specimens obtained at the University of Texas MD Anderson Cancer Center (Houston, TX) (MDA cohort) and CIMA-Clinica Universidad de Navarra (Pamplona, Spain) (CIMA-CUN cohort). Inclusion criteria were as follows: patients with complete resection of the primary tumor and absence of chemo or radiotherapy treatment prior to surgery. Lung tumors were classified according to the World Health Organization 2004 classification, and the eighth TNM edition was used for tumor stratification [3]. The MDA cohort was composed of 489 lung cancer patients diagnosed from 2006 to 2009 at the MDA. The CIMA-CUN cohort contained 95 patients diagnosed from 2000 to 2013. Reported recommendations for tumor marker prognostic studies (REMARK) criteria were followed [19]. This study was conducted according to the Declaration of Helsinki, and was approved by the Institutional Review Boards and Ethical committees of the participating institutions. Written informed consent was obtained from each patient. Detailed clinical and pathological information of the cohorts is summarized in Table 1.

Cohort of Patients for Methylation Analysis of TMPRSS4
To evaluate the methylation status of TMPRSS4 and SHOX2 by ddPCR in tissue specimens, a cohort of 59 patients from Hospital General Universitario de Valencia (HGUV) was used (Table S1, Supplementary Materials). This cohort included malignant and adjacent non-malignant tissue from patients mostly with squamous cell carcinomas (SCCs), stages I-III. Samples were obtained and snap-frozen at −80 °C until use.
Bronchoalveolar lavages (BAL) included 79 samples from patients with lung cancer and 26 tumor-free controls obtained from CIMA-CUN. Stage I-IV adenocarcinomas (ADCs), SCCs, and a few cases of other histological types were included in this cohort (Table S2, Supplementary Materials). Upon collection, samples were centrifuged and stored in cryovials at −80 °C until use. In available cases, an aliquot of each sample was smeared on a slide to determine the presence of malignant cells.
Plasma samples were obtained from the University of Navarra Biobank, 89 of which corresponded to patients with NSCLC and 25 to healthy individuals. This cohort included ADC and SCC from stage I-IV NSCLC patients (Table S3, Supplementary Materials). Plasma samples were processed within 2 h of blood extraction and frozen at −80 °C. Studies were approved by the Ethical and Scientific Committee of the University of Navarra with written informed consent from the patients (project number: 13/2016).
The characteristics of control individuals and patients for the study of BAL and plasma samples were similar between both groups in terms of age, sex, and smoking habits.

Evaluation of TMPRSS4 Expression by Immunohistochemistry
For immunohistochemistry, tissue microarrays (TMAs) containing three representative tissue cores per case were built. Slides were deparaffinized and rehydrated. Endogenous peroxidase was blocked with 3% hydrogen peroxide, and antigen retrieval was performed by heating the samples in a microwave oven using citrate buffer (10 mM, pH 6). After overnight incubation at 4 °C with a previously validated anti-TMPRSS4 primary antibody (Ingenasa Inc.; 1:500) [20], the Advance™ HRP system (Dako, Glostrup, Denmark) was used for detection of the signal. Finally, slides were counterstained with hematoxylin, dehydrated, and cover-slipped with DPX mounting medium (VWR, Barcelona, Spain). Slides were scanned with the Aperio CS2 scanner (Leica, Barcelona, Spain) at 20× magnification, and images were visualized with the Aperio Image Scope (v12.1.05029). Scores were established by semiquantitative analysis as previously described [21]. Briefly, staining was evaluated independently by two observers (M.J.P. and C.S.) who were unaware of the clinical features of the patients. The extension of staining was scored as the percentage of positive cells (0-100%), and the intensity of staining was graded as 1+ (weak), 2+ (moderate), or 3+ (strong). An H-score was established for each patient using the extension and intensity parameters. Both the median and quartiles were tested as cut-off values to define high/low TMPRSS4 expression levels.
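The scoring and dichotomization described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes the conventional H-score definition (sum over intensities of intensity × percentage of cells at that intensity, range 0-300), which the text does not spell out, and the function names `h_score` and `dichotomize` are hypothetical.

```python
from statistics import median, quantiles


def h_score(percent_by_intensity):
    """Conventional H-score (assumed definition): sum of intensity x percent.

    `percent_by_intensity` maps staining intensity (1, 2 or 3) to the
    percentage of cells (0-100) stained at that intensity.
    """
    for intensity, percent in percent_by_intensity.items():
        if intensity not in (1, 2, 3) or not 0 <= percent <= 100:
            raise ValueError("intensity must be 1-3 and percent 0-100")
    if sum(percent_by_intensity.values()) > 100:
        raise ValueError("percentages cannot add up to more than 100")
    return sum(i * p for i, p in percent_by_intensity.items())


def dichotomize(scores, cutoff="median"):
    """Label each score as high (True) or low (False) expression.

    cutoff="median": high means above the cohort median.
    cutoff="q4":     high means above the third quartile (top 25%).
    """
    if cutoff == "median":
        threshold = median(scores)
    elif cutoff == "q4":
        threshold = quantiles(scores, n=4)[2]  # boundary between Q3 and Q4
    else:
        raise ValueError("cutoff must be 'median' or 'q4'")
    return [s > threshold for s in scores]
```

For example, a tumor with 20% weak, 30% moderate, and 10% strong staining would receive an H-score of 110 under this assumed definition. The quartile boundary depends on the interpolation method of the statistics package, so cut-offs from different software may differ slightly.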

DNA Isolation and Bisulfite Conversion
DNA from tissues and cells was isolated with the NucleoSpin Tissue™ kit (Macherey-Nagel). For each sample, we used five slides of 3-µm-thick paraffin-embedded tissue. For BAL and plasma, a 1-mL sample from each patient was processed, and DNA was isolated with the QIAamp Circulating Nucleic Acid (Qiagen, Germantown, MD, USA) and the QIAamp DNA Blood (Qiagen) kits, following the protocols provided with the kits. Bisulfite conversion was carried out with the EZ-96 DNA Methylation-Lightning™ Kit (Zymo Research, Irvine, CA, USA). The ddPCR reactions were performed within 48 h of bisulfite conversion to reduce the possibility of DNA degradation.

Digital Droplet PCR to Detect the Methylation Status of TMPRSS4 and SHOX2
For the ddPCR, specific probes to identify either the methylated (labeled with FAM) or the unmethylated (labeled with HEX) CpGs were synthesized (Table S4, Supplementary Materials). Probes were 20-24 nt long, contained the CpG of interest in the middle of the sequence, and were devoid of SNPs. Primers flanking the probes were common to the methylated and unmethylated sequences and did not contain any CpG. For TMPRSS4, the CpG located 70 bp upstream of the transcription start site (cg25116503 probe from the Infinium 450k methylation array) was evaluated. This CpG was selected based on our previous study showing strong hypomethylation in NSCLC specimens in comparison with non-malignant lungs [7]. In the case of SHOX2, the selected CpG was based on previous publications that used qMSP-PCR in plasma to differentiate healthy individuals from patients with NSCLC [18]. Primers and probes for ddPCR were designed according to Bio-Rad recommendations (http://www.bio-rad.com).
Droplets were generated with the QX200™ Droplet Generator (Bio-Rad, Hercules, CA, USA) prior to DNA amplification, which was run under the following conditions: 95 °C for 10 min; 40 cycles of 94 °C for 30 s and 52 °C for 1 min; and 98 °C for 10 min. The optimal annealing temperature was selected after performing a temperature gradient assay for each primer/probe set for each gene. DNA amplification was carried out in a C1000 Touch™ (Bio-Rad) thermocycler. After the PCR, the QuantaSoft™ software (Bio-Rad) was used for the analysis, using the RED (rare event detection) option. Samples that did not reach 10,000 events per well were discarded.
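For readers unfamiliar with how ddPCR yields absolute copy numbers, the following sketch shows the standard Poisson correction applied to droplet counts and the resulting percentage of methylation. It is an illustration under stated assumptions, not the vendor's algorithm: the droplet volume of 0.85 nL is an assumed nominal value, the 10,000-event filter is interpreted as a minimum number of accepted droplets per well, and the function names are hypothetical.

```python
import math

DROPLET_VOLUME_UL = 0.00085  # assumed nominal droplet volume (0.85 nL)
MIN_DROPLETS = 10_000        # wells below this number of events are discarded


def copies_per_ul(positive, total, droplet_volume_ul=DROPLET_VOLUME_UL):
    """Poisson-corrected target concentration (copies/uL of reaction)."""
    if total < MIN_DROPLETS:
        raise ValueError("well discarded: fewer than 10,000 events")
    if not 0 <= positive < total:
        raise ValueError("positive droplets must be in [0, total)")
    # Mean copies per droplet: lambda = -ln(fraction of negative droplets)
    lam = math.log(total / (total - positive))
    return lam / droplet_volume_ul


def percent_methylation(fam_positive, hex_positive, total):
    """Methylated copies (FAM) as a percentage of all copies (FAM + HEX)."""
    methylated = copies_per_ul(fam_positive, total)
    unmethylated = copies_per_ul(hex_positive, total)
    if methylated + unmethylated == 0:
        raise ValueError("no target copies detected")
    return 100 * methylated / (methylated + unmethylated)
```

With 1,500 FAM-positive droplets out of 15,000, the sketch gives roughly 124 methylated copies/µL. The Poisson term matters because a droplet may contain more than one copy, so simply counting positive droplets would underestimate the concentration.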

Statistical Analyses
Normality of the data was assessed with the Shapiro-Wilk test. The association between TMPRSS4 expression and clinicopathological features of patients was analyzed by Pearson's chi-square test. Relapse-free survival (RFS) and overall survival (OS), defined as the time from the date of surgery to the date of recurrence or death, respectively, were evaluated with Kaplan-Meier curves, and significant differences among groups were assessed by the log-rank test. For survival analyses, the follow-up period was restricted to 100 months. To evaluate the prognostic value of TMPRSS4, univariable and multivariable Cox proportional hazard analyses were used. Only those variables with p ≤ 0.1 in the univariable analysis were included in the multivariable analysis.
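As an illustration of the survival estimate underlying these analyses, a minimal Kaplan-Meier estimator is sketched below. It is not the SPSS/STATA implementation used in the study. It assumes that restricting follow-up to 100 months means censoring patients at 100 months if they are still event-free at that time, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events, max_follow_up=100.0):
    """Return [(time, survival probability)] at each observed event time.

    times:  follow-up in months from surgery.
    events: True if relapse/death was observed, False if censored.
    Follow-up beyond `max_follow_up` is censored at that time (assumption).
    """
    data = []
    for t, e in zip(times, events):
        if t > max_follow_up:
            t, e = max_follow_up, False  # administrative censoring
        data.append((t, bool(e)))
    data.sort()

    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_events + n_censored
    return curve
```

For four patients followed for 10, 20, 30, and 120 months, with the second censored, the estimate drops to 0.75 at 10 months and 0.375 at 30 months, while the fourth patient is censored at 100 months and contributes no event.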
To compare levels of TMPRSS4 and SHOX2 promoter DNA methylation in plasma and BAL samples from normal individuals vs. patients with NSCLC, the Mann-Whitney U test was used. Receiver operating characteristic (ROC) curves were generated to evaluate the diagnostic ability of the biomarkers. The Youden index was used to determine the optimal cut-off values in the ROC curves and to select the corresponding sensitivity and specificity values. Logistic regression was used to estimate the combined diagnostic potential of both TMPRSS4 and SHOX2 in BAL and plasma samples. Statistical analyses were performed with SPSS 15.0 (Madrid, Spain), STATA/IC 12.1 (College Station, TX, USA), and GraphPad Prism 5 (San Diego, CA, USA) software. Statistical significance was defined as p < 0.05 (*), p < 0.01 (**), and p < 0.001 (***).
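The ROC analysis and the Youden-based cut-off selection can be sketched as follows for a marker in which lower values indicate disease (as for TMPRSS4 hypomethylation). This is a minimal illustration with hypothetical function names, not the statistical software used in the study; for a marker in which higher values indicate disease (such as SHOX2 hypermethylation), the comparisons would be reversed.

```python
def auroc(controls, cases):
    """AUROC as the probability that a case has a lower value than a control
    (ties count one half); equivalent to the scaled Mann-Whitney U statistic."""
    wins = 0.0
    for case in cases:
        for control in controls:
            if case < control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def max_youden(controls, cases):
    """Return (J, threshold, sensitivity, specificity) maximizing
    J = sensitivity + specificity - 1, calling a sample positive when its
    methylation value is <= threshold."""
    best = None
    for threshold in sorted(set(controls) | set(cases)):
        sensitivity = sum(c <= threshold for c in cases) / len(cases)
        specificity = sum(h > threshold for h in controls) / len(controls)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, threshold, sensitivity, specificity)
    return best
```

The Youden values reported in the Results are consistent with this definition: 0.52 + 0.91 − 1 = 0.43 for BAL and 0.90 + 0.65 − 1 = 0.55 for plasma.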

Prognostic Value of TMPRSS4 Protein Expression in NSCLC
TMPRSS4 protein expression was first examined in relation to survival. The median H-score was used to categorize patients into high vs. low TMPRSS4 tumor expression. Kaplan-Meier survival curves and log-rank tests considering all NSCLC stages showed that TMPRSS4 levels above the median were significantly associated with both reduced RFS (p = 0.004) and OS (p = 0.01) in the MDA cohort (489 cases) (Figure 1A,B). Similar results were obtained in the CIMA-CUN cohort (n = 95 cases); however, in this case, a tendency toward significance (p = 0.12) for RFS and a marginally significant difference (p = 0.0499) for OS were observed (Figure S1, Supplementary Materials).
Figure 1. (A,B) Patients with TMPRSS4 levels above the median showed worse prognosis. (C,D) When patients were stratified by quartiles, prognostic differences between both groups were larger: levels within the top 25% (Q4) were associated with lower RFS (p < 0.001) (C) and OS (p < 0.001) (D). (E,F) RFS (E) and OS (F) comparing stage IA and IB patients. (G) RFS analysis in stage IA and IB patients upon stratification by TMPRSS4 protein expression. The top 25% of protein expression (Q4) was considered as high level. In stage IA, TMPRSS4 levels in Q4 were significantly associated with lower RFS (p = 0.002) as compared to patients within the same stage with low TMPRSS4 levels. In stage IB, the same tendency was observed for RFS, but the difference was not statistically significant. (H) Evaluation of OS rendered similar results to those found for RFS; n = 187 stage IA, n = 95 stage IB.
We observed in the TMAs some cases with a very high TMPRSS4 H-score and wondered whether prognosis could be predicted more accurately based on high expression within the top 25% H-score. Therefore, patients were stratified in quartiles, with Q4 representing the one with the highest H-score. Figure 1C,D show an improvement in stratification, where patients with an H-score within Q4 had a much worse outcome (p < 0.001 for both RFS and OS) than those included in Q1-Q3. Therefore, this dichotomization was able to better define patients at risk of relapse and death. Such a result was not found in the CIMA-CUN cohort, likely due to limited statistical power. Relationships between clinicopathological variables and TMPRSS4 expression considering Q4 as the cut-off in the MDA cohort are shown in Table S6 (Supplementary Materials).
When the MDA cohort was separated by histologies, in both ADC and SCC, levels within Q4 were also very significantly associated with worse outcome, except for OS in the case of SCC, where only a non-significant trend was observed (Figure S2A-D, Supplementary Materials). Similar results were obtained when the median was considered as a cut-off, but p-values were inferior to those found for the top quartile (Figure S2E-H, Supplementary Materials).
To further evaluate the prognostic significance of TMPRSS4 expression, we performed univariable and multivariable Cox proportional hazards analyses using Q4 as the cut-off in the MDA cohort. In univariable analysis (Table S7, Supplementary Materials), patients with high TMPRSS4 levels also showed worse RFS (HR, 2.09 (95% CI, 1.53-2.87), p < 0.001) and OS (HR, 1.82 (95% CI, 1.38-2.41), p < 0.001). For the multivariable analysis, we considered variables whose p-values were significant or close to significance (p ≤ 0.1) in the univariable test. Results shown in Table 2 reveal that TMPRSS4 is an independent prognostic marker of RFS (HR, 1.82 (95% CI, 1.28-2.60), p = 0.001) and OS (HR, 1.44 (95% CI, 1.07-1.94), p = 0.014). We then focused on stages IA/IB, as molecular biomarkers might help the stratification of patients at risk of relapse/death and guide clinical management. Clinicopathological characteristics of this subgroup of patients can be found in Table S8 (Supplementary Materials). In the eighth TNM classification of NSCLC, prognosis for stage IB was found not to differ from that of stage IA [3]. In agreement with these results, no differences between both stages were observed for either RFS (p = 0.27) or OS (p = 0.57) in our cohort of patients (MDA, n = 187 stage IA; n = 95 stage IB) (Figure 1E,F). However, when considering the protein expression of TMPRSS4 using Q4 as the threshold, we were able to substratify stage IA patients, since those with high TMPRSS4 levels showed a very significantly reduced RFS (p = 0.002) and OS (p < 0.001) (Figure 1G,H). A similar tendency was observed for stage IB, although the differences were not statistically significant (Figure 1G,H). Univariable (not shown) and multivariable (Table 3) analyses considering stages IA/B also verified that TMPRSS4 was an independent prognostic factor in this early stage (RFS HR, 2.42 (95% CI, 1.47-3.99), p < 0.001; OS HR, 1.99 (95% CI, 1.25-3.16), p = 0.004).
We also performed this analysis using quartiles in stages II and III-IV and found no significant differences based on TMPRSS4 expression (not shown). Therefore, we conclude that TMPRSS4 is an independent prognostic marker for NSCLC, especially for very early stages, where it can significantly differentiate patients with a more aggressive disease.

Development of TMPRSS4 and SHOX2 Methylation Assays by ddPCR
Our previous study using pyrosequencing and DNA methylation arrays found significant TMPRSS4 promoter hypomethylation of certain CpGs in tumors from NSCLC patients as compared to non-malignant lung specimens [7]. Therefore, our next goal was to evaluate whether TMPRSS4 methylation status could be quantified in liquid biopsy by ddPCR to assess differences between controls and NSCLC patients. To this aim, we set up experimental conditions to evaluate one of these CpGs (cg25116503) within the TMPRSS4 promoter by ddPCR. In parallel, we developed a ddPCR assay for SHOX2, a validated epigenetic diagnostic biomarker for lung cancer. To select the promoter region that differentiates between controls and patients and to design the ddPCR primers, we took into consideration previous reports on SHOX2 methylation in NSCLC that were conducted with other analytical methods [15,17].
As controls for the TMPRSS4 assay, we selected the following cell lines: H2170, with hypomethylated cg25116503 and high TMPRSS4 expression, and H1703, with hypermethylated cg25116503 and low TMPRSS4 expression, based on our previous analysis using the 450k methylation array. We first used 500 ng of DNA from these cell lines and quantified the absolute number of methylated and unmethylated copies of cg25116503. Figure 2A shows representative two-dimensional (2D) QuantaSoft images in H1703 and H2170, providing evidence that both methylated and unmethylated cg25116503 were accurately amplified in agreement with the expected pattern. The percentage of methylation was highly coincident between the 450k methylation array and the ddPCR (Figure 2B).
Taking into account that DNA concentration in plasma and BAL is low and that bisulfite treatment damages the DNA, we performed ddPCR assays using different starting amounts of DNA for conversion and different amounts of converted DNA to load into the ddPCR wells: 500/50 ng; 200/50 ng; 100/40 ng; 50/20 ng; 25/10 ng; 10/4 ng (Figure 2C). We found that, with an initial DNA amount of 50 ng for conversion and a loading amount of 20 ng, methylation levels were accurately measured, and the expected percentage of methylated/unmethylated copies was maintained (Figure 2C). In parallel, we also performed dilution experiments in which, starting from 50 ng and upon conversion, we loaded 50, 25, 10, 5, 1, 0.5, and 0.05 ng per well in each ddPCR reaction. As expected, a concentration-dependent decrease in the number of copies/µL was found for both methylated and unmethylated CpGs (Figure 2D,E).
Then, using the combination 50/20 ng DNA, we quantified by ddPCR a panel of 46 lung cancer cell lines in which we previously determined the methylation status of TMPRSS4 by 450k methylation arrays. As shown in Figure 2F, the methylation status was highly concordant (r = 0.90, p < 0.001) between both techniques. In agreement with our previous findings [7], TMPRSS4 methylation levels (determined by either the 450k methylation array or ddPCR) were inversely correlated with TMPRSS4 messenger RNA (mRNA) levels (p < 0.001, Figure 2G). The same optimization assays were performed for SHOX2 using control cells with high (H2170, 98%), medium (COR-L88, 51%), and low (LXF-289, 37%) methylation levels, based on data from the 450k array (not shown).
The next step was to quantify the methylation status of the TMPRSS4 promoter by ddPCR in tumor samples from NSCLC patients, in order to validate our previous results using pyrosequencing and 450k arrays, which showed hypomethylation of cg25116503 in tumors. In the HGUV cohort, comprising 59 stage I-III tumors and their matched non-malignant samples, a significant hypomethylation (p < 0.001) of cg25116503 was found in tumors. Representative 2D QuantaSoft images for non-malignant and malignant samples are shown in Figure 3A, and average values in patients are shown in Figure 3B. The area under the ROC (AUROC) curve was 0.73 (95% CI, 0.63-0.83; p < 0.001) (Figure 3C).

Diagnostic Potential of TMPRSS4 and SHOX2 Methylation in Liquid Biopsy Evaluated by ddPCR
To evaluate the performance of the ddPCR in liquid biopsies, we first used BAL samples from a cohort of 79 NSCLC patients and 26 controls (tumor-free). Significant hypomethylation (p < 0.01) was found for TMPRSS4 in patients with early stage (I-II) NSCLC in comparison with controls (Figure 4A), with an AUROC of 0.72 (95% CI, 0.57-0.87; p = 0.008) (Figure 4B). The maximum Youden index (maxYouden) was 0.43, with a specificity (SP) and sensitivity (SE) for TMPRSS4 methylation (TMPRSS4 meth) status in early stage tumors of 91% and 52%, respectively. Considering all stages (I-IV), no significant differences were observed between controls and NSCLC patients (Figure 4A). The value of AUROC in this case was 0.59 (95% CI, 0.47-0.71; p = 0.16) (Figure 4C).
We next used plasma from 89 patients with NSCLC and 25 tumor-free individuals (controls) for the study of both TMPRSS4 and SHOX2 methylation by ddPCR (1 mL for each gene). In early stages, a significant hypomethylation was found for TMPRSS4 (p < 0.05) (Figure 5A), with an AUROC of 0.73 (95% CI, 0.54-0.90; p = 0.015) (Figure 5B). The maxYouden index was 0.55, and SP and SE were 65% and 90%, respectively. Considering late stages (III-IV), no statistical differences between [...]
A very significant inverse correlation between TMPRSS4 methylation status and SHOX2 methylation status was found in the group of early stages (r = −0.82; p < 0.001) and in all stages (r = −0.73; p < 0.001) (Figure S3, Supplementary Materials).
Using logistic regression, we assessed whether a combination of both biomarkers (TMPRSS4 methylation and SHOX2 methylation) would increase the diagnostic potential. In the case of BAL, the combined index resulted in a higher AUROC of 0.76 (95% CI, 0.62-0.86), but without statistical differences with each marker alone: p = 0.07 when compared to that of TMPRSS4 methylation and p = 0.08 when compared to that of SHOX2 methylation. In the case of plasma samples, no improvement was found (not shown).

Discussion
Due to the molecular heterogeneity of NSCLC, some tumors show a particularly malignant phenotype within the same TNM stage and are associated with poor prognosis. Accurate characterization of tumor stage is critical to predict prognosis and to help select the optimal treatment options, especially when decisions need to be made about adjuvant therapy in early stages. Therefore, molecular biomarkers that identify such malignant phenotypes could help in risk stratification. The latest TNM classification (eighth edition) introduced changes that define prognosis in a more refined way. Our present study of the prognostic role of TMPRSS4 in NSCLC using the eighth TNM classification was based on previous results from our group in a cohort of n = 79 patients using the seventh TNM edition, which suggested TMPRSS4 protein expression as a prognostic indicator [7]. We now demonstrate, in a large cohort of patients, that TMPRSS4 protein expression is an independent prognostic indicator, especially for very early stages, where it can significantly differentiate individuals with a more aggressive disease.
In agreement with results from the eighth TNM classification of NSCLC [3], our study shows that prognosis of stages IA and IB does not differ. Although adjuvant chemotherapy is routinely administered in stages II-III, its use in stage IA is detrimental. Regarding stage IB, there is still controversy, as some studies described beneficial effects [22] while others found the opposite results [23-25]. Therefore, molecular biomarkers that distinguish patients at high risk of recurrence within these early stages could contribute to clinical decisions, although such biomarkers are currently lacking. Our study shows that prognosis in stage IA patients with high TMPRSS4 expression is significantly worse than that of stage IA patients with low TMPRSS4 expression and overlaps with that found in stage IB patients with low TMPRSS4 expression. Thus, high expression of TMPRSS4 in stage IA may uncover a subgroup of patients with a more malignant phenotype. In the future, upon extended implementation of low-dose CT-based screening programs, the proportion and total numbers of stage I-II NSCLC cases diagnosed will increase. Thus, the need for new and robust tools for indeterminate nodule characterization and for early tumor prognostic classification will be more compelling [26].
We speculate that the malignant phenotype of TMPRSS4 high tumors is related to the acquisition of epithelial-to-mesenchymal transition (EMT) and cancer stem cell (CSC) characteristics early during carcinogenesis, which could allow tumor cells to escape from the primary tumor. We previously described that TMPRSS4 overexpression promotes metastatic dissemination and resistance to chemotherapy in experimental models [5,27]. In agreement with these results, downregulation of TMPRSS4 highly sensitizes lung cancer cells to chemotherapy, including cisplatin and paclitaxel [5]. Therefore, TMPRSS4 deserves further consideration as a protumorigenic target and a prognostic biomarker in early-stage NSCLC.
One unmet clinical need is the identification of biomarkers to follow up early-stage lung cancer patients who will relapse after surgery. In line with the prognostic role of TMPRSS4, we sought to develop a non-invasive method using highly sensitive techniques to differentiate between controls and NSCLC patients. Because TMPRSS4 expression in NSCLC is associated with hypomethylation of its promoter, and reduced methylation is related to poor outcome [7], we decided to investigate whether TMPRSS4 methylation status could constitute a biomarker of malignancy in liquid biopsy. Epigenetic changes occur early during tumor development and, thus, hold promise as diagnostic biomarkers for early stages [28]. Moreover, DNA methylation is a covalent and stable modification that can be detected in circulating free DNA. In the clinic, the most relevant epigenetic marker is O-6-Methylguanine-DNA Methyltransferase (MGMT) as a predictor of response to chemotherapy in glioblastoma [29]. In the case of lung cancer, several studies showed the diagnostic potential of SHOX2 methylation, and a SHOX2-based test was commercialized (www.epigenomics.com).
Conventional methylation-specific PCR methods are commonly used to quantify the DNA methylation status of different genes, but these methods show some shortcomings for clinical use. First, they require internal controls for normalization. Second, their sensitivity is in many cases insufficient when analyzing low amounts of DNA (as is the case for plasma and BAL). In contrast, ddPCR does not depend on calibration curves and provides an absolute copy number. Moreover, ddPCR can quantify low-abundance nucleic acids with much higher sensitivity than conventional PCR and, therefore, it is a more suitable method to evaluate DNA methylation changes in liquid biopsy.
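The absolute quantification provided by ddPCR rests on Poisson statistics applied to droplet counts. As a minimal sketch (not the analysis pipeline used in this study), the following Python snippet estimates copies per microliter of reaction from the number of positive and total droplets. The droplet volume of 0.85 nL, the function names, and the idea of summarizing methylation as an unmethylated fraction are assumptions made for illustration.

```python
import math


def ddpcr_copies_per_ul(positive, total, droplet_volume_nl=0.85):
    """Estimate target copies per microliter of reaction from droplet counts.

    droplet_volume_nl is an assumed nominal droplet volume; use the value
    specified for the instrument actually employed.
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    # Poisson correction: mean copies per droplet from the negative fraction
    copies_per_droplet = -math.log(1.0 - positive / total)
    # Convert the droplet volume from nanoliters to microliters
    return copies_per_droplet / (droplet_volume_nl * 1e-3)


def unmethylated_fraction(unmeth_positive, meth_positive, total,
                          droplet_volume_nl=0.85):
    """Hypothetical summary: unmethylated copies over all detected copies."""
    unmeth = ddpcr_copies_per_ul(unmeth_positive, total, droplet_volume_nl)
    meth = ddpcr_copies_per_ul(meth_positive, total, droplet_volume_nl)
    if unmeth + meth == 0:
        return float("nan")  # no target detected in either channel
    return unmeth / (unmeth + meth)
```

Because the estimate depends only on the fraction of negative droplets, no standard curve is needed, which is the property referred to above.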
In our study, we showed that the methylation status of TMPRSS4 and SHOX2 can be accurately quantified by ddPCR in both plasma and BAL from NSCLC patients and healthy individuals. The AUROC values for TMPRSS4 in plasma and BAL from early-stage (I-II) NSCLC and controls were 0.73 and 0.72, respectively. These data are similar to the value we found for tumors (AUROC = 0.73) and suggest that DNA methylation changes occurring within the tumor are also reflected in liquid biopsies. It is worth noting that the methylation status of TMPRSS4 was inversely correlated with the methylation status of SHOX2 in BAL but not in plasma. In our study, hypermethylation of SHOX2 showed lower diagnostic power than that reported by other studies for plasma or BAL, where AUROC values ranged from 0.78 to 0.91 [15][16][17]. This could be related to the larger number of patients included in those studies or to methodological differences.
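The AUROC values quoted above summarize how well a methylation measurement separates cases from controls. As an illustration only, using made-up scores rather than study data, AUROC can be computed as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The score is assumed here to be oriented so that higher values indicate disease (for example, an unmethylated fraction for a hypomethylated marker).

```python
def auroc(case_scores, control_scores):
    """AUROC as the fraction of case/control pairs ranked correctly."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0   # case ranked above control
            elif case == control:
                correct += 0.5   # ties count as half
    return correct / (len(case_scores) * len(control_scores))


# Hypothetical scores for illustration only; not data from this study
example_cases = [0.62, 0.48, 0.71, 0.35, 0.55]
example_controls = [0.30, 0.41, 0.52, 0.28]
example_auroc = auroc(example_cases, example_controls)
```

An AUROC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.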
Because of tumor heterogeneity, combining different epigenetic markers is likely to improve diagnostic capability, although this was not the case in our study when combining TMPRSS4 and SHOX2. A report combining SHOX2 and PTGER4 (prostaglandin E receptor 4) in plasma using several cohorts of patients obtained AUROC values between 0.88 and 0.98 [30]. An epigenetic classifier including four genes (HOXA9, RASSF1A, SOX17, and TAC1) that was evaluated in sputum by ddPCR reached an AUROC of 0.92 for lung cancer detection [31]. Another classifier that included the genes BCAT1, CDO1, TRIM58, and ZNF177 was analyzed using a logistic regression model in bronchial fluids, with a combined AUROC value of 0.91 [32].
Cytological examination of BAL is an important diagnostic method in NSCLC. Nonetheless, the sensitivity of this method is modest, with values of 45-50% reported in many studies (reviewed in [33]). Some papers showed that changes in DNA methylation of different genes have better diagnostic potential than cytology or can be used in combination with cytological analysis to increase sensitivity, although those results need validation. For example, Isle et al. [34] showed that SHOX2 hypermethylation aids cytology in NSCLC diagnosis. Also, the four-gene classifier mentioned above [32] improved diagnosis by cytological examination. Therefore, analysis of aberrant methylation in circulating free DNA or exfoliated tumor cells found in BAL may help to increase the sensitivity of cytology. A future prospective study combining TMPRSS4 methylation status and cytology could validate the role of this potential epigenetic biomarker in BAL samples.
Determination of the methylation status of TMPRSS4 (TMPRSS4 meth status) in blood may be useful in a clinical scenario where a surgically treated patient is at risk based on both stage and high TMPRSS4 tumor expression. Subsequent liquid biopsies based on ddPCR measurement of TMPRSS4 meth status in blood could be used as an auxiliary test to follow the patient's evolution and could eventually suggest the presence of a relapsed tumor. Further studies carried out in intended-use cohorts (in screening or prognostic settings) are needed to establish whether this liquid biopsy approach can be clinically validated for diagnostic or prognostic purposes.

Conclusions
We conclude from our study that TMPRSS4 is an independent prognostic indicator of reduced RFS and OS in early NSCLC, and that TMPRSS4 meth status differentiates between tumor-free individuals and those with NSCLC. Our data support the use of TMPRSS4 as an indicator of malignancy in early-stage NSCLC.
Supplementary Materials: The following are available online at http://www.mdpi.com/2077-0383/8/12/2134/s1. Figure S1: Prognostic value of TMPRSS4 including patients from all stages (I-IV) and early stages (I-II) in the CIMA-CUN cohort, and early stages (I-II) in the MDA cohort. RFS (A) and OS (B) in patients from CIMA-CUN stratified by the median value of TMPRSS4 protein expression. Differences between good and bad prognosis were clearly appreciated in stage I-II patients: p = 0.006 for RFS (C) and p < 0.001 for OS (D). RFS (E) and OS (F) for stages I-II from the MDA cohort stratified by the median; Figure S2; Figure S3: Inverse correlation between TMPRSS4 methylation status and SHOX2 methylation status in BAL (A,B) and plasma (C,D); Table S1: Cohort of patients from Hospital General Universitario de Valencia (HGUV) used to study the methylation status of TMPRSS4 and SHOX2 in tumors and matched non-malignant lung samples by ddPCR; Table S2: Cohort of patients used to quantify methylation status of the TMPRSS4 and SHOX2 promoters in bronchoalveolar lavage (BAL) samples from NSCLC patients; Table S3: Cohort of patients used to quantify methylation status of the TMPRSS4 and SHOX2 promoters in plasma from NSCLC patients; Table S4: Sequence of primers and probes for ddPCR analysis; Table S5: Relationship between TMPRSS4 protein levels and clinicopathological characteristics of the patients from the MDA and CIMA-CUN cohorts; Table S6: Relationship between TMPRSS4 protein levels and clinicopathological characteristics of the patients from the MDA cohort using the top quartile (Q4) as the cut-off value; Table S7: Univariable Cox proportional hazards analysis of TMPRSS4 protein expression and RFS or OS in the MDA cohort; Table S8: Clinicopathological characteristics of stage IA/IB NSCLC patients from the MDA cohort that were used to study the prognostic value of TMPRSS4.
F.E. was funded by "Asociación de Amigos de la Universidad de Navarra" in association with "La Caixa" Banking Foundation.