Thyroid cancer incidence has increased in recent years. The estimated morbidity is 15.8 persons per 100,000 population (8.0 per 100,000 men and 23.3 per 100,000 women), which represents 3.0% of all new cancer cases diagnosed in the United States [1
]. Better availability of sonography leads to the detection of small lesions, which are mostly clinically asymptomatic [2
]. The most common type—papillary thyroid carcinoma (PTC)—constitutes 65%−93% of all thyroid cancer cases, depending on the analyzed population [3
]. Five-year overall survival for thyroid cancer is 98.2% [1
]. Despite the increased number of new cases, the number of deaths remains stable, and it is 0.5 per 100,000 population.
Differentiated thyroid cancer (DTC) is characterized by an excellent prognosis, with 10-year survival rates exceeding 90%. Nevertheless, 3–15% of DTC patients have distant metastases (M1) at presentation [4
], whereas recurrent disease is diagnosed within decades in up to 30% of patients [6
]. Two-third of relapses (66%) occur within the first ten years. Among them, 79% of cases have a locoregional disease, whereas 21% of cases manifest as distant metastases, mostly in the lungs (53–63%) and in the bones (19–20%). Numerous risk score systems have been developed to predict the course of the disease. These classifications are based on different clinical and histopathological data, including age at diagnosis, tumor diameter, cancer type and subtype, invasion outside thyroid capsule, lymph node, and distant metastases. The most common are TNM (tumor, nodes, metastasis) system, MACIS (Metastases, Age, Completeness of resection, Invasion [local], Size) score, and ATA (American Thyroid Association) Initial Risk Stratification System [7
]. Surprisingly, none of these systems consider molecular features. The prognostic significance of BRAFV600E
promoter mutations in PTC has been widely discussed recently [9
]. The BRAFV600E
mutation is the most common molecular alteration in PTC, being present in 36%-83% of PTC cases. Its impact on the PTC course is not so unequivocal. Numerous papers have pointed to a significant association between the BRAFV600E
mutation and other poor prognostic factors, including older age, male gender, tumor size, extrathyroidal extension, lymph node or distant metastases, higher PTC stage, and the risk of recurrence or PTC-related death [13
]. However, other studies did not confirm these findings [9
promoter mutations, reported for the first time in thyroid carcinoma in 2013 [20
], are rare and occur in 7.5% of PTCs [21
]. The coexistence of TERT
promoter and BRAF
mutation, frequently observed in PTC, is considered as being related to poorer prognosis and a more aggressive PTC course [22
]. Noteworthily, some published data indicate that a negative impact on PTC prognosis is a consequence of TERT
mutations solely [26
] than their combination with the BRAF
mutation. In contrast, other sources demonstrate that the effect of TERT
mutations decreased or disappeared when these two mutations occurred separately [22
]. Following these data, the assessment of the BRAFV600E
mutation or TERT
promoter mutation is currently being implemented into clinics, although it is still not a part of daily clinical practice. However, one should remember that the confirmation of the presence of BRAF
mutation does not allow one to select patients with a high risk of distant metastases accurately and so far, does not influence clinical management either.
In contrast to our knowledge regarding the impact of mutations, as discussed above, the data linking gene expression profile and the risk of PTC spread are scarce and are mostly directed at the risk of nodal rather than distant metastases [28
]. As nodal metastases do not preclude excellent prognosis, these attempts did not translate into clinically useful predictor. Previous studies to find at least one predictive gene expression marker of distant metastases in PTC failed. We believe we are justified to draw such a conclusion, as a small number or even no papers continuing these analyses were published.
Another important issue, showing huge progress in recent years, is the analysis of molecular pathways, bringing new light to the molecular pathogenesis of metastasis [32
]. Such analyses were also carried out in thyroid carcinoma [33
]. It is believed that understanding the interplay and deregulation in molecular pathways may be crucial for the development of new therapeutic strategies and finally lead to the improvement of long-term prognosis.
In this study, we verify a hypothesis that there exist gene expression markers in primary thyroid tumors which cause a predisposition to the occurrence of distant metastases (synchronously or metachronically), which could potentially serve as a valuable prognostic/predictive marker. Thus, in the analysis, we attempt to find differentiating genes to predict M1 in PTC.
In our study, we undertook a step-by-step analysis to find differences between M0 and M1 papillary thyroid carcinomas. As several platforms were used, with significant heterogeneity within the data obtained, at this stage, we did not decide to perform formal meta-analysis (technical aspects discussed later on). We did so also to deliberately reveal the process of the collection of data and step approach and avoid potential bias in data analysis related to our several previous attempts.
Multigene expression signatures were characterized for different neoplasms, aiming to derive clinically meaningful classifiers. However, this has been successfully achieved only in selected malignancies. Breast cancer is the most prominent example: multigene signatures show a prognostic significance. They are partially used as a predictive marker to select the patients who benefit from adjuvant chemotherapy [35
]. The prognostic aspect of breast cancer survival analysis was, in the majority of datasets, based on the prediction of distant metastasis—since the very beginning, this approach has been highly successful [36
]. Based on these findings, we believed a similar multigene expression signature, discriminating patients at high risk of distant metastases from low-risk individuals, could be derived from thyroid cancer primary tumors. Thereby, we decided to collect prospectively the postoperative material of patients operated at our hospital. As our institute is a tertiary reference center for thyroid carcinoma, the number of patients with metastatic disease is relatively high. In the current study, we analyzed more than a hundred patients, including 35% of patients with metastatic disease. Historically, the first group of 15 PTC patients who developed distant metastases was compared to 56 M0 patients (dataset A). The identified differentially expressed genes were validated on a group of 21 M1 and 31 M0 patients (dataset B), collected later. Importantly, within the group of metastatic PTCs, there were both patients with distant metastases at initial staging and individuals who developed late metastases in the course of the follow-up.
Our first analyses did not point to any essential differences between non-metastatic and metastatic PTCs. Initially, we considered a too short time of follow-up in a group of non-metastatic patients a major obstacle for successful analysis (considering missing cases with metastases developing further in the course of the disease). Having a relatively low number of specimens, we decided to wait for a longer follow-up time and collect further samples. Our analysis involved nine additional patients in whom metastases were diagnosed later than one year after primary treatment (up to 60 months after). Finally, in our opinion, the follow-up was long enough to select M0 patients correctly. However, almost doubling the population and lengthening of the follow-up did not result in a spectacular increase in the magnitude of observed differences.
Due to the long time necessary, on the one hand, to collect a sufficient number of metastatic PTC patients, and, on the other hand, to correctly classify M0 cases, we were forced to apply different generations of microarrays during the study. Data were analyzed using linear models for microarray data (limma), a method developed and well-tailored to multiple comparisons scenarios in a genomic setting, with the batch of microarrays included as a variable in a model. However, we observed a kind of double failure—first, only 11 genes were deemed significant (including IGFBP3
a relatively low number to underline a biologically sound difference; second, one of the genes was earlier validated by an independent method on an independent dataset and was not significant (qPCR on an independent validation set C of PTC samples). We decided to abandon a plan of validating other genes by qPCR and carried out microarray profiling, which led to the confirmation of only two mentioned transcripts. At the moment, we consider these results as negative and not confirming our hypothesis of differences between M1 and M0 PTCs. We want to emphasize that our qPCR group (set C) reflected a more real frequency of distant metastases in PTC (it involved nine M1 patients among 63 cases analyzed (14%) whereas, in initial discovery datasets, we accumulated a larger number of M1 patients. Thus, a lack of differences in the IGFBP3
gene by qPCR could be impacted by the sample size but provides clear information that the magnitude of change in this single marker gene would not be sufficient in routine clinical practice. Nevertheless, in our opinion, this gene is potentially important and deserves further evaluation, as the role of the IGF system in thyroid cancer has been discussed for a long time [39
]. Regarding the ECM1
gene, it was reported not only in thyroid cancer but also in other solid tumors. Kebebew et al. analyzed its diagnostic value in thyroid carcinoma, pointed out also for potential association with disease extent [41
Regarding the data presented above, our initial hypothesis did not seem to be justifiable. We were able to find a subtle difference between non-metastatic and metastatic PTC only, and we were unable to confirm them by independent methods. We did not find a characteristic gene expression profile, typical for metastatic PTC, although our group analyzed by microarray the largest number of metastatic PTCs (36 cases), as already published. The major potential reason for this is the low frequency of PTC metastatic spread in the general population. It seems evident we should discuss essential data provided by The Cancer Genome Atlas (TCGA) study. This study included 496 PTC samples, among them eight metastatic cases [42
]. Thus, the number of analyzed metastatic PTCs was very low (1.6%), providing insight into a relatively indolent tumor population. We believe our data may constitute an addendum to this analysis. We are able to carry out a meta-analysis of our three genomic datasets and TCGA data [12
]. We are intensively seeking a validation dataset for such an analysis; we are very open for a collaborative approach (contact: [email protected]
). However, planning a study would require at least a doubling of the number of metastatic patients to derive adequate power and validation ability. In the larger dataset, a more sophisticated bioinformatic approach, including machine learning, is necessary to provide adequate multigene discrimination. Our previous experience with thyroid cancer data [43
] showed the feasibility of this approach when applied to differences with normal thyroid tissue or other cancer histotypes. We endorse further approaches to characterize poor prognosis in thyroid cancer, as it is of utmost importance in a clinical setting. Still, we warn against commencing the study without at least hundreds of samples from metastatic patients.
A limitation that could influence our data is also the transcriptomic platform we used. Since the early commencement of our study, oligonucleotide microarrays and the algorithm used by us did not detect, for example, long non-coding RNA (lncRNA). In addition, next-generation RNA sequencing (RNA-Seq) might introduce newer transcripts. Noteworthily, TCGA-based analyses paid our attention to lncRNA, playing an essential role in PTC [45
] and other malignancies [50
]. Numerous papers raised the role of lncRNA in the aggressiveness/invasiveness of thyroid carcinoma [45
]. More data regarding this issue are necessary. We did not apply this approach as some of the metastatic PTCs were fully used up and were not left for any additional transcriptomic experiments; they exist only as a U133 microarray readout. Nevertheless, one should emphasize that the number of published studies based on RNA-seq, carried out in thyroid cancer, is small. The first one, reported in 2013, included only 20 PTCs, among them no one with distant metastases [56
]. Another one was the TCGA study, as described above [42
]. Thus, to our best knowledge, our study includes the largest number of metastatic PTCs studied by the genomic approach. We believe we have the right to claim that the difference in the gene expression profile of PTC primary tumors between metastatic and non-metastatic patients, if any exists, is small and requires a systematic and multi-center approach. Although the number of metastatic PTC samples in our study was relatively high (45) compared to the published data, we believe that dataset size could limit the power of conclusions.
We demonstrated that the signature of high-risk metastatic PTC was not as obvious as we initially believed. It is necessary to inform other groups involved in the research of metastatic thyroid cancer that a putative further study seeking a difference between metastatic and non-metastatic PTC shall involve a larger population and a broad portfolio of molecular methods rather than the raw transcriptomic assessment. Our study was carried out in three subsequent steps. In each step, we extended the population size, and each lengthened the patients’ follow-up. However, it did not lead to conclusive findings. The issue of the publication of negative results is widely raised to limit unnecessary repetitive small size experiments and to promote cooperation between researchers and the meta-analytic approach. We are extensively searching for a partner to carry out a more extensive analysis of PTC transcriptome to provide such conclusions in the future.
The critical issue regarding the occurrence of metastases is related to the host response. It is well known that PTC metastasizes mainly to niches localized in lymph nodes, lungs, and bones. The data characterizing the features of metastatic niches (receptive to colonization by circulating tumor cells (CTC)) are growing [57
]. It is still not known whether this readiness of host cells regarding metastatic colonization is related to any molecular mechanisms. Moreover, one should notice that studies on metastatic niches concern the most common cancers [58
], and they did not result in clinically relevant classifiers. So far, we do not have any data regarding thyroid carcinoma. Regarding the host response, we should also consider an anti-tumor immune response with the presence of immune cells in the specimen. Unfortunately, we do not have complete data regarding tumor immune infiltration or the presence of autoimmune thyroiditis in our material. However, we would like to stress that the requirement of a high percentage of PTC cells confined the number of infiltrated stroma in the analyzed material.
The important question is whether PTCs metastasize due to any mechanism clearly distinguishable in primary tumor gross specimen. We assumed that the invasiveness of the primary tumor regarding the development of distant metastases influences the PTC course, while current data also indicate other mechanisms, including the ability of cancer cells to survive, extravasation, or colonization by CTC [59
]. The CTCs may be a result of a very small primary tumor subclone, indistinguishable in the gross transcriptomic analysis. Although microdissection studies are feasible and were also performed by our group [61
], no study has been performed using this approach in the clinical context. A high-quality RNASeq experiment may indicate the subclonal tumor structure. We are currently collecting tumors for such a study. In previous attempts, we succeeded in deriving novel markers for follicular thyroid tumors, in samples analyzed previously by microarray and qPCR [62
]. It also cannot be excluded that metastatic spread in certain tumors is a stochastic event, or the changes are late and discriminable only when distant metastasis tissue is accessible [64
]. In any of these scenarios, we cannot abandon the approaches to search for molecular predictors, as the clinically known PTC features are not sufficient to predict patient outcome in full [7
]. It is also important when de-escalation of surgical treatment is contemplated or when adjuvant therapy has to be administered.
As the transcript-oriented analysis did not bring reliable results, we also approached the same problem at a gene set level. The hypothesis supporting this analysis is that coordinated changes in gene expression in certain functional gene groups could be detected with a lower number of false negatives. We applied a well-established algorithm of the gene set enrichment analysis. This analysis was carried out in Sample set A. The results were validated in Sample set B. We found at least six highly over-expressed gene sets significant in both datasets, with the expression increased in non-metastatic tumors. However, the vast majority of these genes were clearly associated with immune response. Although one can speculate that some changes come inherently from PTC cells, a more probable explanation is that they are related to lymphocyte infiltration of the primary tumor. The presence of infiltrating lymphocytes in tumor or stroma was recognized earlier in PTC and could be associated with a good prognosis [65
]. However, according to many authors, these conclusions seemed controversial [66
]. In our transcriptome-wide analysis of the PTC gene expression, we have already found a significant proportion of immune-related genes with high variability [43
]. Nevertheless, it seems evident that bulk tumor genomic analyses are not an optimal method to evaluate the role of the immune response in the tumor microenvironment. As described above, we carried out a study of microdissected PTC cells and stroma. Thus, in the future, we plan to validate our results also in an independent dataset. The issue of the prognostic relevance of lymphocytic infiltration in PTC requires further studies, particularly in the era of immunotherapy in oncology [68
]. We previously found in the analysis of ovarian cancer [69
] that genomic studies shall be carried out in either histologically homogenous populations or consider multiparametric covariate analysis covering variability in tumor histology. Obviously, potential immune infiltration in the primary tumor is not limited to lymphocytes. One should stress the role of tumor-associated macrophages, one of the factors potentially related to the high expression of metalloproteases [70
To conclude, the differences in gene expression profile between metastatic and non-metastatic PTC primary tumors, if they exist, are subtle and require studies involving hundreds of tumors. Potential further studies shall take into consideration confounding factors, including the immune infiltration of primary tumors.
4. Materials and Methods
The whole group comprised of 186 PTC patients, among them 141 M0 (without distant metastases) and 45 M1 (with distant metastases) cases (Table 4
). All patients underwent total thyroidectomy, and all but two patients underwent radioiodine therapy. One hundred and twenty-three patients were subjected to microarray analyses using different platforms: 71 patients were subjected to HG-U133A and HG-U133 Plus 2.0 (set A), whereas 52 patients were subjected to Human Gene 1.0 ST Array (set B). Set A involved 14 out of the 16 PTC tumors analyzed in our previous paper [43
]. As analyzed previously [71
], the percentage of cancer cells in the analyzed specimens was higher than 50%. Sixty-three patients constituted an independent validation set C, analyzed using qPCR. The details are presented in Table 4
Microarray analysis involved a group of 123 PTC patients, among them 36 cases with distant metastases. Detailed characteristics of the study group is given in Table 5
Due to essential technical differences regarding the microarray platforms used, the analysis was divided into two parts. Set A included data obtained by the analyzes carried out on the following types of microarrays—HG-U133A and HG-U133 Plus 2.0—whereas validation set B involved data obtained using Human Gene 1.0 ST microarrays.
The dataset A was comprised of 71 PTC patients (Table 6
) at the median age of 33 years (range: 5–76 years). The median follow-up was 160 months (range: 3.6–250.8). The majority of patients were women—55 cases (77.5%). Nearly 2/3 of patients (64.8%) were diagnosed with a classic variant of PTC. T1 and T2 features were present in 57.7% and 14.1% of patients, respectively. Lymph node metastases in the central neck compartment were observed in 24 cases (43.6%), whereas metastases in the lateral neck compartment or the upper mediastinum were observed in 29 cases (40.8%). Distant metastases developed during the course of the disease in 15 patients (21.1%) (Table 7
). The diagnosis of metastases was usually stated nearly five months following the primary PTC diagnosis, range from 0 (distant metastases present at diagnosis) to 60 months, based on a post-therapeutic whole-body scan, X-ray and/or CT scan, biopsy, or histopathological examination. Almost all M1 patients had lung metastases (14 out of 15 cases), 3/15 had bone metastases (vertebra or other bones), and one patient (1/15) had liver metastases. Ten out of fifteen metastases demonstrated radioiodine avidity. Complete remission (excellent treatment response according to the ATA criteria) was achieved in 1/3 patients, whereas in the remaining cases, ATA incomplete structural or indeterminate responses were confirmed. Six patients died due to thyroid cancer.
The validation dataset B included 52 PTC cases (Table 6
), mainly women (61.5%), with a median age at PTC diagnosis of 47.5 years (range 17–86), with the median follow-up of 103.8 months (range 0–156.0). Classic PTC was a predominant variant. T1 and T2 features were diagnosed in 46.2% and 28.8% of cases, respectively. Lymph node metastases in the central neck compartment were present in 29 cases (56.9%), whereas they were present in the lateral neck compartment or the upper mediastinum in 26 cases (50.0%). Distant metastases occurred in 40% of cases (21 patients), among them lung metastases (20 patients), bone, central nervous system, thymus (2 persons in each localization), and liver (1 patient). More than half of the metastases (57.1%) showed radioiodine uptake. The excellent treatment response (according to ATA guidelines [7
]) was obtained in 42.9% of cases, whereas in the remaining cases, the incomplete structural response was the final treatment outcome. Seven patients died, six persons due to metastatic disease, and one person because of advanced locoregional PTC.
The validation of the IGFBP3
gene was performed on an independent PTC set, including 63 cases, among them 9 with metastatic disease (Table S2
The use of human tissue was approved by the Bioethics Committee at Maria Sklodowska-Curie National Research Institute of Oncology Gliwice Branch. Written informed consent to analyze the tissue was obtained from all patients. All clinical data were anonymized and de-identified before the analysis.
4.2. Microarray Analysis
Data from three types of oligonucleotide array were selected for this study: HG-U133A, HG-U133 Plus 2.0, and Human Gene 1.0 ST Array (Affymetrix, Santa Clara, CA, United States). RNA was isolated from fresh frozen tumor fragments with the RNeasy Mini Kit (Qiagen, GmbH, Hilden, Germany) as recommended by the manufacturer. Microarray analysis was performed according to the manufacturer recommendations, different for each array type. However, the main steps of the laboratory protocol were common. In brief, RNA was the template for double-stranded cDNA synthesis, followed by transcription combined with cRNA biotinylation, cRNA fragmentation, and cRNA hybridization to the arrays. After washing (Fluidic Station 450, Affymetrix) and staining with streptavidin-phycoerythrin, conjugate arrays were scanned in the GeneChip 3000G scanner (Affymetrix).
The main difference between HG-U133 microarrays and Human Gene 1.0 arrays concerns the number of target-oriented probes. The HG-U133 type arrays contain 11 probes per transcript, which are located mainly around the 3′ end. Meanwhile, in the Human Gene 1.0 arrays, there are more probes designed to be distributed across the transcribed regions of each gene. Moreover, the HG-U133 arrays contain a perfect match probe (PM) and a mismatch probe (MM), used together to measure the abundance of mRNA transcripts. The MM probes have a 13th base that does not match the target sequence, allowing for detection of cross-hybridization. The Human Gene 1.0 arrays, in turn, represent a perfect match-only array design. Mentioned differences force the use of different algorithms in data analysis.
4.3. qPCR Analysis
RNA for qPCR analysis was isolated from fragments of fresh frozen PTC tumors using the RNeasy Mini Kit (Qiagen), as recommended by the manufacturer. qPCR was carried out for 28 genes (listed in Table S1
) with the 7900HT Fast Real-Time PCR System (Life Technologies, Carlsbad, CA, USA) and the use of Roche Universal Probe Library (Roche, Basel, Switzerland). Primer sequences are given in Table S3 (supplementary materials)
. Each sample was examined in duplicates. For the normalization of qPCR data, the Pfaffl method and GeNorm application were used. Three normalization genes were selected: EIF3S10
(eukaryotic translation initiation factor 3, subunit 10 theta), HADHA
(hydroxyacyl-CoA dehydrogenase trifunctional multienzyme complex subunit alpha), and UBE2D2
(ubiquitin conjugating enzyme E2 D2). The differences between M0 and M1 patients were tested with the two-tailed Mann–Whitney U test.
4.4. Statistical Analysis of Clinical Data
Categorical data were summarized with numbers and percentages. Continuous data were summarized with medians and ranges. Comparisons of categorical variables were performed using Fisher’s exact test. Comparisons of continuous variables were performed using the two-tailed Mann–Whitney U test. p
values < 0.05 were considered statistically significant. Statistical analyses were performed using the R software version 3.6.2 and “tableone” package version 0.11.1. [73
4.5. Microarray Data Analysis
We obtained three PTC microarray datasets: 44 HG-U133A microarrays, 27 HG-U133 Plus 2.0 microarrays, and 52 Human Gene 1.0 ST microarrays (123 samples in total).
Background correction, normalization and probe set summarization were done using the Robust Multichip Average (RMA) algorithm with library oligo v 1.50.0 from R v3.6.2 environment, and custom CDF files from BrainArray (ENTREZG; v24), for each of three microarray datasets separately [75
]. Twelve thousand and one genes, common for all three types of microarray, were selected and used in further analysis.
To create microarray dataset A, we combined HG-133A and HG-U133 Plus 2.0 microarray datasets and removed a batch effect associated with microarray type using linear models for microarray data (limma) from limma 3.42.2 library [78
]. We selected differentially expressed genes, using limma, with a batch factor incorporated in the linear model. p
-values were adjusted for multiple testing by the Benjamini and Hochberg false discovery rate (FDR) method [79
]. Corrected p
-values < 0.05 were considered statistically significant.
The validation microarray dataset B was comprised of Human Gene 1.0 ST microarrays. We performed differential gene analysis using limma. p-values were adjusted using Bonferroni correction, on a subset of 11 preselected genes. Corrected p-values < 0.05 were considered statistically significant. Using the validation microarray dataset, we also estimated the probability that from 11 randomly selected genes, a minimum of two would be positively validated, with absolute SLR larger than or equal to 0.83. We performed 10,000 iterations. In each iteration, we randomly selected 11 genes (out of 12,001 genes analyzed on validation microarray dataset B), obtained p-values from differential gene analysis, and applied Bonferroni correction. Furthermore, in each iteration, we checked whether a minimum of two genes were significant (according to criterion: Adjusted p-value < 0.05) and whether a minimum of two of them showed an SLR above 0.83. We calculated the probability as the number of iterations that fulfilled the criteria divided by the number of all iterations.
4.6. Gene Set Enrichment Analysis
Gene set enrichment analysis (GSEA) was performed using Gene Ontology (GO) Biological Process (BP), GO Molecular Function (MF), GO Cellular Compartment (CC), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and two collections from Molecular Signatures Database v7.1 (MSigDB) [80
]: collection “hallmark gene sets” (H) and sub-collection “chemical and genetic perturbations” (CGP) being a part of the collection “curated gene sets” (C2) [82
The analysis was performed in the R environment, using clusterProfiler 3.14.3 library [85
]. The t-statistic obtained in limma analysis was used as a gene ranking metric. The p
-values were calculated based on 10,000 random gene set permutations. Gene sets that involved between 10 and 600 genes were analyzed. p
-values were adjusted for multiple testing by the Benjamini and Hochberg false discovery rate (FDR) method. Gene sets with corrected p
-values < 0.05 were considered statistically significant. Redundancy amongst GO terms was removed using the ClusterProfiler simplify function with the similarity measure cut-off of 7.0, and the p
-value used as the deciding variable. To test whether there is a significant overlap between significant gene sets obtained in datasets A and B, the Fisher exact test was used.