Genomic and Transcriptomic Characteristics According to Size of Papillary Thyroid Microcarcinoma

It is controversial as to whether papillary thyroid microcarcinoma (PTMC) has some genomic and transcriptomic characteristics that differentiate between an early-stage lesion that would eventually evolve into the larger papillary thyroid cancer (PTC), and an occult indolent cancer in itself. To investigate this, we comprehensively elucidated the genomic and transcriptomic landscapes of PTMCs of different sizes, using a large-scaled database. This study included 3435 PTCs, 1985 of which were PTMCs. We performed targeted next-generation sequencing for 221 PTCs and integrated these data with the data including The Cancer Genome Atlas (TCGA) project. The frequency of v-raf murine sarcoma viral oncogene homolog B (BRAF)V600E mutation was higher in PTMCs >0.5 cm than that in very small PTMCs (≤0.5 cm) and decreased again in PTCs >2 cm. Among PTMCs, the prevalence of mutations in rat sarcoma (RAS) and telomerase reverse transcriptase (TERT) promoter was not significantly different according to their size, but lower than in large PTCs. There was no change in the tumor mutational burden, the number of driver mutations, and transcriptomic profiles with tumor size, among PTMCs and all PTCs. Although a few genes with differential expression and TERT promoter mutations were found in a few PTMCs, our findings showed that there were no useful genomic or transcriptomic characteristics for the prediction of the future progression of PTMC.


Introduction
Papillary thyroid microcarcinoma (PTMC) is defined as a papillary thyroid cancer (PTC) measuring 1 cm or less in maximal diameter. In recent years, there has been a trend to prevent PTMC overdiagnosis and reduce the indication or extent of thyroidectomy for PTMCs, as most PTMCs show favorable prognosis [1,2]. Therefore, the 2015 American Thyroid Association guidelines have suggested active surveillance as an alternative option for PTMC without high-risk features [3]. However, some PTMCs exhibit aggressive behavior such as lateral neck lymph node and distant metastases, recurrence, and even death [4,5]. Two types of PTMCs undergo different processes, resulting in either an early-stage lesion that eventually evolves into PTC, or the formation of an occult indolent cancer in itself, and it is unclear whether they can be distinguished by their sizes alone [6]. Most of the evaluations have been based on the differences in the histological findings and the mutations between PTMC and PTC >1 cm. Unlike with histological findings, such as lymph node metastasis and extra-thyroidal extension, in most studies, no differences were observed in the proportion of major driver mutations between PTMCs and large PTCs [7,8]. However, telomerase reverse transcriptase (TERT) promoter mutation, known to be a strong predictor of poor prognosis in thyroid tumors, was more frequently found in large PTCs than in PTMCs [9,10], and this low frequency of TERT promoter mutation in PTMCs makes it difficult to compare the frequencies among PTMCs. The genomic profile was also not different among PTMCs with different statuses of lateral lymph node metastasis, whereas transcriptomic differences were found, showing 43 differentially expressed genes (DEGs) [11]. This finding suggests that molecular characteristics, especially transcriptomic changes, would be different according to the histologic characteristics associated with tumor aggressiveness. Meanwhile, the prevalence of lymph node metastasis and extra-thyroidal extension has been known to increase with an increase in PTMC tumor size [7,12], suggesting that it is important to understand the differences in the molecular characteristics of the tumor according to tumor size; however, to our knowledge, no study has investigated this yet. Therefore, in this study, we aimed to investigate the comprehensive genomic and transcriptomic landscapes, according to tumor size, even within PTMCs.

Genomic Characteristics of PTMC Compared to PTC over 1 cm
We performed targeted next-generation sequencing for 221 PTCs (93 PTMCs and 128 PTCs >1 cm) and integrated these data with the data of non-overlapping PTC cases from three previous studies of Seoul National University Hospital (SNUH) (1853 PTMCs and 865 PTCs >1 cm) [9,13,14] and that from The Cancer Genome Atlas (TCGA) study (39 PTMCs and 457 PTCs >1 cm) [15]. In total, 3435 PTC patients (1985 PTMCs and 1450 PTCs >1 cm) were included. Clinicopathological and genomic characteristics are summarized in Table 1. PTMC had more indolent clinicopathological features than PTC >1 cm, such as a lower proportion of male patients, extrathyroidal extension, and lymph node metastasis. Moreover, in PTMC, the classical subtype was more frequent and the follicular-variant subtype was relatively rare.
V-raf murine sarcoma viral oncogene homolog B (BRAF) V600E , neuroblastoma-/Kirsten-/Harvey-rat sarcoma viral oncogene homolog (N/H/K-RAS), and TERT promoter mutations were recognized as the most frequent mutations of PTC. Compared to PTC >1 cm, PTMC had a higher prevalence of BRAF V600E mutation (p < 0.001), and lower prevalence of RAS and TERT promoter mutations (p = 0.033 and <0.001, respectively). Genomic characteristics other than the major mutations, including information of other mutations, tumor mutational burden (TMB), and number of driver mutations, were available only from the SNUH targeted sequencing and TCGA datasets (n = 221 and 496, respectively). The frequency of mutations other than BRAF V600E , RAS, and TERT promoter mutations was not different between PTMC and PTC >1 cm in both datasets. The following mutations in oncogenes or tumor suppressor genes were found in only one or two cases in the SNUH targeted sequencing dataset: KIT proto-oncogene receptor tyrosine kinase (KIT) R965W , phosphoinositide 3-kinase subunit p110-beta (PIK3CB) R562Q , epidermal growth factor receptor (EGFR) L813R , FMS-like tyrosine kinase 4 (FLT4) R1060W , ataxia telangiectasia mutated (ATM) T2921M , and neurofibromatosis type 1 (NF1) W1997* in PTMC; KIT R965W , adenylate kinase 1 (AKT1) E17K , Janus kinase (JAK) 2L808W , ATM S2408L , cyclin-dependent kinase inhibitor 2A (CDKN2A) R61C , large tumor suppressor homolog 1 (LATS1) R657H , LATS1 S771* , and breast cancer gene (BRCA) D932fs in PTC >1 cm. In TCGA dataset, rearranged during transfection (RET) fusions were the most frequent alterations (5.1% of PTMC and 6.8% of PTC >1 cm). The prevalence of cases in which no driver mutations were found was also similar between PTMCs and large PTCs. Moreover, TMB and number of driver mutations of PTMCs did not differ from those of PTCs >1 cm.

Genomic Characteristics of PTC According to Tumor Size
Next, we compared the clinicopathologic and genomic characteristics of PTCs according to tumor size after dividing by millimeter (Table 2; Figure 1). Even within PTMC, the large-sized PTMCs had high-risk clinicopathological features such as extrathyroidal extension and lymph node metastasis, despite having a high frequency of the follicular variant subtype. Among PTCs >1 cm, follicular variant subtype and lymph node metastasis were frequently observed in large tumors, whereas the frequency of extra-thyroidal extension was decreased in tumors >2 cm ( Table 2).  Because follicular variant PTC had a lower BRAF V600E mutation rate and a higher RAS mutation rate than classical PTC (31.6% vs. 73.6% for BRAF V600E and 30.0% vs. 4.1% for RAS), we classified PTCs by the histologic subtype (classical and follicular variant PTCs), and analyzed the prevalence of major mutations according to the tumor size in each subtype. In classical PTMC, the prevalence of BRAF V600E mutation was significantly lower in PTMC ≤0.5 cm than that in large PTMC (p < 0.001; Figure 1A). Because follicular variant PTC had a lower BRAF V600E mutation rate and a higher RAS mutation rate than classical PTC (31.6% vs. 73.6% for BRAF V600E and 30.0% vs. 4.1% for RAS), we classified PTCs by the histologic subtype (classical and follicular variant PTCs), and analyzed the prevalence of major mutations according to the tumor size in each subtype. In classical PTMC, the prevalence of BRAF V600E mutation was significantly lower in PTMC ≤0.5 cm than that in large PTMC (p < 0.001; Figure 1A).
The high prevalence of BRAF V600E mutation was constant in 0.6-2 cm PTC and then decreased in PTC >2 cm (p for trend = 0.001). The frequency of RAS mutations was constant regardless of tumor size. In follicular-variant PTMC, the frequency of BRAF V600E mutation in PTMC ≤0.5 cm was lower than that in larger PTMC, despite the lack of statistical significance, and it gradually decreased, especially in tumors >1 cm (p for trend <0.001; Figure 1B). The prevalence of RAS mutations was not significantly different according to tumor size in follicular variant PTC, as was the case in classical PTC. Among 66 of 978 tumors harboring TERT promoter mutations, 61 (92.4%) were PTC >1 cm and only 5 (7.6%) were PTMC (Table 1; Figure 1C,D). Interestingly, two tumors ≤0.5 cm also harbored TERT promoter mutations, resulting in the similar prevalence among PTMCs (3.4% in ≤0.5 cm, 1.2% in 0.6-0.7 cm, 1.6% in 0.8-1.0 cm; Table 1). However, the frequency of TERT promoter mutations showed a gradual increase with size in tumors >1 cm (p for trend <0.001; Table 2). The coexistence of TERT promoter and BRAF V600E mutation was found mostly in PTCs >1 cm, except one case with the size of ≤0.5 cm. The frequency of the coexistence was increased with size among large-sized tumors of classical PTC (p for trend <0.001; Figure 1C). The coexistence of TERT promoter and RAS mutations showed similar findings among large-sized tumors of follicular variant PTC ( Figure 1D), and there was no PTMCs in these cases.
In both the SNUH targeted sequencing and TCGA datasets, TMB did not change with tumor size, either among PTMCs or all PTCs ( Figure 1E). The number of driver mutations in both datasets also did not change with the tumor size ( Figure 1F). Because the alterations in mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK) and phosphatidylinositol 3-kinase (PI3K)/protein kinase B (AKT) signaling pathways are the most frequently found and play a major role of tumorigenesis and progression in thyroid cancer [15], we analyzed the proportion of alterations in genes involved in these pathways compared to total non-silent mutations ( Figure 1G,H). The average of proportions of each pathway was 32.9% and 11.4% in the SNUH targeted sequencing dataset, and 9.9% and 3.9% in TCGA dataset. When we compared this according to tumor size of PTC, there were no significant differences in both pathways.

Transcriptomic Characteristics of PTMC
Using the RNA sequencing database of SNUH (43 PTMCs and 81 PTCs >1 cm) and TCGA (39 PTMCs and 457 PTCs >1 cm) cohorts [14,15], we identified the transcriptomic characteristics of PTMC. Overall, principal component analysis (PCA) demonstrated that the gene expression profile was not distinguished between PTMC (various blue-colored closed circles) and PTC >1 cm (grey-colored open circles). Moreover, no distinct characteristics were found within the PTMCs according to tumor size (Figure 2A,B). Because the PTC transcriptome has been reported to be distinguishable on the basis of the mutational status of BRAF V600E and RAS [10,14,15], we stratified PTCs according to these mutations. In the SNUH RNA-sequencing dataset, when we analyzed all PTCs or RAS mutant PTCs, there were several DEGs between PTMC and PTC >1 cm. However, these genes did not have common molecular pathways, and were difficult to consider as significant genes ( Figure 2C; Table 3). Moreover, the transcriptomes of BRAF mutant PTCs were nearly identical between PTMC and PTC >1 cm. In TCGA dataset, the gene expression profile of PTMC was comparable with that of PTC >1 cm when we analyzed all PTCs as well as the BRAF-mutant or RAS-mutant PTCs ( Figure 2D; Table 3). Next, we classified PTMCs on the basis of their size (0.6, 0.7, and 0.8 cm) with the appropriate number of tumors in each group for analysis. We could not analyze DEGs in RAS mutant PTMCs due to their low prevalence (four and three PTMCs in the SNUH and TCGA datasets, respectively). Similar to the comparison between PTMC and PTC >1 cm, there were few DEGs in PTMCs of each group, which also did not have common functional pathways, in all PTMCs ( Figure 3A,B; Table 3) and BRAF mutant PTMCs ( Figure S1; Table 3) of both datasets.
TERT mRNA has been known as a promising prognostic marker in thyroid cancer. Nonetheless, the expression level of this gene was very low in overall PTC (baseMean, 0.66 and 2.46 in SNUH RNA-sequencing and TCGA datasets, respectively). TERT was not included in DEGs and its Next, we classified PTMCs on the basis of their size (0.6, 0.7, and 0.8 cm) with the appropriate number of tumors in each group for analysis. We could not analyze DEGs in RAS mutant PTMCs due to their low prevalence (four and three PTMCs in the SNUH and TCGA datasets, respectively). Similar to the comparison between PTMC and PTC >1 cm, there were few DEGs in PTMCs of each group, which also did not have common functional pathways, in all PTMCs ( Figure 3A,B; Table 3) and BRAF mutant PTMCs ( Figure S1; Table 3) of both datasets.
TERT mRNA has been known as a promising prognostic marker in thyroid cancer. Nonetheless, the expression level of this gene was very low in overall PTC (baseMean, 0.66 and 2.46 in SNUH RNA-sequencing and TCGA datasets, respectively). TERT was not included in DEGs and its expression did not show differences according to tumor size in both datasets ( Figure 3C). Furthermore, the proportion of cases with TERT mRNA expression also did not differ according to PTC size ( Figure 3C).
To investigate whether there were differences in thyroid differentiation or MAPK activation depending on tumor size, thyroid differentiation score (TDS) and ERK score were used ( Figure 3D,E). There were no changes and trends in TDS and ERK score according to tumor size, among PTMCs as well as in all PTCs, except for TDS of PTC >1 cm lower than that of PTMC in TCGA dataset alone (p = 0.04).
Cancers 2020, 12, x 9 of 16 expression did not show differences according to tumor size in both datasets ( Figure 3C). Furthermore, the proportion of cases with TERT mRNA expression also did not differ according to PTC size ( Figure 3C). To investigate whether there were differences in thyroid differentiation or MAPK activation depending on tumor size, thyroid differentiation score (TDS) and ERK score were used ( Figure 3D,E). There were no changes and trends in TDS and ERK score according to tumor size, among PTMCs as well as in all PTCs, except for TDS of PTC >1 cm lower than that of PTMC in TCGA dataset alone (p = 0.04).   ). Thyroid differentiation score (D) and ERK score (E) according to tumor size from both datasets (n = 124 and 496 for SNUH-RS and TCGA datasets, respectively). PTMC, papillary thyroid microcarcinoma.

Discussion
This study showed the genomic and transcriptomic characteristics of PTC according to tumor size, particularly focusing on PTMC using an integrated large dataset of 3435 PTCs, including 1985 PTMCs from the SNUH and TCGA cohorts. PTMC was more likely to be developed by BRAF V600E mutation, and less frequently by RAS or TERT promoter mutations, compared to PTC >1 cm. In particular, the frequency of BRAF V600E mutation was higher in PTMC >0.5 cm than that in very small PTMC (≤0.5 cm), and decreased again in larger PTCs. The difference in the prevalence of RAS mutations according to tumor size disappeared when the effect of histologic subtype of PTC was excluded. That is, the frequency of RAS mutations was higher in large PTCs because of the higher proportion of follicular variant PTC in them. TERT promoter mutations, especially in coexistence with BRAF V600E mutation, were frequently observed in larger classical PTCs, and coexistence with RAS mutations were only found in PTC >1 cm. There was no change in TMB and number of driver mutations with tumor size, either among PTMCs or all PTCs. Transcriptomic profiles were not distinguishable, not only between PTMC and PTC >1 cm but also among PTMCs, even when they were classified according to their size.
It is controversial as to whether PTMC could be classified into different forms: one, an early form of PTC, which will evolve into PTC on the same spectrum, and the other, an occult indolent form, which is a different subset that will not progress into PTC. To investigate this, we tried to compare the genomic and transcriptomic characteristics of PTMCs of different sizes in a large-scaled database. Overall, the genomic characteristics of the PTMCs were similar to those of 1.1-2 cm PTCs, with no differences in the prevalence of major mutations in each histologic subtype (classical and follicular variant PTCs), TMB, and the number of driver mutations, although there was a limitation in not being able to explore non-targeted variants. Regarding TMB, it has been reported to be higher in anaplastic thyroid cancer than in differentiated thyroid cancer [16], and has been associated with worse prognosis in follicular thyroid cancer [17], but there was no difference according to the various size of PTC in this study. Moreover, transcriptomic characteristics based on mRNA sequencing data also showed no significant differences among PTMCs, and even between PTMCs and larger PTCs. These results were consistent with our previous reports that PTMC behaves like a larger PTC; the prevalence of BRAF V600E mutation and immunohistochemical staining results were not different from those obtained for PTCs [7], with no differences being observed in the gene expression profiles of PTMCs and PTCs >1 cm on the basis of microarray analysis [18]. These results suggest that the majority of the PTMCs could be representing an early state that can eventually progress to a large PTC.
In contrast, several previous studies [19,20] inferred from clinical outcomes that incidental PTMCs in particular, which were diagnosed in autopsies or after thyroidectomy secondary to benign disease, carry an excellent prognosis and do not ever evolve toward PTC. In our analysis among PTMCs, the prevalence of BRAF V600E mutation was lower in PTMC ≤0.5 cm than that in larger PTMCs in both classical and follicular variant PTMCs, though the difference was not statistically significant in the follicular variant PTMCs. This partly supported the possibility that among very small PTMC (≤0.5 cm), an occult or indolent cancer different from the early PTC could exist. Another form, noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP), a recently accepted entity, could be included as an indolent subset of PTMCs in this study. However, considering that NIFTP tends to be larger in size and had a higher frequency of RAS mutation at 64.7% in our study [21], the effects contaminated by NIFTPs in our PTMC subjects might be very small, although we could not completely exclude the influence. Nevertheless, to clarify these points, very small-sized tumors (≤0.5 cm) and subcentimiter non-invasive encapsulated, follicular-variant PTCs should be further investigated in future studies.
It is true that not all PTMCs may have the same fate. Recently, Perera et al. [11] presented that the genomic profiles of PTMC with or without lateral neck lymph node metastases were comparable, whereas they found 43 DEGs between them that did not overlap with DEGs identified in this study. PTMCs harboring TERT promoter (3.6%) and tumor protein p53 (TP53) (1.8%) mutations showed coexistent BRAF V600E mutations and were restricted to the samples with lateral neck lymph node metastases. One of the four TERT mutants and one of the two TP53 mutant PTMCs were 0.15 and 0.25 cm in size, respectively. In the current study, two small PTMCs (0.3 cm and 0.4 cm in size) harbored TERT promoter mutations, although they did not exhibit aggressive clinicopathological features such as lymph node or distant metastasis, which suggests that even PTMCs may have a potential for poor prognosis, regardless of tumor size.
Several studies have predicted molecular markers for PTMC progression [7,11,18]. In this study, although TERT promoter mutations in several PTMC cases and few DEGs among PTMCs were found, the frequency of TERT promoter mutations and the number and significance of DEGs were too low to be useful for clinical applications. Therefore, for PTMC patients undergoing active surveillance instead of thyroidectomy, there would be no useful molecular markers to predict the progression of PTMC, and regular follow-up of ultrasonography should be emphasized. However, to clarify our results, because of the retrospective design of this study, it is necessary to perform next-generation sequencing on the fine-needle biopsy samples from PTMC patients at diagnosis and the samples from those who have undergone surgery due to tumor progression in the ongoing prospective studies of active surveillance [2,22,23].

Patients and Tissue Samples
A total of 3435 patients with PTC (1985 PTMCs and 1450 PTCs >1 cm) were included. Formalin-fixed, paraffin-embedded tissues from 221 PTC patients were analyzed using targeted next-generation sequencing method. We integrated these data with the data obtained from four previous studies of PTC including TCGA project [9,[13][14][15]. For targeted sequencing samples, cases were selected if they had residual DNA after Sanger sequencing and met the sequencing quality and quantity criteria, among samples from the previous studies of SNUH [9,13]. All tissue samples from our institution were acquired after thyroid surgery between 1993 and 2014. The pathological diagnosis was based on the seventh edition of the American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging system for differentiated thyroid cancer [24], and histological diagnosis was based on the third edition of the World Health Organization (WHO) classification for thyroid tumors [25]. The research protocol was approved by the institutional review board committee of the SNUH (no. H-1207-124-420). Information of datasets and sequencing types is summarized in Table S1 and described in detail in each reference paper.

Library Preparation and Sequencing
We performed targeted sequencing for 612 kinase and cancer-related genes (Table S2). Target regions from 500 ng genomic DNA were captured using the Agilent SureSelectXT DNA Kinome Kit, following the manufacturer's protocols (Agilent, Santa Clara, CA, USA). Briefly, DNA was sheared by the Covaris system (Covaris, Woburn, MA, USA) and purified using Agencourt AMPure XP beads (Beckman Coulter, Brea, CA, USA). The fragment ends were repaired and adaptors were ligated to the fragments. The resulting DNA library was purified using the Agencourt AMPure XP beads and amplified by polymerase chain reaction (PCR). The quality and quantity of the DNA library was assessed with the Agilent 2100 Bioanalyzer or Tapestation (D1K Tape), which was captured by hybridization to the biotinylated RNA library baits. The bound genomic DNA was purified with streptavidin-coated magnetic Dynabeads (Invitrogen, Carlsbad, CA, USA) and then re-amplified. The targeted DNA library was sequenced on Illumina Hiseq 2500 with 100 base-pair paired-end reads using recommended protocols from the manufacturer (Illumina, San Diego, CA, USA).

Sequence Data Analysis
The sequenced reads were aligned to the human reference genome (hg19) using the Burrows-Wheeler Aligner (BWA) [26]. The PCR duplicates were removed using Picard MarkDuplicate (http://picard.sourceforge.net) and the filtered reads were realigned at known indel positions using GATK IndelRealigner [27]. The base quality score was recalibrated by GATK BaseRecalibrator. We called single nucleotide variants (SNVs) using GATK HaplotypeCaller and MuTect [28]. GATK HaplotypeCaller was also used for indel detection. When we called variants with MuTect, tumor samples were compared against pooled normal. Variants called from these programs were used for further filters and analysis. To discover driver mutation candidates from each cancer sample, we applied additional filtration criteria as follows: (1) allele frequency <0.0001 in individuals of normal population databases of the Exome Aggregation Consortium (ExAC); (2) non-silent SNVs (nonsynonymous, splice-site) and frameshift indels; (3) predicted as a pathogenic variant, "deleterious" in variant predictor SIFT [29] and "probably damaging" or "possibly damaging" in PolyPhen-2 [30]; (4) variants annotated in COSMIC and Cancer Gene Census [31]. The variants highlighted in this study were subsequently manually reviewed. TMB was defined as the number of non-silent mutations per megabase.

Gene Expression Profiling, Differentially Expressed Gene Analysis, and Gene Scoring
The DEGs were determined by DESeq2 [32] to have q-value < 0.05, |Log 2 (fold change)| ≥ 1, and baseMean ≥ 100, and were illustrated using volcano plots, which show the magnitude and statistical significance of differential translation for each gene. The calculated p-values were adjusted to q-values for multiple testing using the Benjamini-Hochberg correction. The normalized gene expression values were applied to PCA using the most variable 500 genes. TDS and ERK scores were calculated by methods in the previous studies [14,15], and defined as the average of median-centered rlog values from DESeq2, across 16 thyroid metabolism and function genes, and 52 MAPK signaling pathway genes, respectively.

Statistical Analysis
Statistical analysis and visualization were performed using R statistical package. Categorical data were assessed using either the Pearson's χ 2 or Fisher's exact test. Continuous data were analyzed using the independent t-test, Wilcoxon rank sum test, or the analysis of variance test. Binomial or multinomial logistic regression analysis was performed to access the difference in risk factors for categorical outcomes.

Conclusions
In conclusion, no specific genomic or transcriptomic characteristics were found to differ according to tumor size among PTMCs as well as all PTCs including PTC >1 cm, suggesting that a large portion of PTMCs exist in an early form, which would eventually progress to form a large PTC. Although a few genes with differential expression and TERT promoter mutations were found in a few PTMCs, no useful genomic or transcriptomic characteristics were found for the prediction of the future progression of PTMC, bearing important implications for PTMC patients on active surveillance.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2072-6694/12/5/1345/s1, Figure S1: Transcriptomic characteristics of BRAF mutant papillary thyroid microcarcinoma according to tumor size. Table S1: Information of datasets included in the study. Table S2: List of genes included in the SNUH targeted sequencing dataset.