Molecular Tests for Risk-Stratifying Cytologically Indeterminate Thyroid Nodules: An Overview of Commercially Available Testing Platforms in the United States

Abstract: The past decade has witnessed significant advances in the application of molecular diagnostics for the pre-operative risk-stratification of cytologically indeterminate thyroid nodules. The tests that are currently marketed in the United States for this purpose combine aspects of tumor genotyping with gene and/or microRNA expression profiling. This review compares the general methodology and clinical validation studies for the three tests currently offered in the United States: ThyroSeq v3, Afirma GSC and Xpression Atlas, and ThyGeNEXT/ThyraMIR.


Background
Thyroid nodules are a common clinical finding, with palpable thyroid nodules detected in 1% of men and 5% of women in iodine-sufficient regions of the world [1-5]. However, with the increasing use of radiologic screening, one or more thyroid nodules are found in 60-70% of the population [1,6]. Most thyroid nodules (>90%) are asymptomatic and benign; of those that are malignant, the majority are low-risk neoplasms that do not alter survival. The current management and malignancy risk assessment of thyroid nodules involves a synthesis of clinical findings, ultrasound evaluation, and fine-needle aspiration (FNA) [1,7]. Specific ultrasound characteristics of thyroid tumors, such as hypoechogenicity, taller-than-wide shape on transverse view, irregular margins, microcalcifications/echogenic foci, and apparent extrathyroidal extension, are associated with malignancy. These and other ultrasound features have been employed by various professional societies to develop ultrasound risk assessment schemes to select thyroid nodules for FNA biopsy [1,8-10].
FNA has proven to be an important tool in the management of thyroid nodules. Currently, thyroid FNA cytology is reported in a standardized fashion by employing one of the known tiered classification systems [3,11-14]. While FNA cytology can effectively differentiate between benign and malignant thyroid lesions in up to 75% of cases, 20-30% of aspirated nodules are reported as indeterminate for malignancy [14,15]. Each of the thyroid FNA classification schemes includes diagnostic categories to facilitate reporting of thyroid nodules that lack clearly defined cytomorphologic features of malignancy (Table 1) [14]. Some authors have suggested inclusion of the "Suspicious for Malignancy" interpretive category among indeterminate cases; however, the relatively high cancer risk and different management recommendations for nodules classified as "Suspicious for Malignancy" have prompted other authors to consider this category separately from the lower-risk indeterminate categories [4,14,16]. In general, the risk of malignancy for nodules in the lower-risk indeterminate categories ranges from 10 to 40%. This wide range of malignancy risk likely reflects variations in (1) the prevalence of thyroid cancer among different patient populations, (2) radiologic criteria for selecting nodules for FNA biopsy, (3) cytopathologists' thresholds for classifying aspirates as indeterminate, and (4) surgical pathologists' diagnostic thresholds for classifying thyroid nodules on thyroidectomy specimens. In spite of the challenges raised by indeterminate thyroid cytology diagnoses, both clinicians and pathologists generally agree that reporting of thyroid FNA cytology using standardized, tiered classification schemes is an important component of patient management [2,17].
In recent years, advances in the clinical, pathologic, and molecular characterization of thyroid neoplasms have had major impacts on thyroid nodule management. An example of such a development is the establishment of strict criteria for reclassifying a subset of tumors that were formerly considered malignant (i.e., non-invasive, encapsulated follicular variant of papillary carcinoma) as indolent neoplasms (non-invasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP)) [18-21]. This nomenclature change from "carcinoma" to "neoplasm" has reduced the cancer risk estimates associated with both the lower-risk indeterminate and "Suspicious for Malignancy" categories of thyroid FNA reporting systems [22,23]. Additionally, progress in understanding the molecular pathogenesis of thyroid neoplasms has fueled the development of molecular diagnostic tests for improving the preoperative risk-stratification of cytologically indeterminate thyroid nodules [24]. The primary goal of these molecular tests is to correctly identify benign nodules among those with indeterminate cytology, thereby decreasing the number of unnecessary diagnostic surgeries [25-28]. The updated versions of thyroid FNA classification schemes are now aligned with the aforementioned changes in histopathologic classification of thyroid neoplasms, applicability of molecular testing, and current recommendations for management.

Molecular Profiles of Thyroid Tumors
For the purposes of ancillary testing, the molecular changes that have been described in thyroid tumors may be conceptualized as those that occur on the DNA level versus those that manifest as downstream alterations in gene and/or microRNA expression profiles.

DNA Sequence Alterations, Gene Fusions, and Chromosomal Copy-Number Alterations
Genomic characterizations of various types of thyroid cancer have identified recurrent somatic mutations (single nucleotide variants, insertions, deletions) and gene fusions in hotspots within oncogenes and tumor suppressor genes as well as large-scale chromosomal copy number alterations [29-33]. Two key points are helpful to keep in mind regarding these genetic alterations, with respect to their ability to risk-stratify cytologically indeterminate thyroid nodules.
First, for thyroid FNA-based molecular testing, the detection of mutations and gene fusions is not a binary "positive" or "negative" result. Instead, these tests offer a gradient of cancer risk estimates based on the type of genetic alteration that is detected. Driver alterations such as the BRAF V600E mutation and chromosomal rearrangements involving RET, ALK, or NTRK genes are essentially diagnostic of cancer in the context of thyroid tumors [30,34,35]. Detection of one of these alterations typically warrants thyroidectomy for therapeutic purposes, with consideration of total thyroidectomy as the initial surgical approach if the likelihood of requiring adjuvant radioactive iodine treatment is high, based on clinical and sonographic features (e.g., tumor size > 4 cm, suspicion of extrathyroidal extension, lymph node metastasis, or involvement of contralateral lobe). In contrast, mutations in the RAS gene family (HRAS, KRAS, NRAS), BRAF K601E mutation, PAX8-PPARG gene fusion, and gene fusions involving THADA are specific for neoplasia but not necessarily for malignancy; these alterations have been detected in benign (follicular adenoma), pre-malignant (noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP)), and malignant tumors (follicular carcinoma, encapsulated follicular variant of papillary carcinoma) [29,30,36-38]. Given their modest specificity for cancer, detection of one of these alterations alone generally warrants diagnostic lobectomy rather than total thyroidectomy as the initial surgical approach.
Second, driver alterations often align with tumor phenotype. The BRAF V600E mutation and other driver alterations that result in similar "BRAF V600E-like" gene expression profiles are associated with classical, tall-cell variant, and infiltrative follicular variant papillary carcinomas. In contrast, RAS mutations, BRAF K601E mutation, PAX8-PPARG gene fusion, and gene rearrangements involving THADA are associated with a spectrum of follicular-patterned neoplasms, as indicated above. Mutations in TP53 and the TERT promoter region, particularly in combination with the aforementioned driver alterations, are found with increased frequency in aggressive thyroid cancers, including poorly differentiated carcinoma and anaplastic carcinoma [39-41]. Finally, Hürthle cell neoplasms have distinctive mutations in mitochondrial DNA as well as recurrent chromosome-level copy number alterations; additional somatic mutations superimposed on this background, including those involving TERT, TP53, and RAS-family genes, have been reported in Hürthle cell carcinomas [31,32,42-44]. In this light, ancillary tests for thyroid FNAs that detect mutations, gene fusions, and copy-number alterations can inform clinicians about a tumor's phenotype and potentially about its aggressiveness as well. Apart from diagnostic value, the detection of driver alterations in advanced-stage thyroid cancers may identify tumors amenable to targeted therapies [45]. In uncommon cases, genotyping results from a thyroid FNA may also detect genetic alterations associated with familial cancer syndromes, in which case germline genetic testing may be recommended.

Gene and microRNA Expression Profiles
The different transcriptional profiles of benign and malignant thyroid tumors offer an additional opportunity for risk-stratifying cytologically indeterminate nodules on FNA samples. In contrast to specific driver mutations and gene fusions, gene [46-48] and microRNA [49-52] expression profiles can be understood as readouts of the cumulative and complex genetic, epigenetic, and possibly environmental influences on cells. Large-scale gene expression analysis can currently be performed using high-throughput RNA sequencing technology. For smaller panels, microarray or RT-PCR based tests can be used to detect relative differences in gene or microRNA expression.

Molecular Assays for Cytologically Indeterminate Thyroid Nodules
The latest molecular tests for cytologically indeterminate thyroid nodules offer varying degrees of risk stratification, but from a clinical management standpoint, test results can be conceptualized into three broad tiers based on probability of cancer (Figure 1):
(1) High-probability results, where the probability of cancer is so high that thyroidectomy is indicated for therapeutic purposes (the decision between lobectomy versus total thyroidectomy may be informed by clinical, sonographic, cytologic, and molecular features of a nodule).
(2) Intermediate-probability results, for which a diagnostic lobectomy is recommended for definitive nodule classification and in many cases may suffice from a therapeutic standpoint.
(3) Low-probability results, where the cancer risk is similar to that of cytologically benign aspirates, for which clinical surveillance would be adequate.
An important feature shared by each of the tests discussed below is a high negative predictive value (NPV), which reflects the test's ability to recognize the molecular profiles of nodules that are very likely to be benign and can be followed by sonographic surveillance instead of diagnostic surgery.
In the United States, three main molecular tests have been commercially developed over the past decade for cytologically indeterminate thyroid nodules: ThyroSeq v3 (University of Pittsburgh Medical Center and Sonic Healthcare USA, Rye Brook, NY, USA), ThyGeNEXT and ThyraMIR (Interpace Diagnostics, Parsippany, NJ, USA), and Afirma Gene Sequencing Classifier (GSC) and Xpression Atlas (Veracyte, South San Francisco, CA, USA). In their current forms, each of these tests uses a combination of genotyping and mRNA or microRNA expression profiling to refine the preoperative cancer risk of cytologically indeterminate thyroid nodules (Table 2).

ThyroSeq v3
Early molecular tests for risk-stratifying cytologically indeterminate nodules by detecting oncogenic alterations used a small panel of the most common driver mutations and fusions involving seven genes (BRAF, HRAS, KRAS, NRAS, RET-PTC1, RET-PTC3, PAX8-PPARG). This small genotyping panel stratified nodules into those with high and intermediate probabilities for cancer but was limited by a low negative predictive value (NPV). In reports of institutional experiences with the seven-gene panel, the NPV ranged from 82-94% among Bethesda III/IV aspirates (NIFTP/cancer prevalence of 14-32%); a negative test result was thus associated with a residual cancer risk of 6-18%, making it suboptimal for ruling out malignancy [53-56].
Over the past decade, the ThyroSeq test has utilized the high throughput of targeted next-generation DNA and RNA sequencing (NGS) platforms as well as data from several large-scale genomic characterizations of thyroid cancers to expand the test panel to 12,135 mutations, insertions, and deletions in 112 genes and over 120 different types of gene fusions. Gene expression alterations and chromosomal copy-number alterations (characteristic of Hürthle cell neoplasms) are also examined using RNA and DNA sequencing, respectively [57].
For thyroid FNAs classified as Bethesda III or IV, nucleic acid for ThyroSeq testing can be extracted from cellular material collected directly into nucleic acid preservative (provided by the vendor), formalin-fixed paraffin-embedded (FFPE) cellblock preparations, and more recently, cells on direct smear slides [58]. ThyroSeq's limited gene expression analysis provides quality control steps, confirming adequate thyroid follicular cell sampling and identifying aspirates that are not of thyroid follicular cell origin (e.g., medullary thyroid carcinoma (MTC), parathyroid, metastatic tumors). The sequencing results of samples that pass these quality control steps are analyzed by bioinformatics pipelines and assigned a Genomic Classifier (GC) score based on the number, type, and allelic frequency of the genetic alterations that are identified [57]. ThyroSeq currently offers six tiers of test results, stratified by probability of cancer or NIFTP, tumor type, and risk of recurrence, which inform the test's recommendations for patient management.
ThyroSeq v3 was clinically validated in a prospective, multi-institutional, blinded study [59] (Table 3). Both test results and histopathologic classifications were binned into binary outcomes for the purposes of measuring test performance. For ThyroSeq v3 results, the absence of oncogenic alterations and detection of low-risk alterations (particularly those that were detected at low levels by NGS) were pooled together as negative test results, while the remaining alterations covering a spectrum of tumor types, cancer probabilities, and recurrence risk were collectively considered positive test results. For the gold standard diagnoses, positive outcomes consisted of NIFTP as well as low-risk and high-risk cancers (jointly reported as "NIFTP/cancer" in this review, because NIFTP is currently considered a surgical disease for which lobectomy is diagnostically necessary and therapeutically sufficient [60]). The clinical validation study involved 154 Bethesda III and 93 Bethesda IV nodules (combined NIFTP and cancer prevalence of 28%) that were tested with ThyroSeq v3 and had reference histopathologic diagnoses. ThyroSeq v3 had a sensitivity of 94% and negative predictive value (NPV) of 97% for NIFTP/cancer in the study population, corresponding to a NIFTP/cancer risk of approximately 3% if no oncogenic alterations were identified. The clinical validation study also reported 82% specificity and 66% positive predictive value (PPV). Importantly, in real-world practice, neither the test results nor histopathologic outcomes are binary. Therefore, PPV as reported in these studies offers a limited view of test performance.
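The predictive values quoted above follow from sensitivity, specificity, and prevalence via Bayes' theorem. The following minimal Python sketch recomputes them from the rounded figures reported for ThyroSeq v3. The function name `predictive_values` is our own illustration and is not part of any vendor's software. Because the inputs are rounded, the recomputed PPV comes out near 67% rather than exactly the published 66%.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (ppv, npv) for a binary test at a given disease prevalence."""
    # Expected fractions of the tested population in each 2 x 2 cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded ThyroSeq v3 figures from the text: 94% sensitivity,
# 82% specificity, 28% NIFTP/cancer prevalence.
ppv, npv = predictive_values(0.94, 0.82, 0.28)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}, residual risk={1 - npv:.2f}")
```

Re-running the function with a different prevalence shows why predictive values from one cohort do not transfer directly to a population with a different baseline rate of NIFTP/cancer.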

ThyGeNEXT/ThyraMIR
To improve the NPV of the genotyping approach for refining the risk stratification of cytologically indeterminate nodules, Interpace Diagnostics combines a relatively focused genotyping panel (ThyGeNEXT) with a second test: a microRNA expression classifier optimized to have a high NPV for thyroid cancer. For aspirates classified as Bethesda III or IV, testing can be performed either from cells collected by dedicated FNA passes into a vial of nucleic acid preservative or from cells on direct smear slides that were routinely prepared for cytologic evaluation [61]. Samples first undergo evaluation by ThyGeNEXT, a targeted DNA and RNA NGS panel that currently includes hotspot mutations in 10 genes and 38 gene fusions [62]. Adequate sampling of thyroid follicular cells is determined by a limited gene expression panel that includes NKX2.1 (TTF-1) and PAX8. Detection of driver alterations associated with a high probability of cancer (e.g., BRAF V600E mutation, BRAF fusions, RET fusions, TERT promoter mutations, ALK mutations and fusions), which Interpace Diagnostics reports as "strong" driver mutations, requires no further testing. Samples that are either (a) negative for alterations in the ThyGeNEXT panel or (b) positive for a "weak" driver alteration (e.g., mutations in RAS-family genes) have an intermediate probability of cancer and are risk-stratified further using ThyraMIR.
ThyraMIR uses quantitative RT-PCR to measure the relative expression levels of 10 microRNAs, the results of which are sorted into three risk-based categories. These microRNA expression profiles are combined with the results of the mutation/fusion panel to assign a global estimate of cancer risk for each nodule. Nodules that are negative for both ThyGeNEXT and ThyraMIR are considered to be low risk for NIFTP/cancer, while other combinations of ThyGeNEXT and ThyraMIR results are aligned with a range of NIFTP/cancer risks based on the specific mutations, gene fusions, and microRNA profile that are detected. From a clinical standpoint, the combined test results can be stratified into three tiers (negative, moderate, and positive with respect to cancer probability), warranting consideration of clinical surveillance, lobectomy, or total thyroidectomy, respectively. Medullary carcinoma is recognized by upregulation of specific microRNAs (e.g., miR-375) in the ThyraMIR panel [63].
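The reflex-testing sequence described above can be sketched as a small decision function. This is an illustration only: the function name `triage`, its argument labels, and its return strings are hypothetical, and it encodes just the steps stated in this review. How specific non-negative combinations of genotype and microRNA results map to the moderate or positive tier is not specified here, so the sketch deliberately leaves that case open.

```python
def triage(driver_class, mirna_result=None):
    """Sketch of the ThyGeNEXT -> ThyraMIR reflex logic described in the text.

    driver_class: "strong", "weak", or "none" (hypothetical labels for the
        genotyping result).
    mirna_result: None if the microRNA classifier has not been run yet,
        "negative" for a negative profile, or any other label otherwise.
    """
    if driver_class not in {"strong", "weak", "none"}:
        raise ValueError(f"unknown driver class: {driver_class!r}")
    # "Strong" drivers carry a high probability of cancer on their own.
    if driver_class == "strong":
        return "high probability of cancer; no further testing"
    # Weak-driver and genotype-negative samples go on to microRNA profiling.
    if mirna_result is None:
        return "reflex to ThyraMIR"
    # Only the double-negative combination is described as low risk.
    if driver_class == "none" and mirna_result == "negative":
        return "low risk for NIFTP/cancer"
    # Remaining combinations span a range of risks not enumerated here.
    return "risk depends on the specific combination of results"


print(triage("strong"))
print(triage("weak"))
print(triage("none", "negative"))
```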
The combined ThyGeNEXT/ThyraMIR testing approach was clinically validated in a retrospective study of archival, cytologically indeterminate thyroid FNA smears with matched resection specimens that were collected from four institutions [64] (Table 3). Reference histopathologic diagnoses were divided into benign nodules versus those that would currently warrant resection (NIFTP and cancer). Among the 178 Bethesda III and IV nodules in the study, the prevalence of NIFTP and cancer was 30%. This clinical validation study diverges from those of ThyroSeq v3 and Afirma GSC in two important ways. First, the ThyGeNEXT/ThyraMIR validation study takes a unique approach to address the limitations of using a conventional 2 × 2 matrix for assessing diagnostic accuracy of non-binary classification tests. Although ThyGeNEXT/ThyraMIR results are binned into three tiers as described above, the middle tier (moderate-risk) test results are alternatively considered "positive" or "negative", depending on the statistical measure being calculated: moderate-risk results are grouped together with positive test results for calculating test sensitivity and NPV, while the same moderate-risk results are grouped with negative test results for calculating test specificity and PPV. Second, the authors also noted that the prevalence of NIFTP and cancer among samples selected for their validation cohort (30%) was higher than what was found in prior studies [65]. Bayes' theorem was thus applied to extrapolate the expected rate of negative, moderate, and positive test results at a lower (14%) NIFTP/cancer prevalence. Altogether, using these two adjustments, Lupo et al. report 95% sensitivity, 90% specificity, 97% NPV, and 75% PPV for ThyGeNEXT/ThyraMIR. For the purposes of comparison with the other commercially available molecular tests described herein, we also provide corresponding values for test performance that were calculated using conventional methods (i.e., considering moderate-risk test results as "positive"; no prevalence adjustment): 93% sensitivity, 62% specificity, 95% NPV, and 52% PPV. Importantly, with both approaches, cytologically indeterminate aspirates that are negative for both ThyGeNEXT and ThyraMIR have a low cancer risk (3-5%) based on 95-97% NPV of the combined test.
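To make the two adjustments concrete, the sketch below collapses a three-tier result into 2 × 2 tables in both ways and then applies Bayes' theorem at a chosen prevalence. The counts are hypothetical round numbers chosen for illustration and are not data from the validation study; the names `COUNTS`, `binary_metrics`, and `tier_risk` are likewise our own.

```python
# Hypothetical tier counts by reference diagnosis (NOT study data).
COUNTS = {
    "diseased": {"negative": 4, "moderate": 16, "positive": 30},
    "benign": {"negative": 90, "moderate": 40, "positive": 20},
}


def binary_metrics(counts, positive_tiers):
    """Collapse tiers into a 2 x 2 table and return the standard metrics."""
    diseased, benign = counts["diseased"], counts["benign"]
    tp = sum(n for tier, n in diseased.items() if tier in positive_tiers)
    fn = sum(diseased.values()) - tp
    fp = sum(n for tier, n in benign.items() if tier in positive_tiers)
    tn = sum(benign.values()) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def tier_risk(counts, tier, prevalence):
    """Post-test probability of disease for one tier, via Bayes' theorem."""
    p_tier_given_d = counts["diseased"][tier] / sum(counts["diseased"].values())
    p_tier_given_b = counts["benign"][tier] / sum(counts["benign"].values())
    numerator = prevalence * p_tier_given_d
    return numerator / (numerator + (1 - prevalence) * p_tier_given_b)


# Conventional approach: moderate-risk results count as positive throughout.
conventional = binary_metrics(COUNTS, {"moderate", "positive"})

# Mixed approach: moderate counts as positive for sensitivity and NPV,
# but as negative for specificity and PPV.
strict = binary_metrics(COUNTS, {"positive"})
mixed = {
    "sensitivity": conventional["sensitivity"],
    "npv": conventional["npv"],
    "specificity": strict["specificity"],
    "ppv": strict["ppv"],
}

print("conventional:", {k: round(v, 3) for k, v in conventional.items()})
print("mixed:", {k: round(v, 3) for k, v in mixed.items()})
print("risk after a negative result at 14% prevalence:",
      round(tier_risk(COUNTS, "negative", 0.14), 3))
```

With these illustrative counts, the mixed approach reports higher specificity and PPV than the conventional one from the same underlying data, which is why the two sets of figures should not be compared directly across tests.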

Afirma GSC and Xpression Atlas
Afirma refines the cancer risk of cytologically indeterminate thyroid nodules by evaluating the expression profile of a large panel of genes and using machine-learning algorithms to classify aspirates as having either a "Benign" or "Suspicious" gene expression profile. The initial version of the test, introduced in 2012 as the Afirma Gene Expression Classifier (GEC), relied on expression microarrays to determine the expression profile of 142 genes [66]. Over the past decade, Afirma has been upgraded from the microarray-based GEC to a high-throughput RNA sequencing platform that determines the expression profile of 10,196 genes. Updated machine-learning algorithms were applied to develop the current Gene Sequencing Classifier (GSC), which risk-stratifies aspirates based on the expression pattern of 1115 core genes; the remaining ~9000 genes, whose expression levels are measured, reportedly provide stability to the classifier model. The GSC has also added mitochondrial transcripts and loss-of-heterozygosity analysis to its test panel in an effort to improve risk stratification of Hürthle cell lesions [67].
Afirma GSC requires dedicated FNA passes to be collected into a vial of nucleic acid preservative, in addition to the cellular material collected for routine cytological evaluation. For aspirates classified as Bethesda III or IV, the concurrent material collected for molecular testing is processed for RNA sequencing. Expression profiles that indicate sampling of parathyroid tissue or medullary carcinoma (Afirma MTC test) are reported as such and require no further testing. Similarly, the expression profile associated with BRAF V600E mutation (Afirma BRAF test) as well as detection of RET-PTC1 or RET-PTC3 gene fusions are highly specific for papillary carcinoma and do not require further analysis by the GSC. For the remaining samples that are (a) negative for the above markers and (b) confirmed to have adequate thyroid follicular cell content, the GSC's proprietary algorithm classifies each sample as having either a "Benign" or a "Suspicious" expression profile.
The original version of the Afirma test (GEC) was clinically validated in a multicenter prospective study involving 210 Bethesda III and IV aspirates with matched resection specimens [66]. Archival patient samples remaining from this original GEC validation study were used to validate the current version of Afirma (GSC); 190 Bethesda III/IV samples had sufficient RNA available for the GSC validation, with similar prevalence of NIFTP/cancer (24%) to the original 210-sample GEC validation set [68] (Table 3). The GSC recognized 54% of Bethesda III/IV nodules in the validation set as having a "Benign" gene expression profile, compared to the 41% benign-call rate of the GEC. Compared to the GEC (90% sensitivity, 52% specificity, 94% NPV, and 37% PPV), the GSC showed better specificity (68%) and PPV (47%) for NIFTP/cancer, while maintaining the high sensitivity (91%) and NPV (96%) of the original test. Altogether, a Benign GSC result is thus associated with an approximate 4% risk of NIFTP/cancer (corresponding to 1-NPV), while a Suspicious GSC result confers a NIFTP/cancer risk of approximately 47% (corresponding to the PPV). Among the 26 Hürthle cell lesions (9 Hürthle cell carcinomas and 17 Hürthle cell adenomas) in the validation set, the GSC showed improved specificity (59%), NPV (91%), and PPV (53%) compared to the GEC (12% specificity, 67% NPV, and 35% PPV).
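As a consistency check on the figures above, the fraction of nodules that a binary classifier calls "Benign" can be written in terms of sensitivity (Se), specificity (Sp), and prevalence (p). The symbols are our notation, and the inputs are the rounded values quoted in the text (with the 24% prevalence assumed for both cohorts), so small discrepancies from the published benign-call rates of 54% and 41% are expected.

```latex
\begin{align*}
P(\text{Benign call}) &= \mathrm{Sp}\,(1 - p) + (1 - \mathrm{Se})\,p \\
\text{GSC:}\quad & 0.68 \times 0.76 + 0.09 \times 0.24 \approx 0.54 \\
\text{GEC:}\quad & 0.52 \times 0.76 + 0.10 \times 0.24 \approx 0.42
\end{align*}
```

The first term counts benign nodules correctly called "Benign" and the second counts NIFTP/cancer cases missed by the test, so the higher benign-call rate of the GSC is driven almost entirely by its improved specificity.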
While measurement of gene expression levels by RNA sequencing is the core methodology for the GSC, the sequencing data can also be mined for point mutations, insertions/deletions, and gene fusions. Sequence variants that are known to be associated with thyroid cancer are reported in a complementary test known as Afirma Xpression Atlas [69]. As an RNA sequencing-based test, Xpression Atlas currently reports 235 fusions and 905 sequence variants from only the transcribed portion of 593 genes [70]. Alterations in non-coding DNA, such as those in the TERT promoter region, are not detected by the Xpression Atlas. Like ThyroSeq and ThyGeNEXT, the Afirma Xpression Atlas provides patients and clinicians with estimates of NIFTP/cancer risk and tumor type based on the specific variant or fusion that is identified. As discussed above for genotyping panels, the detection of oncogenic mutations and gene fusions has potential benefits beyond diagnostic utility. Identification of actionable driver alterations may guide the selection of targeted therapy for patients with advanced thyroid cancers, and detection of alterations suggestive of heritable cancer syndromes may steer patients towards genetic counseling and testing for germline alterations.

Conclusions
While ThyroSeq v3, ThyGeNEXT/ThyraMIR, and Afirma GSC/Xpression Atlas each take different approaches to refine the cancer risk stratification of thyroid nodules, common themes have emerged as these tests have evolved over the years. A high NPV to help rule out malignancy is key among these similarities, as are combined testing approaches that use aspects of both genotyping and gene or microRNA expression profiling to offer more granular estimates of tumor phenotype and cancer risk. Ultimately, each of these three testing platforms appears to provide similar information to the patient and clinician in terms of guiding management decisions. The molecular tests described in this review have primarily been validated and used in the United States. International differences in the application and impact of molecular testing for thyroid FNA samples have been reported and likely reflect contrasting management strategies for cytologically indeterminate nodules as well as differences in access to testing among different countries [25]. In our increasingly interconnected world, we expect that global discussions about our diverse risk-stratification approaches for thyroid nodules will help us move towards safer and more cost-effective care for patients worldwide.

Figure 1. Model illustrating the role of molecular testing for refining the cancer risk stratification of cytologically indeterminate (Bethesda III or IV) thyroid nodules. Current versions of molecular tests described in this review offer varying degrees of risk stratification, but for the most part, test results fall into one of three categories: low, intermediate, or high probability of cancer. A high negative predictive value (NPV) is a key feature shared by all the tests described herein. Among most cohorts of Bethesda III/IV nodules, an NPV greater than 95% (as demonstrated by clinical validation studies) indicates that the cancer risk associated with a negative molecular testing result is less than 5%, similar to that of a cytologically benign nodule.

Table 1. Classification schemes for reporting thyroid cytology. Indeterminate diagnostic categories are shaded.

Table 2. Comparison of testing methodology, biomarkers, and quality control measures for the three commercially available thyroid molecular testing platforms in the United States.

Table 3. Comparison of clinical validation studies for three commercially available molecular tests for cytologically indeterminate thyroid nodules. Abbreviations: GSC, gene sequencing classifier; GEC, gene expression classifier; Bethesda III, Atypia/Follicular Lesion of Undetermined Significance; Bethesda IV, (Suspicious for) Follicular Neoplasm; NIFTP, noninvasive follicular thyroid neoplasm with papillary-like nuclear features; NPV, negative predictive value; PPV, positive predictive value. * For the purposes of comparison with the other two tests in this table, ThyGeNEXT/ThyraMIR results with intermediate and high probabilities of NIFTP/cancer were considered "positive" test results, while only those with low probability of cancer (i.e., negative for both ThyGeNEXT and ThyraMIR) were considered "negative" test results. The ThyGeNEXT/ThyraMIR clinical validation study describes different metrics for test performance following prevalence adjustment and modification of how moderate-risk test results are classified for statistical analysis: 95% sensitivity, 90% specificity, 97% NPV, and 75% PPV.