Is Gross Extrathyroidal Extension to Strap Muscles (T3b) Only a Risk Factor for Recurrence in Papillary Thyroid Carcinoma? A Propensity Score Matching Study

Simple Summary
In papillary thyroid carcinoma (PTC), the staging classification of gross and minimal extrathyroidal extension (ETE) was recently modified in the eighth edition of the American Joint Committee on Cancer/Union for International Cancer Control (AJCC/UICC) TNM staging system. In this study, we compared the clinicopathological characteristics and recurrence rates between minimal and gross ETE. No significant differences in the recurrence and disease-free survival rates were found between the two groups. Whether gross ETE invading strap muscles only (T3b) is a risk factor for recurrence in PTC remains questionable.

Abstract
The presence of extrathyroidal extension (ETE) is associated with locoregional recurrence and distant metastases in papillary thyroid carcinoma (PTC). This study was designed to compare the recurrence risk between minimal ETE (mETE) and gross ETE (gETE) in patients with PTC using propensity score matching. In this study, 4452 patients with PTC who underwent thyroid surgery in a single center were retrospectively analyzed, and clinicopathological characteristics were compared according to the ETE status. Disease-free survival (DFS) and recurrence risk were compared between mETE and gETE after propensity score matching. The mean follow-up duration was 122.7 ± 22.5 months. In multivariate analysis before propensity score matching, neither mETE nor gETE was associated with recurrence risk (p = 0.154 and p = 0.072, respectively). After propensity score matching, no significant difference in recurrence rates was observed between the two groups (p = 0.668). DFS of the gETE group did not significantly differ from that of the mETE group (log-rank p = 0.531). This study revealed that neither mETE nor gETE is an independent risk factor for recurrence in PTC. Our findings suggest that gETE invading strap muscles only might not be associated with worse oncological outcomes in PTC.


Introduction
Extrathyroidal extension (ETE) is an adverse prognostic factor in patients with papillary thyroid carcinoma (PTC) [1,2]. As defined by the American Joint Committee on Cancer/Union for International Cancer Control (AJCC/UICC), ETE can be classified into gross ETE (gETE), which is visually confirmed intraoperatively by surgeons, and minimal ETE (mETE), defined as tumor cells extending to strap muscles or perithyroidal tissue and confirmed by pathological review [3]. The diagnosis of mETE can be challenging because the interpretation of histopathological findings varies among pathologists. Recently, mETE was excluded from the T3 classification in the eighth edition of the AJCC/UICC TNM staging system [4]. In this edition, gETE to strap muscles alone is classified as a T3b tumor; gETE to subcutaneous soft tissue, the larynx, the trachea, the esophagus, or the recurrent laryngeal nerve is classified as a T4a tumor; and invasion of the prevertebral fascia, the carotid artery, or mediastinal vessels is classified as a T4b tumor [4]. This modification has led to the downstaging of many patients [5]. However, the American Thyroid Association (ATA) management guidelines still consider the presence of mETE as a feature of an intermediate risk of recurrence, regardless of the TNM staging modification [6].
The role of mETE as a risk factor for recurrence remains controversial. Some studies have compared the outcomes between no ETE and mETE, between mETE and gETE, or among the three groups together [7][8][9]. Danilovic et al. have reported that both mETE and gETE are independent risk factors for recurrence in PTC [10]. Park et al. have examined 381 patients with PTC and found mETE to be correlated with aggressive histopathological features and tumor recurrence, concluding that patients with mETE have poorer clinical outcomes than those without ETE [11]. In contrast, several studies evaluating differentiated thyroid carcinoma (DTC) with mETE without lymph node (LN) metastases have found no statistically significant increase in the risk of recurrence [12][13][14].
The assessment of ETE is a key factor not only in establishing the staging system but also in determining the surgical extent, adjuvant treatment, and intensity of surveillance during follow-up. As mentioned earlier, several studies have compared the prognosis of each ETE group; however, most were subject to selection bias because of their retrospective designs, making it difficult to reach definitive conclusions. Therefore, this study was designed to compare clinicopathological characteristics and long-term oncological outcomes among different degrees of ETE in patients with PTC, using propensity score matching analysis to reduce selection bias. Moreover, we performed a sub-analysis to identify the clinical significance of ETE in patients with papillary thyroid microcarcinoma (PTMC).

Patients
We retrospectively reviewed 4591 patients with PTC who underwent thyroid surgery from March 2008 to June 2014 at Seoul St. Mary's Hospital (Seoul, Korea). In total, 84 and 55 patients were excluded from the analysis because of insufficient data and loss to follow-up, respectively. The medical charts and pathology reports of the remaining 4452 patients were reviewed and analyzed. Of these patients, 1137 (25.5%) underwent lobectomy and/or contralateral partial thyroidectomy (less than total thyroidectomy (TT)) and 3315 (74.5%) underwent TT.
The mean follow-up duration was 122.7 ± 22.5 months (range, 92-167 months). This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This study was approved by the Institutional Review Board of Seoul St. Mary's Hospital, Catholic University of Korea (IRB No. KC22RISI0041), which waived the requirement for informed consent due to the retrospective nature of this study.

ETE Definition
According to the T classification in the eighth edition of the AJCC/UICC TNM staging system, a T3b tumor is one that grossly invades strap muscles only; the surgeon documents this finding in the operation record after confirming it during surgery [4]. Because the patients included in this study were admitted from 2008 to 2014, before the eighth edition of the AJCC/UICC staging system was released in 2016, we referred to the pathology reports from that period. In this study, the gETE group included T3b tumors, that is, tumors invading strap muscles only, and excluded T4a and T4b tumors. mETE is defined as extrathyroidal invasion restricted to perithyroidal soft tissues, including microscopic strap muscle invasion [3]. The mETE group was classified according to this definition.
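The grouping rule described above can be written down as a small lookup, shown here as a minimal Python sketch. The enum names and the `study_group` function are hypothetical labels introduced for illustration only; they are not taken from the study's data dictionary, and returning `None` for T4a/T4b simply reflects that these tumors were not part of the gETE group.

```python
from enum import Enum
from typing import Optional


class Extension(Enum):
    """Hypothetical coding of the extent of extrathyroidal extension."""
    NONE = "none"                # tumor confined to the thyroid
    MICROSCOPIC = "microscopic"  # perithyroidal soft tissue or microscopic strap muscle invasion
    GROSS_STRAP = "gross_strap"  # gross invasion of strap muscles only (T3b)
    GROSS_T4A = "gross_t4a"      # gross invasion of T4a structures
    GROSS_T4B = "gross_t4b"      # gross invasion of T4b structures


def study_group(extension: Extension) -> Optional[str]:
    """Map the extent of invasion to the comparison group used in the text."""
    if extension is Extension.NONE:
        return "no ETE"
    if extension is Extension.MICROSCOPIC:
        return "mETE"
    if extension is Extension.GROSS_STRAP:
        return "gETE"
    # T4a and T4b tumors are excluded from the gETE group.
    return None


if __name__ == "__main__":
    for ext in Extension:
        print(ext.name, "->", study_group(ext))
```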

Follow-Up Assessment
Postoperative care and follow-up were performed according to the ATA management guidelines [6]. For the follow-up, all patients underwent physical examination, serum thyroid function tests, measurement of thyroglobulin and anti-thyroglobulin antibody concentrations, and neck ultrasonography every 3-6 months for the first year and annually after that. Radioactive iodine (RAI) ablation was performed 6-8 weeks after TT using doses based on the ATA management guidelines, and whole-body scans (WBS) were performed 5-7 days after RAI ablation. During routine follow-up evaluation, patients with suspected recurrence underwent additional diagnostic imaging tests, including computed tomography, positron emission tomography/computed tomography, and/or RAI WBS, to determine the location and extent of recurrence. Disease recurrence was confirmed using imaging modalities and/or pathological diagnosis using ultrasound-guided fine-needle aspiration/core needle biopsy or a surgical biopsy specimen.

Primary and Secondary Endpoints
The primary endpoint was a comparison of disease-free survival (DFS) between the mETE and gETE groups after propensity score matching, and the secondary endpoint was a comparison of clinicopathological characteristics between the two groups before and after propensity score matching analysis.

Statistical Analysis
Continuous variables are presented as means with standard deviations, and categorical variables are reported as numbers with percentages. Student's t-test was used to compare continuous variables. We compared the differences in categorical clinicopathological characteristics among the ETE groups using Pearson's chi-square test or Fisher's exact test. Univariate Cox regression analyses were performed to identify DFS predictors, and statistically significant variables were then entered into a multivariate Cox proportional hazards model. Hazard ratios (HRs) with 95% confidence intervals (CIs) were calculated. DFS was estimated using Kaplan-Meier survival analysis, and the log-rank test was used to compare DFS between groups.
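For readers unfamiliar with the survival methods named above, the following Python sketch shows a Kaplan-Meier estimator and a two-group log-rank test using only the standard library. It is an illustration under stated assumptions, not the analysis pipeline of this study, which used SPSS: the function names and the toy follow-up times are hypothetical, and the log-rank p-value relies on the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  : follow-up durations (e.g., months)
    events : 1 if recurrence was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # recurrences observed exactly at time t
        n_leaving = 0  # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value), 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data only; these are not patient data from the study.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
    print(logrank_test([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1]))
```

In practice a validated statistics package would be preferable; the sketch is meant only to make the risk-set bookkeeping behind the DFS curves and the log-rank p-values explicit.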
We performed propensity score matching analysis using various clinicopathological characteristics to reduce the impact of selection bias and potential confounding. Individual patient propensity scores were calculated using logistic regression analysis. Patients with mETE were matched to those with gETE at a 1:1 ratio. After propensity score matching, DFS and long-term oncological outcomes were compared between the mETE and gETE groups. DFS predictors after propensity score matching were assessed using univariate and multivariate Cox regression analyses, in the same manner as before matching. Differences with p-values of less than 0.05 were considered statistically significant. All statistical analyses were performed using Statistical Package for the Social Sciences (version 24.0; IBM Corp., Armonk, NY, USA).
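The matching step can be sketched in Python as follows. The text specifies only that propensity scores came from logistic regression and that matching was 1:1; the greedy nearest-neighbor strategy, the caliper of 0.05, the function names, and the example coefficients and scores below are all assumptions made for illustration and may differ from what SPSS did in the actual analysis.

```python
import math


def propensity(covariates, coefficients, intercept):
    """Logistic-regression propensity score for one patient.

    The coefficients are assumed to come from a previously fitted model;
    fitting itself is outside the scope of this sketch.
    """
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def match_one_to_one(scores_small, scores_large, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    scores_small : dict of patient id -> propensity score (smaller group, e.g., gETE)
    scores_large : dict of patient id -> propensity score (larger group, e.g., mETE)
    caliper      : maximum allowed score difference (assumed value)
    Returns a list of (id_small, id_large) matched pairs.
    """
    available = dict(scores_large)
    pairs = []
    for id_small, score in sorted(scores_small.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest still-unmatched candidate from the larger group.
        id_large = min(available, key=lambda k: abs(available[k] - score))
        if abs(available[id_large] - score) <= caliper:
            pairs.append((id_small, id_large))
            del available[id_large]
    return pairs


if __name__ == "__main__":
    # Hypothetical scores, not study data.
    gete = {"g1": 0.30, "g2": 0.60}
    mete = {"m1": 0.31, "m2": 0.58, "m3": 0.90}
    print(match_one_to_one(gete, mete))
```

Patients without a candidate inside the caliper stay unmatched, which is why a matched cohort is usually smaller than the smaller of the two original groups.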

Comparison of Baseline Clinicopathological Characteristics between the mETE and gETE Groups after Propensity Score Matching
Table 3 shows the results of the comparison of the clinicopathological characteristics of the mETE and gETE groups after propensity score matching. Propensity score matching yielded 213 matched pairs of patients. After propensity score matching, there were no significant differences in clinicopathological characteristics between the matched groups. Ten (4.7%) patients in the mETE group and thirteen (6.1%) patients in the gETE group had recurrence; however, this difference was not statistically significant (p = 0.668). [...] (Table 4). The DFS curves of mETE and gETE after propensity score matching are illustrated using Kaplan-Meier survival analysis (Figure 2). DFS of the gETE group did not significantly differ from that of the mETE group (log-rank p = 0.531).
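The recurrence comparison in the matched cohort can be reproduced from the reported counts alone (10 of 213 vs. 13 of 213). The Python sketch below applies a chi-square test with Yates' continuity correction to that 2 × 2 table. Which variant of the test produced the published p-value is not stated in the text; the continuity-corrected form is used here because it gives a p-value of approximately 0.668, in line with the reported result. The function name is hypothetical.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates' continuity correction for the table [[a, b], [c, d]].

    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    pairs = 213
    recur_mete, recur_gete = 10, 13
    print(round(100 * recur_mete / pairs, 1), round(100 * recur_gete / pairs, 1))
    chi2, p = chi2_yates_2x2(recur_mete, pairs - recur_mete,
                             recur_gete, pairs - recur_gete)
    print(round(chi2, 4), round(p, 3))
```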

Sub-Analysis of Clinicopathological Characteristics According to ETE Status in PTMC
Sub-analysis of baseline clinicopathological characteristics of the patients with PTMC according to ETE status is summarized in Table 5. The mean age of the gETE group was significantly higher than that of the mETE group (51.3 ± 11.3 years vs. 47.6 ± 11.4 years; p = 0.018). Patients in the gETE group underwent significantly more extensive surgeries than those in the mETE group (p = 0.006). The mean tumor size was significantly larger in the gETE group than in the mETE group (p < 0.001). A significantly higher prevalence of bilaterality was observed in the gETE group than in the mETE group (37.5% vs. 24.7%; p = 0.046). The gETE group had a significantly more advanced N stage than the mETE group (p = 0.017). RAI therapy was performed more frequently in the gETE group than in the mETE group (85.7% vs. 63.0%; p = 0.001). However, no significant difference in recurrence rates was found between the mETE and gETE groups (2.7% vs. 1.8%; p = 1.000).
Table 5 notes: Data are expressed as number of patients (%) or mean ± standard deviation. A statistically significant difference was defined as p < 0.05. Abbreviations: ETE, extrathyroidal extension; PTMC, papillary thyroid microcarcinoma; TT, total thyroidectomy; mRND, modified radical neck dissection; LN, lymph node; T, tumor; N, node; M, metastasis; RAI, radioactive iodine; NA, not applicable.
Table 6 shows the risk factors for recurrence in PTMC. The number of positive LNs (HR, 1.123; 95% CI, 1.034-1.219; p = 0.006) and RAI therapy (HR, 3.890; 95% CI, 2.030-7.452; p < 0.001) were significant predictors of recurrence. However, neither mETE (HR, 1.039; 95% CI, 0.607-1.779; p = 0.889) nor gETE (HR, 0.522; 95% CI, 0.069-3.928; p = 0.527) was identified as a risk factor for recurrence in the multivariate analysis. Kaplan-Meier analysis showed that DFS did not significantly differ among the three groups (log-rank p = 0.065) (Figure 3).
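The link between a hazard ratio, its 95% CI, and its p-value can be made explicit with a short calculation. The Python sketch below recovers the standard error of the log hazard ratio from the CI and derives an approximate two-sided Wald p-value. It assumes that the reported intervals are Wald intervals that are symmetric on the log scale, which the text does not state; the function name is hypothetical, and small discrepancies from the published p-values are expected because the inputs are rounded.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_ci(hr, lower, upper):
    """Recover (standard error of log HR, z statistic, two-sided p) from an HR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


if __name__ == "__main__":
    # Values as reported for the PTMC sub-analysis.
    reported = {
        "positive LNs": (1.123, 1.034, 1.219),
        "RAI therapy": (3.890, 2.030, 7.452),
        "mETE": (1.039, 0.607, 1.779),
        "gETE": (0.522, 0.069, 3.928),
    }
    for name, (hr, lo, hi) in reported.items():
        se, z, p = wald_from_ci(hr, lo, hi)
        print(f"{name}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

The wide interval for gETE corresponds to a large standard error, as would be expected from the small number of PTMC patients with gETE.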

Discussion
In this study, no significant difference in recurrence rates was observed between the mETE and gETE groups. To reduce the effects of selection bias, propensity score matching was performed to adjust for several clinicopathological characteristics between the mETE and gETE groups. Our results suggest that gETE and mETE have similar long-term oncological outcomes.
The AJCC/UICC TNM staging system is recommended for patients with DTC based on its usefulness in predicting disease prognosis. Since January 2018, the eighth edition of the AJCC/UICC TNM staging system has been applied to overcome several limitations identified in the seventh edition [4]. The change of the age cutoff from 45 to 55 years is a major modification in the eighth edition. Several studies have suggested that 45 years may not be the statistically optimal cutoff value for the staging system [15,16]. Another change is a decrease in the unfavorable prognostic significance of cervical LN metastases. The definition of the central neck (N1a) was expanded to include level VII in addition to level VI. Another notable change in the eighth edition is the definition of the T classification of thyroid cancer. The seventh edition of the AJCC/UICC TNM staging system classified patients with mETE as T3 [3,17]. However, in the eighth edition, mETE with tumors of ≤4 cm in size was excluded from the T3 classification. Tumors with gETE invading strap muscles only were classified as T3b [4,5]. As a result, several patients with mETE were reclassified as T1 or T2 based on their primary tumor size in the eighth edition.
gETE has long been recognized as a factor that adversely affects prognosis in PTC. Several studies have shown that gETE is closely related to risk factors for recurrence and disease-specific death [18,19]. Victoria et al. have demonstrated that ETE invading strap muscles alone (T3b) increased the risk of disease-specific death [20]. Several studies have compared mETE with gETE or no ETE to determine whether mETE affects the prognosis of PTC. Ito et al. and Arora et al. have revealed that patients with gETE had a higher recurrence risk than those with mETE [7,9]. Subsequent studies have shown that mETE had no significant effects on local recurrence and survival [12,14,21,22]. In contrast, other studies have suggested that mETE has a prognosis similar to that of gETE [23,24]. Recently, Danilovic et al. have concluded that the presence of mETE should still be considered an intermediate-risk factor for recurrence, suggesting that both mETE and gETE are independent risk factors for recurrence in PTC, except for microcarcinomas without LN metastases [10]. Thus, the prognostic significance of mETE remains controversial. Therefore, we compared the oncological outcomes between mETE and gETE in PTC using propensity score matching to reduce selection bias. Our data revealed a significant difference in DFS among the three groups in Kaplan-Meier survival analysis before propensity score matching (log-rank p < 0.001). After propensity score matching, however, the recurrence rates in the mETE and gETE groups were 4.7% and 6.1%, respectively, which did not differ significantly (p = 0.668). No significant difference in DFS (log-rank p = 0.531) was observed between the mETE and gETE groups. Thus, our findings suggest that more conservative surgery could be considered for patients with gETE invading strap muscles only, or that the T classification of these tumors may need to be modified.
The BRAFV600E mutation has been identified as the most common and specific genetic mutation in PTC, with a prevalence ranging from 37% to 83% [25,26]. In this study, 79.8% of the patients had the BRAFV600E mutation, which is consistent with previous reports. The BRAFV600E mutation is associated with more aggressive clinicopathological characteristics and a poorer prognosis of PTC [27,28]. The BRAFV600E mutation is significantly associated with ETE in patients with PTC, including PTMC [29,30]. Lee et al. have predicted ETE before surgery depending on the presence or absence of the BRAFV600E mutation [31]. Similar to the results of other studies, the prevalence of the BRAFV600E mutation was higher in the mETE and gETE groups than in the no ETE group in this study. However, no significant difference was observed between the mETE and gETE groups. Further studies on the BRAFV600E mutation should be conducted to clarify its correlation with ETE.
Our data suggest that LN metastasis is an independent risk factor for DFS both before and after propensity score matching. In previous studies, the presence of LN metastasis has not been regarded as a factor affecting risk stratification, which differed from other tumor factors, such as tumor size or aggressive histological features [32,33]. LN metastasis is not considered an independent factor for prognosis, although LN metastasis has prognostic importance in older patients [34,35]. In contrast, Liu et al. have reported that the recurrence and disease-specific mortality rates were higher in the LN metastasis group at the 10-year follow-up [36]. Several studies have suggested that the presence of LN metastases in PTC was an independent predictor of locoregional recurrence, distant metastases, or survival [37][38][39][40]. However, whether LN metastasis influences the recurrence and mortality rates in patients with PTC remains controversial.
In this study, 53.6% of all patients received postoperative RAI therapy. The proportion of patients receiving RAI therapy was significantly higher in the gETE group than in the mETE group (91.6% vs. 73.1%; p < 0.001), and a similar tendency was also observed in PTMC. According to the ATA management guidelines, RAI therapy is considered in intermediate-risk patients and is routinely recommended for high-risk patients [6]. Since RAI therapy was determined according to the risk stratification guidelines, more patients with gETE received RAI therapy. Numerous studies have reported that RAI therapy could significantly reduce recurrence and mortality in PTC [38,41]. However, RAI therapy was not an independent predictor of DFS in this study. The higher proportion of patients with gETE who received RAI therapy may have contributed to this finding.
We performed a sub-analysis of ETE as a prognostic factor in patients with PTMC. The incidence of ETE in PTMC varied, ranging from 4.5% to 31.9% [42][43][44]. In this study, the incidence of mETE and gETE was 36.4% and 1.7%, respectively. The recurrence rate in patients with PTMC ranged from 3% to 16.7% [42][43][44]. The recurrence rate in a study involving patients with PTMC with mETE alone was 3.8%, which was comparable to 2.7% for the mETE group in this study [40]. Several studies have compared mETE with no ETE as a prognostic factor in patients with PTMC [13,45,46]. However, few studies have compared mETE with gETE in patients with PTMC due to the small number of ETE cases in PTMC, particularly in gETE. No significant difference in long-term oncological outcomes was observed between the mETE and gETE groups. Multicenter studies with larger samples are needed to investigate the correlation of ETE with long-term outcomes in PTMC.
This study has several limitations. First, this study adopted a retrospective single-center design. There may be a selection bias because the data were collected at a single tertiary institution, which does not represent the entire patient population. Second, the diagnosis of mETE and gETE could be variable and, to some extent, subjective, because it may vary among pathologists or surgeons. Additionally, several patients in the mETE and gETE groups, including those with PTMC, received RAI therapy, which may have affected recurrence or survival. Finally, the mean follow-up period was relatively short (122.7 ± 22.5 months). Longer follow-up is necessary to determine the prognosis of patients with PTC, as it has indolent features.
The most important strength of this study is that we performed propensity score matching to adjust for differences in clinicopathological characteristics and minimize selection bias, which yielded more reliable results. Moreover, this study involved one of the largest cohorts of patients with PTC who underwent surgery (n = 4452). To the best of our knowledge, few studies have evaluated the impact of mETE and gETE on the long-term prognosis of PTMC.

Conclusions
This study demonstrated that neither mETE nor gETE was an independent risk factor for recurrence in PTC. This observation suggests that gETE invading strap muscles alone might not negatively affect the oncological outcomes in PTC. Our findings could affect the decision-making for patients with gETE invading strap muscles only. Further studies are required to determine whether the T classification of gETE invading strap muscles alone should be modified.