Clinical-Radiomics Nomogram Based on Contrast-Enhanced Ultrasound for Preoperative Prediction of Cervical Lymph Node Metastasis in Papillary Thyroid Carcinoma

Simple Summary
Improving the precision of preoperative lymph node metastasis (LNM) assessment is crucial for determining the scope of papillary thyroid carcinoma (PTC) surgery, reducing complications, and preventing recurrence. Few studies have applied radiomics analysis based on contrast-enhanced ultrasound (CEUS) to the prediction of LNM in PTC. Our study found that CEUS-based radiomics, as a promising quantitative analysis, provides incremental value to clinical prediction and management of LNM in PTC. In addition, the developed clinical-radiomics nomogram demonstrated promising value for predicting LNM. It may be an effective, noninvasive tool for preoperative prediction of LNM in clinical use.

Abstract
This study aimed to establish a new clinical-radiomics nomogram based on ultrasound (US) for cervical lymph node metastasis (LNM) in papillary thyroid carcinoma (PTC). We enrolled 211 patients with PTC between June 2018 and April 2020 and randomly divided them into a training set (n = 148) and a validation set (n = 63). A total of 837 radiomics features were extracted from B-mode ultrasound (BMUS) images and contrast-enhanced ultrasound (CEUS) images. The maximum relevance minimum redundancy (mRMR) algorithm, least absolute shrinkage and selection operator (LASSO) algorithm, and backward stepwise logistic regression (LR) were applied to select key features and establish a radiomics score (Radscore), including BMUS Radscore and CEUS Radscore. The clinical model and clinical-radiomics model were established using univariate analysis and multivariate backward stepwise LR. The clinical-radiomics model was finally presented as a clinical-radiomics nomogram, the performance of which was evaluated by receiver operating characteristic curves, the Hosmer–Lemeshow test, calibration curves, and decision curve analysis (DCA). The clinical-radiomics nomogram was constructed from four predictors: gender, age, US-reported LNM, and CEUS Radscore. The clinical-radiomics nomogram performed well in both the training set (AUC = 0.820) and the validation set (AUC = 0.814). The Hosmer–Lemeshow test and the calibration curves demonstrated good calibration. The DCA showed that the clinical-radiomics nomogram had satisfactory clinical utility. The clinical-radiomics nomogram constructed from CEUS Radscore and key clinical features can be used as an effective tool for individualized prediction of cervical LNM in PTC.


Introduction
Thyroid cancer (TC) ranks ninth in incidence among human malignancies worldwide [1], and papillary thyroid carcinoma (PTC) is the most common pathological type of TC, accounting for 80-90% of cases [2]. PTC often has a good prognosis and a low mortality rate [3]; however, some PTCs exhibit cervical lymph node metastasis (LNM),

Image Acquisition and Clinicoradiological Characteristics Collection
Ultrasonography was performed using a GE LOGIQ E9 color Doppler ultrasonic instrument with a 9L linear array probe (2-9 MHz). The US physician first performed a BMUS examination of the thyroid gland and saved the BMUS image of the largest long-axis section of the lesion, then switched to real-time CEUS mode. Next, the US physician asked the patient to breathe calmly and tried to keep the observation section unchanged. The contrast agent used was SonoVue (Bracco, Milan, Italy). The patient received a bolus injection of 2.4 mL of contrast agent through the antecubital vein, followed immediately by 5 mL of normal saline. The US physician observed the dynamic perfusion process of the lesion continuously and stored the dynamic images. A single frame at the time of peak CEUS enhancement was selected and stored. Finally, two images of each nodule (BMUS image and CEUS image) were exported in DICOM format.
A US report conclusion suggestive of "LNM" was recorded as US-reported LN status positive. Conclusions of "undetectable LN", "reactive hyperplastic lymph nodes", and "visible LN" in the absence of metastasis were recorded as US-reported LN status negative. According to the 2015 American Thyroid Association (ATA) guidelines [23], the suspicious US signs suggestive of cervical LNM included round shape (aspect ratio > 0.5), calcifications, cystic changes, hyperechogenicity, and peripheral blood flow signals. A patient was considered positive when one or more LNs met one or more of these five criteria.

Image Segmentation and Feature Extraction
ITK-SNAP software (open source software; http://www.itksnap.org, accessed on 7 August 2020) was used to segment the nodules, and the region of interest (ROI) was outlined along the contour of the targeted lesion. One US physician (reader1) performed all image segmentation. To assess interobserver reproducibility, 30 cases were randomly selected from all cases, and their images were segmented separately by two US physicians (reader1 and reader2). Next, the radiomics plug-in of 3D-Slicer software was used to extract features from the thyroid nodules. Before feature extraction, the images were normalized: they were resampled to a voxel size of 1 mm × 1 mm × 1 mm, and the bin width parameter in 3D-Slicer was set to 25 HU to discretize the voxel intensity. A total of 837 radiomics features were extracted from each BMUS image and each CEUS image, and the feature categories included first-order statistics, gray level dependence matrix (GLDM), gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), and neighborhood gray tone difference matrix (NGTDM).
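To make the intensity discretization step concrete, the following is a minimal sketch of fixed-bin-width discretization. It assumes the common convention in which the bin index is floor(x / w) - floor(min / w) + 1; whether the plug-in used in this study applies exactly this rule is an assumption, and the function name and sample intensities are hypothetical.

```python
import math


def discretize_fixed_bin_width(intensities, bin_width=25.0):
    """Map raw intensities to 1-based bin indices using a fixed bin width.

    Assumed convention: index = floor(x / w) - floor(min(x) / w) + 1,
    so the lowest intensity in the ROI always falls into bin 1.
    """
    if not intensities:
        return []
    lowest_bin = math.floor(min(intensities) / bin_width)
    return [math.floor(value / bin_width) - lowest_bin + 1 for value in intensities]


# Hypothetical ROI intensities, for illustration only.
roi = [3, 24, 25, 49, 50, 130]
print(discretize_fixed_bin_width(roi))  # [1, 1, 2, 2, 3, 6]
```

A fixed bin width keeps the meaning of one gray level constant across images, whereas a fixed bin count would rescale it per lesion; texture matrices such as GLCM and GLRLM are then computed on these bin indices.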

Feature Selection and Radiomics Score Construction
We divided the 211 patients into a training set (n = 148) and a validation set (n = 63) using stratified random sampling at a 7:3 ratio. The radiomics features in both sets were then z-score normalized using the mean and standard deviation of the training set.
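The normalization described above can be sketched as follows. The key point is that the mean and standard deviation are estimated on the training set only and then reused for the validation set, so no information from the validation set influences the scaling. The feature values are made up, and the use of the sample standard deviation is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def fit_zscore(train_values):
    """Estimate (mean, sd) on the training set only."""
    return mean(train_values), stdev(train_values)  # sample SD: an assumption


def apply_zscore(values, mu, sd):
    """Scale any set of values with previously fitted training statistics."""
    return [(v - mu) / sd for v in values]


# Hypothetical values of one radiomics feature.
train_feature = [2.0, 4.0, 6.0, 8.0]
valid_feature = [5.0, 9.0]

mu, sd = fit_zscore(train_feature)
train_z = apply_zscore(train_feature, mu, sd)
valid_z = apply_zscore(valid_feature, mu, sd)  # reuses training mu and sd
print(train_z, valid_z)
```

The validation features will generally not have mean 0 and standard deviation 1 after this step, which is expected.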
The process of radiomics feature selection and radiomics score construction was as follows. First, we calculated the intraclass correlation coefficient (ICC) from the radiomics features extracted after image segmentation by the two US physicians, and highly reproducible features (ICC > 0.75) were retained. Then the redundant and irrelevant features were removed using the maximum relevance minimum redundancy (mRMR) algorithm, and the top 30 features from each image were selected. Next, the radiomics features associated with LNM were obtained using the least absolute shrinkage and selection operator (LASSO) algorithm. Finally, a backward stepwise logistic regression (LR) with the Akaike information criterion (AIC) was used to select the features constituting the logistic regression model, and the model score was the radiomics score (Radscore). Following this process, we obtained the BMUS Radscore and the CEUS Radscore. A Mann-Whitney U test was then used to assess the association of each Radscore with LNM.
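The first step, the reproducibility filter, can be illustrated with a short sketch. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here; the feature names and reader values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per case; each row has one value per reader.
    """
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-case mean square
    ms_cols = ss_cols / (k - 1)                 # between-reader mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical feature values: rows are cases, columns are reader1 and reader2.
features = {
    "feat_a": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]],
    "feat_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.5], [4.0, 1.5]],
}
kept = [name for name, values in features.items() if icc_2_1(values) > 0.75]
print(kept)  # ['feat_a']
```

Only features that the two readers reproduce closely pass the threshold; the later mRMR, LASSO, and stepwise steps then operate on this reduced set.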

Development of the Clinical Model and the Clinical-Radiomics Nomogram
We constructed the model based on the training set and subsequently applied the model in the validation set to test its performance. In the training set, first, we performed a univariate analysis of clinical parameters (including demographic parameters and ultrasound features) and two Radscores. Stepwise multivariate LR analysis was then performed to develop a clinical model using clinical risk factors with p-value < 0.05 in the univariate analysis as candidate predictors.
Clinical risk factors and two Radscores were introduced into multivariate LR to build the clinical-radiomics combined model. A backward stepwise selection process with the AIC as the stopping rule was performed. A nomogram based on the clinical-radiomics model was drawn to visualize the logistic regression model for individualized assessment of patients' risk of cervical LNM.
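A nomogram is a graphical form of the underlying logistic regression, so an individual's predicted risk can also be computed directly from the linear predictor. The sketch below shows that calculation for the four predictors retained in the combined model. All coefficient values are hypothetical placeholders chosen for illustration; they are not the fitted values from this study.

```python
import math

# HYPOTHETICAL coefficients for illustration only; NOT the study's fitted model.
COEFFS = {
    "intercept": -1.0,
    "male": 0.8,
    "age_lt_55": 0.9,
    "us_ln_positive": 1.5,
    "ceus_radscore": 1.0,
}


def predicted_risk(male, age_lt_55, us_ln_positive, ceus_radscore, coeffs=COEFFS):
    """Return the logistic-model probability of cervical LNM.

    Binary predictors are coded 0/1; the Radscore is a continuous value.
    """
    linear_predictor = (
        coeffs["intercept"]
        + coeffs["male"] * male
        + coeffs["age_lt_55"] * age_lt_55
        + coeffs["us_ln_positive"] * us_ln_positive
        + coeffs["ceus_radscore"] * ceus_radscore
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Example: a male patient under 55 with a negative US report and Radscore 0.2.
print(round(predicted_risk(1, 1, 0, 0.2), 3))
```

On a nomogram, each term of the linear predictor is rescaled to a points axis and the total points map back to this same probability.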

Model Validation
We plotted the receiver operating characteristic (ROC) curves and evaluated the predictive ability of the clinical-radiomics nomogram by the area under the ROC curve (AUC). Comparisons between the clinical model and the clinical-radiomics model were made using the integrated discrimination improvement (IDI) index.
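For readers less familiar with these two discrimination measures, the sketch below computes both from predicted probabilities. The AUC is written as the probability that a randomly chosen LNM-positive patient receives a higher predicted risk than a randomly chosen LNM-negative patient, and the IDI as the gain in mean predicted risk among events minus the gain among non-events. The labels and probabilities are made-up toy values, and the confidence intervals and p-values reported in the paper are not reproduced here.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def idi(labels, old_probs, new_probs):
    """Integrated discrimination improvement of the new model over the old one."""

    def mean_in_group(probs, group):
        selected = [p for y, p in zip(labels, probs) if y == group]
        return sum(selected) / len(selected)

    gain_events = mean_in_group(new_probs, 1) - mean_in_group(old_probs, 1)
    gain_nonevents = mean_in_group(new_probs, 0) - mean_in_group(old_probs, 0)
    return gain_events - gain_nonevents


# Toy data: 1 = LNM positive, 0 = LNM negative.
labels = [1, 1, 0, 0]
clinical_probs = [0.6, 0.4, 0.5, 0.3]   # hypothetical clinical model
combined_probs = [0.8, 0.6, 0.4, 0.2]   # hypothetical clinical-radiomics model

print(auc(labels, clinical_probs))   # 0.75
print(auc(labels, combined_probs))   # 1.0
print(round(idi(labels, clinical_probs, combined_probs), 2))  # 0.3
```

A positive IDI means the new model pushes predicted risks upward for patients with LNM, downward for patients without LNM, or both.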
We used the Hosmer-Lemeshow test and calibration curve to assess the calibration performance of the clinical-radiomics nomogram and used decision curve analysis (DCA) to assess the clinical utility of the clinical-radiomics nomogram by estimating the net benefit of the training set at each threshold probability.
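The net benefit used in decision curve analysis can be written out in a few lines. At a threshold probability pt, net benefit is TP/n - FP/n × pt/(1 - pt), where patients with predicted risk at or above pt are treated; the treat-none strategy has net benefit 0 by definition. The sketch below uses made-up labels and probabilities and hypothetical function names.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 1 = LNM positive, 0 = LNM negative.
labels = [1, 1, 0, 0]
probs = [0.8, 0.6, 0.4, 0.2]

for pt in (0.3, 0.5):
    print(pt, net_benefit(labels, probs, pt), net_benefit_treat_all(labels, pt))
```

A decision curve plots these quantities over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all value and 0.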

Statistical Analysis
R software and associated packages were used for statistical analyses. Quantitative data were presented as mean ± standard deviation or median (interquartile range). The t-test or Mann-Whitney U test was used to compare the measurement data between the two groups, and the Chi-square test or Fisher's exact test was used to compare the enumeration data between the two groups. A p-value < 0.05 was considered statistically significant.

Clinicoradiological Characteristics
The study flowchart and radiomics workflow are shown in Figure 1. This study included 211 patients with solitary PTC, 88 of whom had positive cervical LNM results and 123 of whom had negative cervical LNM results. The patients were randomly divided in a 7:3 ratio, with 148 cases allocated to the training set and 63 cases allocated to the validation set. The positive rate of cervical LNM was 39.9% and 46.0% in the training and validation sets, respectively, with no statistically significant difference (p = 0.406). The clinical features of the patients in the training and validation sets are listed in Table 1. There was no statistically significant difference in clinical characteristics between the training and validation sets (p > 0.05 for all), indicating that the baseline data of the two sets were comparable. Table 2 shows the results of the univariate analysis between cervical LNM and the candidate variables in the two sets. Age, tumor size, and US-reported LN status were associated with LNM in both the training and validation sets (p < 0.05). Primary site, echogenicity, margin, microcalcification, and enhancement patterns were not associated with LNM in either set (p > 0.05). In the training set, males were more likely to have LNM (p < 0.05), but gender was not associated with LNM in the validation set (p > 0.05). In the validation set, tumors in a sub-capsular location were more likely to have LNM (p < 0.05), but tumor location was not associated with LNM in the training set (p > 0.05).

Radiomics Score Building
After removing the less stable features with ICC ≤ 0.75, 768 and 775 features were kept from the BMUS and CEUS images of each patient, respectively; 30 features were then retained for each image type by the mRMR algorithm. After the LASSO regression (Figure 2), 2 features from BMUS images and 10 features from CEUS images were selected. After the backward stepwise logistic regression analysis, 1 radiomics feature from the BMUS images and 5 radiomics features from the CEUS images were found to be associated with LNM and were used to construct the BMUS radiomics score and CEUS radiomics score, respectively. The formulas for BMUS Radscore and CEUS Radscore were as follows:

Figure 2. LASSO regression for the BMUS (a,b) and CEUS (c,d) features. (b,d) The 10-fold cross-validation and the minimum criteria process were used to generate the optimal penalization coefficient lambda (λ) in the BMUS and CEUS LASSO models. Dotted vertical lines are drawn by using the minimum criteria and 1 standard error of the minimum criteria. As a result, λ values of 0.08255607 and 0.05676918 were selected for the BMUS (b) and CEUS (d) features, respectively.

Model Building and Validation
The univariate results showed significant differences (p < 0.05) in age, gender, tumor size, and US-reported LN status between the LNM-positive and LNM-negative groups in the training set (Table 2). After backward stepwise multivariate logistic regression analysis, age < 55 years and US-reported LN status positive remained significant factors (p < 0.05) for LNM (Table 3).
Six factors, namely age, gender, tumor size, US-reported LN status, BMUS Radscore, and CEUS Radscore, were introduced into stepwise multivariate logistic regression. As a result, the clinical-radiomics combined model was constructed based on gender, age, US-reported LN status, and CEUS Radscore (Table 3). The clinical-radiomics model achieved AUCs of 0.820 and 0.814 in the training and validation set, respectively (Figure 3). We then compared the clinical model and the clinical-radiomics model: the IDI was 15.42% (9.15%-21.69%), p < 0.001 in the training set and 8.59% (0.91%-16.26%), p = 0.028 in the validation set, a notable improvement in discrimination for the clinical-radiomics model. This suggests that the addition of the CEUS Radscore improved LNM risk discrimination beyond the clinical model. We visualized the clinical-radiomics model using a clinical-radiomics nomogram (Figure 4A). The calibration curves (Figure 4B,C) and the Hosmer-Lemeshow test revealed no significant difference between the probability predicted by the clinical-radiomics nomogram and the actual probabilities (Hosmer-Lemeshow test: p-value = 0.569 in the training set; p-value = 0.558 in the validation set). The DCA (Figure 5) showed that a treatment plan based on the clinical-radiomics nomogram might be more beneficial than either the treat-all-patients strategy or the treat-none strategy, and the net benefit of the clinical-radiomics nomogram was higher than that of the clinical model across the majority of the range of threshold probabilities.

Discussion
Currently, surgery is the main treatment for PTC. However, it is controversial whether total thyroidectomy and prophylactic lymph node dissection provide substantial benefits for patients with PTC. Furthermore, demolitive surgical interventions may be accompanied by more severe complications, such as recurrent laryngeal nerve paralysis, cervical hematoma, and hypoparathyroidism [24,25]. Therefore, preoperative prognostic markers that assess the risk of cervical LNM in PTC are of great significance for avoiding overdiagnosis and improving prognosis. To this end, Vincenzo Marotta et al. found that germline VEGF-A single nucleotide polymorphisms (SNPs), obtained by peripheral blood testing, were stable and accessible prognostic markers for differentiated thyroid cancer (DTC; PTC accounts for 85% of DTC [23]) and constitute promising tools to enhance prognostic stratification of DTC [26]. In addition, Zhang et al. analyzed the BRAFV600E mutation in thyroid nodule samples collected by fine-needle aspiration (FNA) biopsy and found that the BRAFV600E mutation was an independent prognostic marker of central cervical LNM in PTC [27]. Unlike the aforementioned markers, radiomics is non-invasive, time-saving, and cost-effective, as supported by many recent studies [13,28-30].
In the current study, we developed and validated a clinical-radiomics nomogram that combines key clinical risk factors and CEUS radiomics features for the individualized prediction of LNM in PTC. Compared with the clinical model, the clinical-radiomics nomogram had better diagnostic efficacy for predicting LNM, with AUCs of 0.820 and 0.814 in the training set and validation set, respectively. Thus, our study suggests that the clinical-radiomics nomogram can be used to assess the risk of LNM in PTC patients preoperatively and non-invasively, and can provide a reference for individualized treatment planning.
According to the TNM staging system of the AJCC 8th edition [23], 55 years was used as the cut-off value for the age of patients with PTC in this study. Both univariate and multivariate analyses showed that age was significantly and negatively associated with LNM, and young age was an independent risk factor for cervical LNM in patients with PTC. This is consistent with previous findings [11,31]. Therefore, it is crucial to examine the LN status carefully in young patients with PTC during the preoperative US examination. Male PTC patients have been reported to have a higher likelihood of LNM than female PTC patients [32-34], which was also confirmed by our findings. This may be related to sex hormones [35]. After selection with the backward stepwise method, gender was kept in the final model. Both univariate and multivariate analyses showed that US-reported LN status was significantly associated with LNM, so US-reported LN status was included in the final prediction model. This is similar to other studies [18,19], showing that US-reported LN status is an important part of preoperative prediction. However, US has limitations in the assessment of cervical LNM. Central cervical LNM is easily missed because of its deep location and the overlying thyroid gland; some meta-analyses have shown that the sensitivity of ultrasound for the assessment of central cervical LNM is less than 35% [7,36]. In addition, US diagnosis is based on visual qualitative judgments that depend on the experience of the US physician. Therefore, complementary indicators are needed for a more efficient diagnosis.
In recent years, radiomics has become one of the hot research topics in medical imaging. Radiomics analysis can reduce the subjectivity of traditional medical image interpretation and convert medical imaging data into quantitative biomarkers through innovative computational methods [37]. US-based radiomics techniques have developed rapidly and have been applied to the differential diagnosis of tumors and the assessment of tumor aggressiveness, including malignant parotid gland lesions [38], breast cancer [39,40], and renal cell carcinoma [41]. With regard to PTC, Enock Adjei Agyekum et al. [42] reported that a radiomics model based on preoperative US images provided promising results in assessing cervical LNM in patients with PTC. Some studies have explored the further application of multimodal US radiomics [19,43,44]. Our preliminary study found that radiomics analysis of BMUS and CEUS images has good diagnostic efficacy for discriminating thyroid nodules, and that the diagnostic efficacy of the BMUS + CEUS radiomics model is superior, suggesting the potential application value of multimodal radiomics in distinguishing benign from malignant thyroid nodules [45]. CEUS can aid the preoperative prediction of LNM in PTC because an intravenously injected blood pool contrast agent shows tissue microperfusion [5]. In this study, we explored the combined use of BMUS and CEUS images for radiomics analysis in predicting LNM of PTC. Notably, in univariate analysis both the BMUS Radscore and the CEUS Radscore were significantly associated with LNM, but the BMUS Radscore did not enter the final clinical-radiomics model. This is similar to the results of Jiang et al. [19], in which the BMUS Radscore was excluded because of its insufficient predictive power for LNM. In our final multivariate stepwise logistic regression, the superior discriminatory power of the CEUS Radscore appeared to weaken the weight of the BMUS Radscore.
Previous studies have found that the CEUS enhancement pattern of tumors can help predict LNM and that hyper- or iso-enhancement can be an independent risk factor for LNM [12,46]. The CEUS enhancement pattern in our study was not related to LNM, probably because of differences in the data set and the small number of lesions with high enhancement in our study; in addition, enhancement intensity, as a qualitative characteristic judged by the naked eye, involves a certain degree of subjectivity. This does not affect our encouraging finding that quantitative radiomics analysis based on the CEUS image is strongly associated with LNM. In contrast to visual inspection of enhancement intensity and homogeneity, radiomics may be able to quantitatively decode important information about the heterogeneity of tumor microcirculation, which is associated with intratumoral perfusion, vascular permeability, and angiogenesis [47-49]. The CEUS Radscore included five radiomics features, and 80% of the selected radiomics features were wavelet-based features. The wavelet transform can reveal the hidden features of medical images at multiple scales [50,51], amplify the heterogeneous information of target tumor texture features, and enhance the discriminative ability [52]. We also noticed that most selected CEUS radiomics features characterize the spatial distribution of lesion voxels, suggesting that PTCs with higher vascular heterogeneity are prone to exhibit aggressive biological behavior. These radiomics features, which are hard to identify with the naked eye, have the potential to be non-invasive biomarkers for the preoperative prediction of cervical LN status in PTC.
Finally, we combined the radiomics score and key preoperative clinical features to create a clinical-radiomics model, and for clinical application, a nomogram was created as a visualization of the logistic regression model. The AUC and clinical benefit of the clinical-radiomics nomogram were higher than those of the clinical model. Combining key clinical features with the CEUS Radscore resulted in a significant improvement in IDI, demonstrating the incremental value of CEUS-based radiomics for preoperative clinical prediction of LN status.
The limitations of our study should be acknowledged. First, this was a retrospective single-center study, so a prospective multicenter study with a large sample size is needed for further improvement before practical application. Second, our radiomics analysis of the CEUS image was based on a single-frame image because of technological limitations, and much information might be missed compared with analysis of the entire perfusion process. Further research is needed into image processing and feature extraction of the dynamic image. Third, the radiomics analysis in this study was based on images of the primary tumors, and there are still few studies that establish a radiomics model based on LN sonograms for LNM prediction in PTC patients. Future research is needed to determine the feasibility and predictive value of radiomics analysis based on LN sonograms or on a combination of primary tumor and LN images.

Conclusions
In conclusion, we constructed a clinical-radiomics nomogram incorporating CEUS Radscore and key clinical features. It demonstrated favorable predictive ability for LNM in patients with PTC and can be used as an effective tool for individualized prediction of cervical LNM in PTC.