Radiomics Analysis on Gadoxetate Disodium-Enhanced MRI Predicts Response to Transarterial Embolization in Patients with HCC

Objectives: To explore the potential of radiomics on gadoxetate disodium-enhanced MRI for predicting hepatocellular carcinoma (HCC) response after transarterial embolization (TAE). Methods: This retrospective study included cirrhotic patients treated with TAE for unifocal HCC naïve to treatments. Each patient underwent gadoxetate disodium-enhanced MRI. Radiomics analysis was performed by segmenting the lesions on portal venous (PVP), 3-min transitional, and 20-min hepatobiliary (HBP) phases. Clinical data, laboratory variables, and qualitative features based on LI-RADSv2018 were assessed. Reference standard was based on mRECIST response criteria. Two different radiomics models were constructed, a statistical model based on logistic regression with elastic net penalty (model 1) and a computational model based on a hybrid descriptive-inferential feature extraction method (model 2). Areas under the ROC curves (AUC) were calculated. Results: The final population included 51 patients with HCC (median size 20 mm). Complete and objective responses were obtained in 14 (27.4%) and 29 (56.9%) patients, respectively. Model 1 showed the highest performance on PVP for predicting objective response with an AUC of 0.733, sensitivity of 100%, and specificity of 40.0% in the test set. Model 2 demonstrated similar performances on PVP and HBP for predicting objective response, with an AUC of 0.791, sensitivity of 71.3%, specificity of 61.7% on PVP, and AUC of 0.790, sensitivity of 58.8%, and specificity of 90.1% on HBP. Conclusions: Radiomics models based on gadoxetate disodium-enhanced MRI can achieve good performance for predicting response of HCCs treated with TAE.


Introduction
Hepatocellular carcinoma (HCC) accounts for about 90% of all primary liver malignancies in cirrhotic patients [1]. Treatment options for HCC are based on tumor extent and severity of chronic liver disease [1]. In patients with localized HCC, ineligible for resection

Materials and Methods
This retrospective, single-institution study protocol was approved by the Ethical Committee of University Hospital of Palermo with a waiver for informed consent (N. 10/2020 Approval Date: 25 November 2020).

Population
A search through the clinical and radiological database at our Institution was conducted to select adult patients meeting the following inclusion criteria: (1) diagnosis of cirrhosis or chronic hepatitis B infection; (2) presence of unifocal HCC naïve to any prior treatment, in patients not candidate to surgical resection due to advanced liver disease; (3) gadoxetate disodium-enhanced MRI performed at our Institution; (4) underwent successful transarterial embolization between 2015 and 2020, within one month of the pretreatment gadoxetate disodium-enhanced MRI. Patients were excluded if they met any of the following criteria: (1) lack of pretreatment contrast-enhanced MRI exams (n = 56); (2) pretreatment contrast-enhanced MRI not acquired with gadoxetate disodium as contrast agent (n = 14). The flowchart of patients' accrual for this study is illustrated in Figure 1. The final study population consisted of 51 patients (37 males, 14 females, median age 73 years, range 44-85 years) with unifocal HCC and available pretreatment gadoxetate disodium-enhanced MRI. Patient-related clinical data and laboratory variables-i.e., age at the time of treatment, sex, laboratory and tumor markers, history of ascites or varices, and Child-Pugh score-were collected using the electronic data repository systems.

MRI Technique
MRI exams were acquired on two clinical 1.5-T MR scanners (n = 27 patients with Signa Excite, General Electric, Healthcare, Milwaukee, WI, USA; and n = 24 patients with Intera Achieva 1.5 Philips Healthcare, Best, The Netherlands) equipped with a 16-channel body phased-array coil. All patients underwent contrast-enhanced MRI with a dedicated liver protocol, in accordance with LI-RADS v2018 technical recommendations [23] which include the following sequences: axial T2-weighted turbo or fast spin-echo (with and without fat saturation) sequences, axial dual gradient-recalled echo T1-weighted sequence (in-phase and opposed-phase), and axial diffusion weighted imaging acquired with b values of 0, 150 and 800 s/mm 2 . Axial T1-weighted three-dimensional gradient-recalled echo sequences with fat suppression (Liver Acquisition with Volume Acceleration, LAVA, General Electric; or T1-weighted high-resolution isotropic volume examination, THRIVE, Philips Healthcare) were obtained before and after contrast agent administration. Detailed parameters of T1-weighted three-dimensional gradient-recalled echo sequences are reported in Supplementary Table S1. A weight-based dose of 0.025 mmol/Kg of gadoxetate disodium (Gd-EOB-DTPA, Primovist, Bayer Healthcare, Berlin, Germany) was injected at 1 mL/sec, followed by 20-mL saline flush at the same injection rate, using an automatic injector (Medrad ® Spectris Solaris ® EP, Bayer Healthcare, Berlin, Germany). Post-contrast phases were acquired on late hepatic arterial (12-14 s after the detection of contrast bolus), portal venous (PVP, 50-60 s after the detection of contrast bolus), transitional (at 3 min and 5 min), and hepatobiliary (HBP, 20 min) phases.

MRI Qualitative Analysis
Two radiologists (F.V. and R.C., with seven and six years of experience in liver imaging), blinded to the lesions' outcome, reviewed all the gadoxetate disodium-enhanced MRI exams in consensus using the LI-RADS v2018 diagnostic algorithm [24]. The radiologists recorded the presence of major features including size, nonrim arterial phase hyperenhancement (nonrim APHE), nonperipheral "washout" (evaluated on PVP only due to the injection of gadoxetate disodium), enhancing "capsule", threshold growth, and assigned a final category based on LI-RADS diagnostic table [24]. Additionally, ancillary features favoring malignancy (not HCC in particular or HCC in particular), and ancillary features favoring benignity were recorded as defined by LI-RADS algorithm [24]. Notably, due to the very low frequency of ancillary features favoring benignity in our cohort (only HBP isointensity detected in one observation) they were not included for further analysis.

Segmentation and Radiomics Feature Extraction
Lesion segmentation was performed using the research software Radiomics, version 1.2.2 (Siemens Healthineers, Forchheim, Germany), by the same radiologists in consensus, with an interval time of one month from qualitative analysis to avoid recall biases. Post-contrast axial T1-weighted three-dimensional gradient-recalled echo sequences acquired on PVP, 3 transitional, and HBP were used for lesion segmentation and radiomics feature extraction. T2-weighed and diffusion weighted imaging were already assessed in prior investigations [19,20]. The late hepatic arterial phase was not included due to the known possible respiratory-motion artifacts in the images acquired after the injection of gadoxetate disodium compared to other gadolinium-based contrast agents, and to limit the confounding factors associated with the contrast injection rate [25,26]. Selected sequences were anonymized and sent to a dedicated workstation equipped with a research radiomics software (Radiomics, version 1.2.2; Siemens Healthineers, Forchheim, Germany) [27]. A region of interest was manually drawn within the lesion margins, including the whole tumor visualized in consecutive slices ( Figure 2). The radiomic tool accesses the PyRadiomics package (Version 1.3.0) implemented in Python which calculates 854 radiomics features categorized into three main categories (i.e., intensity, shape, and texture radiomics features), including 18 first order histogram-based features, 24 gray-level co-occurrence matrix (GLCM), 14 gray level dependence matrix (GLDM), 16 grey-level run-length matrix (GLRLM), 16 greylevel zone length matrix (GLZLM), 5 neighboring gray tone difference matrix (NGTDM), 17 shape-based, and 743 wavelet-based radiomics features. A detailed description of the features can be found in the online documentation of PyRadiomics (https://pyradiomics.readthedocs.io/en/latest/features.html, accessed on 20 January 2022).

Lesions Outcome
Evaluation of treatment response at contrast-enhanced exams performed at one month after treatment was used as reference standard for this study. Treatment response was evaluated using the modified RECIST criteria (mRECIST) [28]. Patients were classified as complete response (disappearance of any intratumoral arterial enhancement), partial response (≥30% decrease of intratumoral enhancement of target lesion), stable disease (neither partial response or progression disease), and progressive disease (≥20% increase in size of target lesion) [28]. Patients with complete response and partial response were considered as objective response.

Statistical Analyses of Qualitative Features
Categorical variables were expressed as numbers and percentages. Continuous variables were reported using median and interquartile ranges (IQR) according to the Shapiro-Wilk normality test. Analysis of clinical variables and LI-RADS qualitative imaging features was conducted using the Pearson χ 2 or Fisher exact test to assess differences between categorical variables, and Mann-Whitney U test to assess differences between continuous variables. A p value < 0.05 was considered for statistical significance. Statistical analyses were performed using SPSS Software (Version 20.0. Armonk, NY, USA: IBM Corp.).

Radiomics Models Construction
Two different radiomics models were built, including a statistical model based on logistic regression with elastic net penalty (model 1) and a computational model based on an innovative hybrid descriptive-inferential feature extraction method that combines the point biserial correlation and the logistic regression (model 2) [29,30]. Each model was tested for the prediction of complete response and objective response in the analyzed phases (PVP, 3 transitional, and HBP). The models were constructed including the dataset of 854 radiomics features, the combined dataset of 12 clinical variables, and 19 LI-RADS qualitative imaging features. Predictive performances of the models were summarized by using receiver operating characteristics (ROC), areas under the ROC curve (AUC) with 95% confidence interval (95% CI), sensitivity, specificity, and accuracy.

Radiomics Model 1
For the logistic regression model, the dataset on each phase was randomly split into training set (80%) and test set (20%). For features selection, a logistic model with elastic net (e-net) penalty was fit on the training set. The tuning hyper-parameters alpha and lambda were found in the following way: a grid search of alpha values is carried out in the interval [0.5, 1] and, for each fixed alpha, the best lambda is found by a 5-fold cross-validation. The e-net is preferred to the LASSO in order to reduce the chance of selecting zero features. The interval [0.5, 1] is chosen as in this step we are more interested in feature selection than decorrelating the features.
For features decorrelation, two reduced training set and test set were created on the ground of the selected features. A logistic regression model with ridge penalty [31] was fit, while lambda is newly estimated by 5-fold cross-validation.
Statistical analyses were performed by a dedicated statistician (M.E. with 16 years of experience) by using the R packages glmnet [32] and islasso [33] at steps 1-2, respectively.

Radiomics Model 2
For the computational model, the point-biserial correlation coefficient (rpb) was calculated between each feature and the corresponding outcome in order to obtain the most discriminative features, avoiding overfitting problems and repetition of multiple features with high similarity or with incidental statistical significance [29,30]. The absolute value of rpb was used to sort the radiomics features in descending order. Next, a cycle was started to add one column at a time performing a logistic regression analysis. The p value of the current iteration was compared with the previous iteration. If this did not decrease, the cycle was interrupted. In this way, the features with valuable association with the outcome were identified. Subsequently, the Discriminant Analysis (DA) was used as a method for features classification [34] in order to overcome the unbalanced dataset issue which occurred in the case of complete response where 14 patients were compared with 37 patients (see Section 3.2). As reported in prior studies [35,36]. unbalanced datasets do not have a negative effect on DA performance. Furthermore, Model 2 was tested in a k-fold cross-validation fashion grouping data into training and test sets maintaining the same outcome percentage of the original dataset to preserve any data imbalances. In addition, the "k-fold" strategy was used in such a way as to guarantee disjointed test sets. Thus, the dataset was divided into equal k = 5 subsets, and the holdout method was repeated 5 times. In other words, each time a different fold never used during the training was left for test. Then, the performance of each k-model was averaged to make the results more robust, and to avoid optimistic results. The k = 5 was empirically determined through the trial-and-error method (k range: 5-15, step size of 5).
Computational analysis of radiomics features was conducted by a computer engineer (A.C., with 9 years of experience on computational data analysis).

Clinical and Qualitative Imaging Analysis
Differences in clinical and laboratory variables according to the response assessment are reported in Table 1. Only platelet count resulted significantly higher in patients with complete response compared to patients lacking complete response (median 115.1 × 10 3 /µL vs. 80.0 × 10 3 /µL; p = 0.021). No statistically significant differences were observed in patients with or without objective response. There were no statistically significant differences in frequency of LI-RADS major features and ancillary features favoring malignancy in HCCs according to treatment response ( Table 2).

Radiomics Model 1
The most discriminative features selected from clinical variables, LI-RADS qualitative imaging features, and radiomics-based features through the logistic regression with elastic net penalty are reported in Supplementary Table S2.
Performances of the statistical models tested by using the logistic regression are reported in Table 3 for the training set, and Table 4 for the test set. Corresponding ROC curves are illustrated in Figure 3 for the training set, and in Figure 4 for the test set. Model 1 on HBP showed the highest performance in the test set for predicting complete response, with an AUC of 1.000 (95% CI 1.000-1.000, p < 0.001), sensitivity of 100%, and specificity of 100%. Model 1 on PVP showed the highest performance in the test set for predicting objective response, with an AUC of 0.733 (95% CI 0.405-1.000, p = 0.163), sensitivity of 100%, and specificity of 40.0%.

Radiomics Model 2
The most discriminating features, selected from clinical variables, LI-RADS qualitative imaging features, and quantitative imaging-based features by the hybrid descriptiveinferential feature extraction method, are reported in Supplementary Table S3.
Performances of the models tested by using the discriminant analysis are shown in Table 5. Corresponding ROC curves are illustrated in Figure 5. The model based on the HBP showed the highest diagnostic performance for predicting complete response, with an AUC of 0.861 (95% CI 0.737-0.984, p = 0.010), sensitivity of 75.5%, and specificity of 82.8%. Models on PVP and HBP demonstrated similar performances for predicting objective response, with an AUC of 0.791 (95% CI 0.667-0.915, p = 0.002), sensitivity of 71.3%, and specificity of 61.7% on PVP, and AUC of 0.790 (95% CI 0.649-0.931, p = 0.031), sensitivity of 58.8%, and specificity of 90.1% on HBP.

Discussion
Our study tested the performance of statistical and computational models based on the combination of clinical data, qualitative LI-RADS-defined imaging features, and radiomics features extracted from PVP, 3 transitional phase, and HBP for the prediction of response after TAE in patients with unifocal HCC. Our results demonstrate that both models constructed through statistical and computational analyses had an almost perfect performance for the prediction of complete response using HBP images. For the prediction of objective response, the performances were higher on PVP and HBP when using the computational models, while performance of statistical-based models were lower in the test set. Notably, both models were mostly based on the selection of wavelet radiomics features, with most of the selected features being gray-level co-occurrence matrix. The differences in the final diagnostic performances may be related to the smaller number of patients included in the test set when testing the statistical models, especially when assessing complete response. Due to the large number of included features, the pointbiserial correlation coefficient was adopted in the computational model to select the most discriminative features and to avoid overfitting problems. Importantly, only platelet count was significantly different between patients with complete response compared to nonresponders, and none of the LI-RADS-based qualitative imaging features could predict response in our cohort.
Nowadays, there are no robust predictors of optimal response after transarterial treatments in patients with HCC. Optimal patients' selection for transarterial treatments is a key factor for prolonged survival outcomes [37]. Predictive models using only conventional clinical, biochemical, and radiological variables are not able to provide a reliable prediction of the complete radiological response in clinical practice [3,4]. This is concordant with our results, in which only platelet count was retained in the final combined models for the prediction of complete response. In clinical practice, indications for transarterial treatments are mostly based on HCC tumor stage, technical feasibility, and compensated hepatic function [37]. Nevertheless, intermediate stage HCC includes a wide spectrum of lesions with heterogeneous biological aggressiveness that cannot be anticipated by conventional radiological features. Radiomics has the potential to provide additional quantitative data that can be correlated with tumor biological aggressiveness [7]. Our study supports the applications of radiomics on gadoxetate disodium MRI for the assessment of patients before intraarterial treatments. The integration of radiomics features could be particularly relevant in clinical practice for selecting the optimal candidates to intraarterial treatments, while patients with radiomics features suggestive of poor response could benefit of early switching to alternative treatments, including the most recent systemic therapies with combination of immunotherapy and antiangiogenic drugs [38].
Few prior studies explored the value of MRI-based radiomics combined with clinical features for the prediction of tumor response after transarterial treatments [18][19][20][21][22]. Song et al. [18] built a nomogram for prediction of recurrence-free survival after TACE based on the combination of radiomics features of the entire tumor on PVP with extracellular contrast agent and clinical variables. Zhao et al. [21] presented a combined clinical-radiomics model for the prediction of objective response after TACE. In that study, the radiomics model on PVP on MRI with extracellular contrast agent showed the highest diagnostic performance (AUC of 0.830) in the validation cohort [21]. Kuang et al. [22] incorporated radiomics features on T2-weighted and arterial phase images in a nomogram to predict short-term response after TACE. Interestingly, similarly to our study, low platelet count was associated with poor response after treatment [22]. Our study proved that radiomics analysis may be used to predict response also on gadoxetate disodium-enhanced MRI. In our study the performance of final models including radiomics features was the highest on HBP images, with a sensitivity of 75.5% and specificity of 82.8% for the prediction of complete response, and a sensitivity of 58.8%, and specificity of 90.1% for the prediction of objective response, according to the computational analysis results and this is in line with prior literature demonstrating the role of HBP for HCC diagnosis [39][40][41]. Most importantly, no prior investigations compared the accuracy of radiomics-models against LI-RADS-based features. Interestingly, LI-RADS features seem to optimally predict recurrence-free survival and overall survival after curative surgical resection in patients with primary liver carcinomas, while there is scarce data on the predictive value of LI-RADS features in the context of locoregional treatments [42,43].
This single-center retrospective study has multiple limitations that need to be acknowledged. First of all, the study population was small, and the inclusion of only patients with single treatment-naïve HCC and available pre-treatment gadoxetate disodium-enhanced MRI acquired with the same protocol may have introduced selection biases. Further studies need to evaluate the performance of radiomics models in large multicentric cohorts before direct translation into clinical practice. Second, outcomes were based on response assessment using mRECIST criteria at one month, which is a moderate radiological surrogate of overall survival in patients undergoing transarterial treatments [44]. However, this remains the most widely used endpoint for treatment decisions. Third, not all the lesions were proven by pathology. The diagnosis of most HCCs treated in our study was made by categorization of observations as LR-5 using LI-RADSv2018, which means a 95% pooled proportion of lesions proven to be HCC at histopathological analysis [45]. Fourth, images were acquired using 1.5T MR scanners in our study; therefore, these results cannot be generalized to 3T MR scanners. We also did not evaluate T2-weighted sequences and diffusion weighted imaging, which were extensively investigated in prior studies [19,20]. Finally, HCC segmentation was performed manually. Although manual segmentation is still considered the reference standard for radiomics analysis, this is a time-consuming process that may be prone to inter-reader variability.

Conclusions
Radiomics models based on gadoxetate disodium-enhanced MRI can achieve a good performance for the prediction of response in HCCs treated with TAE. The performance of radiomics-based models for predicting objective response is the highest when evaluating the portal venous phase and hepatobiliary phase images.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/diagnostics12061308/s1, Supplementary Table S1: Acquisition parameters of post-contrast axial T1-weighted three-dimensional gradient-recalled echo sequences with fat suppression at 1.5T MRI scanners; Supplementary Table S2: Most discriminant features identified by using a first features selection performed identified by the statistical model by a logistic regression with elastic net penalty (radiomics model 1), while the final model was a logistic model with ridge penalty on the selected features in order to find an uncorrelated solution of clinical variables for reduction and selection of clinical variables, LI-RADS qualitative features, and radiomics features according to the post-treatment response; Supplementary Table S3 Informed Consent Statement: Patient consent was waived by the Ethical committee due to retrospective non-harmful nature of the study according to the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Data Availability Statement: Data will be made available by corresponding authors upon reasonable request.