EOB-MR Based Radiomics Analysis to Assess Clinical Outcomes following Liver Resection in Colorectal Liver Metastases

Simple Summary
The aim of this study was to assess the efficacy of radiomics features obtained from EOB-MRI phase images in predicting clinical outcomes following liver resection in patients with colorectal liver metastases, and to evaluate recurrence, mutational status, pathological characteristics (mucinous type) and surgical resection margin. Our results confirmed the capacity of radiomics to identify, as biomarkers, several prognostic features that could affect the treatment choice in patients with liver metastases, supporting a more personalized approach. These results were confirmed on an external validation dataset. We obtained good performance with single significant textural metrics in the identification of the front of tumor growth (expansive versus infiltrative) and tumor budding (high grade versus low grade or absent), in the recognition of the mucinous type and in the detection of recurrences.

Abstract
The aim of this study was to assess the efficacy of radiomics features obtained from EOB-MRI phase images in predicting clinical outcomes following liver resection in patients with colorectal liver metastases, and to evaluate recurrence, mutational status, pathological characteristics (mucinous type) and surgical resection margin. This retrospective analysis was approved by the local Ethical Committee board of the National Cancer Institute of Naples, IRCCS "Fondazione Pascale". Radiological databases were interrogated from January 2018 to May 2021 in order to select patients with pathologically proven liver metastases and an EOB-MRI study in the pre-surgical setting. The cohort of patients included a training set (51 patients with a median age of 61 years and 121 liver metastases) and an external validation set (30 patients with a single lesion and a median age of 60 years). For each volume of interest segmented by two expert radiologists, 851 radiomics features were extracted as median values using PyRadiomics.
Non-parametric tests, intraclass correlation, receiver operating characteristic (ROC) analysis, linear regression modelling and pattern recognition methods (support vector machine (SVM), k-nearest neighbors (KNN), artificial neural network (NNET), and decision tree (DT)) were considered. The best predictor to discriminate expansive versus infiltrative front of tumor growth was HLH_glcm_MaximumProbability extracted on VIBE_FA30, with an accuracy of 84%, a sensitivity of 83%, and a specificity of 82%. The best predictor to discriminate tumor budding was Inverse Variance obtained from the original GLCM matrix extracted on VIBE_FA30, with an accuracy of 89%, a sensitivity of 96% and a specificity of 65%. The best predictor to differentiate the mucinous type of tumor was HHL_glszm_ZoneVariance extracted on VIBE_FA30, with an accuracy of 85%, a sensitivity of 46% and a specificity of 95%. The best predictor to identify tumor recurrence was LHL_glcm_Correlation extracted on VIBE_FA30, with an accuracy of 86%, a sensitivity of 52% and a specificity of 97%. The best linear regression model was obtained in the identification of the tumor growth front considering the eight significant textural metrics extracted from VIBE_FA10 (accuracy of 89%, sensitivity of 93% and specificity of 82%). Among the pattern recognition approaches tested on the significant texture metrics, the best performance was reached by a KNN in the identification of recurrence with the 3 significant textural features extracted from VIBE_FA10 (AUC of 91%, accuracy of 93%, sensitivity of 99% and specificity of 77%). Our results confirmed the capacity of radiomics to identify, as biomarkers, several prognostic features that could affect the treatment choice in patients with liver metastases, supporting a more personalized approach.


Introduction
Radiomics is a rapidly evolving field of research concerned with the extraction of quantitative metrics, the so-called radiomics features, from medical images. Radiomics features capture tissue and lesion characteristics such as heterogeneity and shape and may, alone or in combination with demographic, histologic, genomic, or proteomic data, be used for clinical problem solving. In oncology, the assessment of tissue heterogeneity is of particular interest; genomic analyses have demonstrated that the degree of tumor heterogeneity is a prognostic determinant of survival and an obstacle to cancer control. Studies have demonstrated that radiomics features are strongly correlated with heterogeneity indices at the cellular level [1][2][3][4][5][6][7][8]. Therefore, radiomics could support cancer detection, diagnosis, evaluation of prognosis and response to treatment, as well as the monitoring of disease status [9][10][11][12][13][14]. Because it uses standard-of-care images that are routinely obtained in the clinical setting, radiomics analysis is a cost-effective and highly feasible tool for clinical decision support, providing prognostic and/or predictive biomarkers and enabling fast, low-cost, and repeatable longitudinal monitoring [15][16][17][18][19][20]. Even though individual features may correlate with genomic data (so-called radiogenomics) or with clinical outcomes, the impact of radiomics is increased when the data are processed using machine learning techniques. To date, several studies have assessed the role of radiogenomics in hepatocellular carcinoma, but only a few have examined liver metastases [1][2][3].
During the work-up of patients with liver metastases, imaging plays an important role, since it enables one to estimate the number and sites of lesions, to assess resectability, and to evaluate the response to treatment and drug toxicities [21][22][23][24][25]. Although computed tomography (CT) is routinely used for primary staging and disease surveillance, magnetic resonance imaging (MRI) is a valuable diagnostic technique in oncologic settings since it allows one to assess both morphological and functional data [21][22][23][24]. Moreover, several liver-specific contrast agents have been introduced to improve lesion detection and characterization. Gadobenate dimeglumine (Gd-BOPTA) and gadolinium ethoxybenzyl diethylenetriamine pentaacetic acid (Gd-EOB-DTPA) provide data on lesion vascularization during the different phases of the contrast study and functional data in the delayed, hepatobiliary phase (EOB-phase).
In this context, the possibility of correlating radiomics parameters obtained from MRI studies with recurrence, mutational status, pathological characteristics (mucinous type and tumor budding), and surgical resection margin offers notable advantages over qualitative imaging assessment, allowing better patient selection for cancer therapy, prediction of treatment response, and discrimination of favorable subsets of patients from those with poor prognosis. In the present study, we assessed the efficacy of radiomics features obtained from EOB-MRI phase images in predicting clinical outcomes following liver resection in patients with colorectal liver metastases.

Dataset Characteristics
This study complied with the appropriate national guidelines and procedures. The Ethical Committee board of the National Cancer Institute of Naples approved this retrospective study and waived the need for informed patient consent given the nature of the study.
The radiological archive was searched from January 2018 to May 2021 in order to select patients with: (1) liver metastases with pathological proof; (2) an EOB-MRI study in the pre-surgical setting after neoadjuvant chemotherapy; (3) MR images of high quality; (4) a follow-up CT scan at least six months after surgery. The exclusion criteria were: (1) discordance between the imaging diagnosis and the pathological one; (2) no EOB-MRI phase studies; (3) no high-quality MR images.
The cohort of patients included a training set and an external validation set. The internal training set consisted of 51 patients (33 women and 18 men) with a median age of 61 years (range 35-82 years) and 121 liver metastases. The validation cohort consisted of 30 patients with a single lesion (10 women and 20 men) with a median age of 60 years (range 40-78 years). The external validation patient dataset was provided by "Careggi Hospital", Florence, Italy.
Liver metastases identified by EOB-MRI during the period investigated in this retrospective study that were not pathologically confirmed were not included in the study.
The patient characteristics are summarized in Table 1.

MR Imaging Protocol
MR studies were performed with two 1.5 T MR scanners: a Magnetom Symphony (Siemens, Erlangen, Germany) and a Magnetom Aera (Siemens). The MR images were acquired before and after an intravenous (IV) contrast agent (CA) injection.
In this study, radiomics features were extracted from volumetric interpolated breath-hold examination (VIBE) T1-weighted SPAIR images acquired with controlled respiration after IV injection of a liver-specific CA (0.1 mL/kg of Gd-EOB-DTPA, Primovist, Bayer Schering Pharma, Berlin, Germany). The VIBE T1-W sequence was acquired with two different flip angles (10 and 30 degrees). A power injector (Spectris Solaris® EP MR, MEDRAD Inc., Indianola, IA, USA) was used to administer the CA at an infusion rate of 2 mL/s, as described in our previous studies [26,27]. Table 2 reports the MR sequence parameters.

Image Processing
Regions of interest (ROIs) were manually drawn slice-by-slice by two expert radiologists with 22 and 15 years of abdominal imaging experience, respectively, first separately and then together in consensus, annotating all of the slices of the lesions. The segmentation was performed on the arterial phase of the VIBE T1-W images for both sequences, acquired with flip angles of 10 and 30 degrees. The analysis was therefore performed on two sequences: VIBE_FA10 (VIBE T1-W images with a flip angle of 10°) and VIBE_FA30 (VIBE T1-W images with a flip angle of 30°). Manual definition of the ROIs was made using the segmentation tool of 3DSlicer (https://www.slicer.org/; accessed on 21 December 2021).

MRI Post-Processing with Pyradiomics Tool
For each volume of interest, 851 radiomics features were extracted as median values using the open-source PyRadiomics Python package [28].
We applied wavelet filtering (LLH, LHL, LHH, LLL, HLL, HLH, HHL, HHH, where H and L denote high-pass and low-pass filters along the X, Y and Z axes) to six different matrices. The extracted features comply with the feature definitions of the Imaging Biomarker Standardization Initiative (IBSI) [29] and are as reported in the PyRadiomics documentation (https://readthedocs.org/projects/pyradiomics/downloads/; accessed on 21 December 2021).
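As an illustration of how the eight wavelet sub-bands and the total of 851 features fit together, the following minimal Python sketch enumerates the sub-bands and rebuilds the count. The per-class feature counts (`CLASS_COUNTS`) are an assumption based on a default PyRadiomics configuration and are not stated in the text; the helper names are hypothetical.

```python
from itertools import product

# Each sub-band applies a low-pass (L) or high-pass (H) filter along each of
# the three image axes, giving 2**3 = 8 wavelet decompositions.
SUBBANDS = ["".join(p) for p in product("LH", repeat=3)]

# ASSUMPTION: per-class feature counts of a default PyRadiomics setup.
# These numbers are not given in the paper.
CLASS_COUNTS = {
    "shape": 14, "firstorder": 18, "glcm": 24, "glrlm": 16,
    "glszm": 16, "gldm": 14, "ngtdm": 5,
}


def total_feature_count(class_counts, subbands):
    """Features on the original image plus non-shape features per sub-band."""
    original = sum(class_counts.values())
    non_shape = original - class_counts["shape"]  # shape is filter-independent
    return original + non_shape * len(subbands)


def feature_name(subband, feature_class, feature):
    """Build a name in the style used in the text."""
    return f"{subband}_{feature_class}_{feature}"


print(SUBBANDS)
print(total_feature_count(CLASS_COUNTS, SUBBANDS))
print(feature_name("HLH", "glcm", "MaximumProbability"))
```

Under these assumed counts, the six non-shape feature classes computed on the original image and on each of the eight sub-bands, together with the shape features, add up to the 851 features reported.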
Median values of radiomics features were considered for each segmented volume of interest.
A graphical representation of the radiomics process and of the extracted features is shown in Figure 1. However, radiomics involves high-throughput 3D qualitative and quantitative extraction of digital imaging data, which cannot be fully represented as an image.

Radiomics analysis was performed blinded to the clinical and pathological data, in the pre-surgical setting after neoadjuvant chemotherapy. No registration techniques were applied to reduce motion artefacts; however, the use of the median value of the extracted metrics reduces the influence of artefacts.

Statistical Analysis
Statistical analysis included both univariate and multivariate approaches, performed on a per-lesion basis. The statistical analyses were performed using the Statistics and Machine Learning Toolbox of MATLAB R2021b (MathWorks, Natick, MA, USA).

Univariate Analysis
The assessment of observer variability was performed by calculating the intraclass correlation coefficient.
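The text does not state which form of the intraclass correlation coefficient was used. As a hedged illustration only, the sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rater) for one feature measured by several readers; the choice of this form and the function name `icc_2_1` are assumptions, and the original analysis was run in MATLAB.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per lesion, one column per reader.
    """
    n = len(ratings)      # number of lesions
    k = len(ratings[0])   # number of readers
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-lesion mean square
    msc = ss_cols / (k - 1)                # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Because this form measures absolute agreement, a constant offset between the two readers lowers the coefficient even when their rankings of the lesions coincide.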
A non-parametric Kruskal-Wallis test was performed to identify statistically significant differences in clinical parameters and radiomics metrics between the two groups of each outcome (front of tumor growth: expansive versus infiltrative; tumor budding: high grade versus low grade or absent; mucinous type; and presence of recurrence).
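For readers who want to see the mechanics of the test, the following is a minimal pure-Python sketch of the Kruskal-Wallis statistic for two groups, with mid-ranks for ties and the chi-square approximation with one degree of freedom. It is not the authors' MATLAB code; the function name is hypothetical, and the sketch does not handle the degenerate case in which all pooled values are identical.

```python
import math


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for the Kruskal-Wallis test on two samples."""
    # Pool the values, remembering which group each one came from
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2      # average of ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += mid_rank
        t = j - i                        # size of this tie group
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n * (n + 1)) * sum(
        rs ** 2 / len(grp) for rs, grp in zip(rank_sums, (a, b))
    ) - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)     # tie correction
    h = max(h, 0.0)                      # guard against rounding below zero
    p = math.erfc(math.sqrt(h / 2))      # chi-square survival function, df = 1
    return h, p
```

A feature would then be retained as significant for a given outcome when the returned p value is below 0.05.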
Receiver operating characteristic (ROC) analysis was performed using the Youden index to calculate the optimal cut-off for each metric and then the area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy.
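The cut-off selection described above can be sketched as follows. This is an illustrative Python version, not the MATLAB routine used in the study; the function name `roc_summary` is hypothetical, and the sketch assumes that higher feature values indicate the positive class.

```python
def roc_summary(scores, labels):
    """Youden-optimal cut-off and the resulting diagnostic metrics.

    scores: feature values; labels: 1 for the positive class, 0 otherwise.
    ASSUMPTION: a lesion is called positive when score >= cut-off.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # AUC as the probability that a positive outranks a negative (ties = 1/2)
    auc = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    auc /= len(pos) * len(neg)

    best = None
    for thr in sorted(set(scores)):
        tp = sum(s >= thr for s in pos)
        fp = sum(s >= thr for s in neg)
        fn, tn = len(pos) - tp, len(neg) - fp
        sens, spec = tp / len(pos), tn / len(neg)
        j = sens + spec - 1              # Youden index
        if best is None or j > best["youden_j"]:
            best = {
                "cutoff": thr, "youden_j": j,
                "sensitivity": sens, "specificity": spec,
                "ppv": tp / (tp + fp),
                "npv": tn / (tn + fn) if tn + fn else float("nan"),
                "accuracy": (tp + tn) / len(scores),
            }
    best["auc"] = auc
    return best
```

When several cut-offs share the same Youden index, this sketch keeps the lowest one, which favors sensitivity over specificity; other tie-breaking rules are equally defensible.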
A p value < 0.05 was considered as significant.

Multivariate Analysis
A multivariate analysis was performed in order to identify the combinations of variables which best predict the outcomes: (1) front of tumor growth: expansive versus infiltrative; (2) tumor budding: high grade versus low grade or absent; (3) mucinous type; (4) presence of recurrence.
Given the high number of textural features, a first selection of variables was made based on the results of the univariate analysis: only features that were significant at the non-parametric Kruskal-Wallis test and had an accuracy ≥ 75% were retained. Linear regression modelling was used to identify the best linear combination of significant textural features for each outcome. ROC analysis with the Youden index was then applied to the output of the linear model to identify the optimal cut-off value, at which we reported accuracy, sensitivity, specificity, PPV and NPV.
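The two steps of this paragraph, univariate filtering followed by a linear combination of the retained features, can be sketched in Python as below. This is a simplified illustration under stated assumptions: ordinary least squares with an intercept solved through the normal equations, hypothetical function names, and no handling of collinear features. It is not the MATLAB implementation used in the study.

```python
def select_features(univariate, alpha=0.05, min_accuracy=0.75):
    """Keep features that are significant and reach the accuracy threshold.

    univariate: dict mapping feature name -> (p_value, accuracy).
    """
    return [name for name, (p, acc) in univariate.items()
            if p < alpha and acc >= min_accuracy]


def fit_linear(rows, targets):
    """Ordinary least squares with intercept, via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    m = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(m)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(m):
        pivot = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(m):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][m] / A[i][i] for i in range(m)]


def linear_score(coefs, row):
    """Apply the fitted model: intercept plus weighted sum of features."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))
```

The scores returned by `linear_score` for each lesion would then be passed to a ROC analysis to pick the Youden-optimal cut-off.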
Pattern recognition techniques including support vector machine (SVM), k-nearest neighbors (KNN), artificial neural network (NNET), and decision tree (DT) were used to calculate the diagnostic performance considering all of the features and/or a subset of features after a feature selection approach [30]. The best model was identified as the one with the highest area under the ROC curve and the highest accuracy. Each classifier was trained with 10-fold cross-validation, and the median values of AUC, accuracy, sensitivity, and specificity across folds were calculated. Moreover, an external validation cohort was used to validate the findings of the best classifier.
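To make the cross-validation scheme concrete, the following is a minimal pure-Python sketch of a KNN classifier evaluated with 10-fold cross-validation and summarized by the median per-fold accuracy. The number of neighbors (`k=3`), the Euclidean distance, the unstratified fold assignment and the function names are all assumptions for illustration; the text does not report these settings, and the study used MATLAB.

```python
import random
from statistics import median


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(tx, x)), ty)
        for tx, ty in zip(train_x, train_y)
    )[:k]
    votes = sum(label for _, label in nearest)   # labels are 0 or 1
    return 1 if 2 * votes > k else 0


def cross_validate(xs, ys, folds=10, k=3, seed=0):
    """Return the median per-fold accuracy of a KNN classifier."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)             # reproducible fold assignment
    accuracies = []
    for f in range(folds):
        test = idx[f::folds]                     # every folds-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        correct = sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test)
        accuracies.append(correct / len(test))
    return median(accuracies)
```

The same loop could collect AUC, sensitivity and specificity per fold; only accuracy is shown here to keep the sketch short. Note that with a per-lesion analysis, lesions from the same patient can fall in both training and test folds unless folds are assigned per patient.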

Univariate Analysis Findings
The median value of the intraclass correlation coefficients for the features was 0.92 (range 0.86-0.96). The size of the lesion did not affect the extracted metrics (p-value > 0.05 at the Kruskal-Wallis test performed between the 2 groups: lesions < 2 cm and lesions ≥ 2 cm). In addition, the RAS mutational status did not affect the extracted metrics (p-value > 0.05 at the Kruskal-Wallis test performed between the groups). Therefore, since the two groups were homogeneous with respect to the extracted radiomics metrics, RAS mutational status was not considered in the subsequent analysis.
There were no differences between the extracted radiomics metrics on VIBE_FA10 and on VIBE_FA30 (p-value > 0.05 at the Kruskal-Wallis test).
Among the significant features to differentiate the front of tumor growth on VIBE_FA10, 8 textural parameters obtained an accuracy ≥ 75%. Among these 8 features, the best predictor to discriminate expansive versus infiltrative front of tumor growth was HHL_glcm_MaximumProbability with an accuracy of 81%, a sensitivity of 92%, a specificity of 62%, and a PPV and an NPV of 80% and 82%, respectively.
Among the significant features to differentiate the front of tumor growth on VIBE_FA30, 15 textural parameters obtained an accuracy ≥ 75%. Among these 15 features, the best predictor to discriminate expansive versus infiltrative front of tumor growth was HLH_glcm_MaximumProbability (the same feature as in the previous case, obtained with the HLH wavelet filter instead of HHL) with an accuracy of 84%, a sensitivity of 83%, a specificity of 82%, and a PPV and an NPV of 89% and 74%, respectively.
Significant radiomics metrics for each outcome at univariate analysis are reported in Table 3.
Among the significant features to differentiate the tumor budding on VIBE_FA10, 8 textural parameters obtained an accuracy ≥ 85%. Among these 8 features, the best predictor to discriminate tumor budding was again HHL_glcm_MaximumProbability with an accuracy of 88%, a sensitivity of 94%, a specificity of 68%, and a PPV and an NPV of 89% and 81%, respectively.
Among the significant features to differentiate the tumor budding on VIBE_FA30, 16 textural parameters obtained an accuracy ≥ 85%. Among these 16 features, the best predictor to discriminate tumor budding was Inverse Variance obtained by the original GLCM matrix with an accuracy of 89%, a sensitivity of 96%, a specificity of 65%, a PPV and a NPV of 89% and 83%, respectively.
Among the significant features to differentiate the mucinous type of tumor on VIBE_FA10, 8 textural parameters obtained an accuracy ≥ 80%. Among these 8 features, the best predictor to differentiate the mucinous type of tumor was the HHH_ngtdm_Busyness with an accuracy of 84%, a sensitivity of 65%, a specificity of 42%, a PPV and a NPV of 69% and 86%, respectively.
Among the significant features to differentiate the mucinous type of tumor on VIBE_FA30, 12 textural parameters obtained an accuracy ≥ 80%. Among these 12 features, the best predictor to differentiate the mucinous type of tumor was the HHL_glszm_ZoneVariance with an accuracy of 85%, a sensitivity of 46%, a specificity of 95%, a PPV and a NPV of 71% and 87%, respectively.
Among the significant features to identify tumor recurrence on VIBE_FA10, 3 textural parameters obtained an accuracy ≥ 80%. Among these 3 features, the best predictor to identify tumor recurrence was LLH_glrlm_ShortRunEmphasis with an accuracy of 85%, a sensitivity of 31%, a specificity of 100%, and a PPV and an NPV of 100% and 84%, respectively.
Among the significant features to identify tumor recurrence on VIBE_FA30, 8 textural parameters obtained an accuracy ≥ 80%. Among these 8 features, the best predictor to identify tumor recurrence was the LHL_glcm_Correlation with an accuracy of 86%, a sensitivity of 52%, a specificity of 97%, a PPV and a NPV of 84% and 85%, respectively.
In total, 26 features extracted from VIBE_FA10 were significant at univariate analysis, while 48 were significant among those extracted from VIBE_FA30. Figure 2 shows a heat map.

Linear Regression Analysis Findings
Linear regression models obtained good results in each classification problem considered (1. front of tumor growth: expansive versus infiltrative; 2. tumor budding: high grade versus low grade or absent; 3. mucinous type; 4. presence of recurrence), with an accuracy in the range of 72 to 89% (Tables 4 and 5, Figures 3 and 4). The best linear regression model was obtained in the identification of the front of tumor growth considering the eight significant textural metrics extracted from VIBE_FA10 (AUC of 72%, accuracy of 89%, sensitivity of 93% and specificity of 82%). The coefficients of this linear model are reported in Table 6.

Pattern Recognition Approaches Findings
Considering the significant texture metrics tested with pattern recognition approaches, the best performance for each outcome (1. front of tumor growth: expansive versus infiltrative; 2. tumor budding: high grade versus low grade or absent; 3. mucinous type; 4. presence of recurrence) was reached by a KNN, considering the features extracted from both VIBE_FA10 and VIBE_FA30. The accuracy was always greater than 88% (Tables 4 and 5, Figures 5 and 6) on both the training and validation sets, and the best result was obtained in the identification of recurrence with the 3 significant textural features extracted from VIBE_FA10 (AUC of 91%, accuracy of 93%, sensitivity of 99% and specificity of 77%).

Discussions
Our results confirmed the capacity of radiomics to identify, as biomarkers, several prognostic features that could affect the treatment choice in patients with liver metastases, supporting a more personalized approach. In fact, the possibility of correlating radiomics parameters with RAS status offers notable advantages over qualitative imaging assessment, allowing one to tailor cancer therapy to the patient, to predict response to treatment, to distinguish favorable subsets of patients from those with poor prognosis, and to select patients who may benefit from surgical treatment. Literature data underline the role of several features, such as RAS mutation, front of tumor growth, tumor budding and mucinous type, as strong prognostic and predictive biomarkers in patients undergoing targeted therapy or surgical resection. In this scenario, our results support the use of radiomics for these purposes [5,6].
Our results were confirmed on an external validation dataset. We obtained good performance with single significant textural metrics in the identification of the front of tumor growth (expansive versus infiltrative) and tumor budding (high grade versus low grade or absent), in the recognition of the mucinous type and in the detection of recurrences. The median value of the intraclass correlation coefficients for the features was 0.92.
With regard to the front of tumor growth on VIBE_FA10, the best performance was obtained with HHL_glcm_MaximumProbability, with an accuracy of 81%, while on VIBE_FA30 the best performance was obtained with HLH_glcm_MaximumProbability (the same feature as in the previous case, obtained with the HLH wavelet filter instead of HHL), with an accuracy of 84%. Among the significant features to differentiate tumor budding on VIBE_FA10, the best predictor was again HHL_glcm_MaximumProbability, with an accuracy of 88%, while on VIBE_FA30 the best performance was obtained with the Inverse Variance extracted from the original GLCM matrix, with an accuracy of 89%.
With regard to differentiating the mucinous type of tumor on VIBE_FA10, the best predictor was HHH_ngtdm_Busyness, with an accuracy of 84%, while on VIBE_FA30 the best performance was obtained with HHL_glszm_ZoneVariance, with an accuracy of 85%.
Among the significant features to identify tumor recurrence on VIBE_FA10, the best performance was obtained by the LLH_glrlm_ShortRunEmphasis with an accuracy of 85% while on VIBE_FA30, the best predictor was the LHL_glcm_Correlation with an accuracy of 86%.
Therefore, all of the best significant predictors for each outcome, except the Inverse Variance obtained from the original GLCM matrix, were higher-order statistical features obtained after wavelet transform. However, all of these metrics capture certain statistical regularities of tumor lesions in the images, linked to the heterogeneity of gray levels in the segmented volume of interest.
Using linear regression models or pattern recognition classifiers in a multivariate approach, it was possible to increase the performance in terms of accuracy, sensitivity, and specificity. The best linear regression model was obtained in the identification of the front of tumor growth considering the eight significant textural metrics extracted from VIBE_FA10 (AUC of 72%, accuracy of 89%, sensitivity of 93% and specificity of 82%), while the best result with a KNN was obtained in the identification of recurrence with the 3 significant textural features (AUC of 91%, accuracy of 93%, sensitivity of 99% and specificity of 77%).
Radiomics and radiogenomics are emerging fields with important weaknesses that need to be taken into account. The main limitation is the heterogeneity of analysis software and the variety of metrics used in different hospitals. In addition, the segmentation of the lesion may affect the results [41].
The present study had several limitations: (1) the small population size, although the analysis was conducted on a homogeneous sample and on all individual lesions; (2) the retrospective nature of the study; (3) the manual segmentation: although research has supported automatic segmentation to avoid inter-observer variability, in our opinion the manual approach is more realistic. Moreover, we did not assess the impact of contrast administration and of the different phases of the contrast study (arterial, portal and transitional phase) with respect to the EOB-phase, which we are evaluating for a future paper. However, we did evaluate the impact of different flip angles (10° and 30°). Additionally, we did not evaluate the impact of chemotherapy on our data.

Conclusions
Our results confirmed the capacity of radiomics to identify, as biomarkers, several prognostic features that could affect the treatment choice in patients with liver metastases, supporting a more personalized approach. These results were confirmed on an external validation dataset. We obtained good performance with single significant textural metrics in the identification of the front of tumor growth (expansive versus infiltrative) and tumor budding (high grade versus low grade or absent), in the recognition of the mucinous type and in the detection of recurrences.