An Interpretable Radiomics Model Based on Two-Dimensional Shear Wave Elastography for Predicting Symptomatic Post-Hepatectomy Liver Failure in Patients with Hepatocellular Carcinoma

Simple Summary
Two-dimensional shear wave elastography (2D-SWE) has demonstrated predictive value for symptomatic post-hepatectomy liver failure (PHLF) in hepatocellular carcinoma (HCC). Our aim was to develop and validate an interpretable radiomics model based on 2D-SWE for predicting symptomatic PHLF in patients undergoing liver resection for HCC. We proposed an interpretable clinical–radiomics model based on both multi-patch radiomics and clinical features, which showed an AUC of 0.822 in the test cohort, higher than the clinical model (AUC: 0.684, p = 0.007), radiomics model (AUC: 0.784, p = 0.415), model for end-stage liver disease (MELD) score (AUC: 0.529, p < 0.001), and albumin–bilirubin (ALBI) score (AUC: 0.644, p = 0.016). The SHAP analysis showed that first-order radiomics features were the most important features for PHLF prediction. The clinical–radiomics model is useful for predicting symptomatic PHLF in HCC with high model interpretability, which may serve as a useful tool for therapeutic decision making to improve perioperative management.

Abstract
Objective: The aim of this study was to develop and validate an interpretable radiomics model based on two-dimensional shear wave elastography (2D-SWE) for symptomatic post-hepatectomy liver failure (PHLF) prediction in patients undergoing liver resection for hepatocellular carcinoma (HCC). Methods: A total of 345 consecutive patients were enrolled. A five-fold cross-validation was performed during training, and the models were evaluated in the independent test cohort. A multi-patch radiomics model was established based on the 2D-SWE images for predicting symptomatic PHLF. Clinical features were incorporated into the models to train the clinical–radiomics model. The radiomics model and the clinical–radiomics model were compared with the clinical model comprising clinical variables and other clinical predictive indices, including the model for end-stage liver disease (MELD) score and albumin–bilirubin (ALBI) score. Shapley Additive exPlanations (SHAP) was used for post hoc interpretability of the radiomics model. Results: The clinical–radiomics model achieved an AUC of 0.867 (95% CI 0.787–0.947) in the five-fold cross-validation, and this score was higher than that of the clinical model (AUC: 0.809; 95% CI: 0.715–0.902) and the radiomics model (AUC: 0.746; 95% CI: 0.681–0.811). The clinical–radiomics model showed an AUC of 0.822 in the test cohort, higher than that of the clinical model (AUC: 0.684, p = 0.007), radiomics model (AUC: 0.784, p = 0.415), MELD score (AUC: 0.529, p < 0.001), and ALBI score (AUC: 0.644, p = 0.016). The SHAP analysis showed that the first-order radiomics features, including first-order maximum 64 × 64, first-order 90th percentile 64 × 64, and first-order 10th percentile 32 × 32, were the most important features for PHLF prediction. Conclusion: An interpretable clinical–radiomics model based on 2D-SWE and clinical variables can help in predicting symptomatic PHLF in HCC.


Introduction
Hepatocellular carcinoma (HCC) ranks as the fifth most common malignancy and the third leading cause of cancer-related death globally [1]. Liver resection serves as the primary curative approach for eligible HCC patients [2]. Despite advances in surgical techniques and perioperative care, post-hepatectomy liver failure (PHLF) remains the predominant factor behind postoperative morbidity and mortality, with an overall incidence of up to 32% and corresponding mortality of up to 5.0% [3]. Moreover, PHLF occurs in the first few days after liver resection and may necessitate additional interventions [4]. Thus, preoperative prediction of PHLF is of great importance to improve perioperative management, optimize treatment options, and avoid life-threatening events during liver resections.
PHLF primarily affects patients with liver cirrhosis who have a limited capacity for liver regeneration and diminished functional reserve of the remaining liver following resection [5]. Therefore, it is crucial to accurately assess preoperative liver functional reserve for the prediction of PHLF. Several liver function indicators, including the Child-Pugh score, model for end-stage liver disease (MELD) score, albumin-bilirubin (ALBI) grade, and indocyanine green (ICG) clearance test, have been proposed for PHLF prediction, albeit with limited accuracy, with areas under the receiver operating characteristic curve (AUCs) ranging from 0.61 to 0.76 [6][7][8][9]. Two-dimensional shear wave elastography (2D-SWE) is an innovative liver stiffness measurement (LSM) technology that combines B-mode ultrasound imaging with real-time color-coded tissue stiffness mapping [10], and 2D-SWE has demonstrated excellent performance in assessing the degree of liver fibrosis [11]. Previous studies have also highlighted the potential value of LSM using 2D-SWE in PHLF prediction [12,13]. However, routine analyses of 2D-SWE fail to fully utilize all information available in the images and also suffer from inter-observer variance in choosing the optimal quantification region [14]. A computer-aided quantitative analysis of 2D-SWE images may help overcome these limitations [15].
Radiomics is the high-throughput extraction of quantitative features from medical imaging, converting these into minable data, which can then be analyzed for use in decision support systems [16,17]. Radiomics has shown great potential for the quantitative analysis of SWE images [15,18,19]. Several studies have shown that radiomics models of 2D-SWE images perform well in the classification of liver fibrosis [15,20]. However, no previous study has evaluated the utility of radiomics for the analysis of 2D-SWE images for predicting symptomatic PHLF in patients with HCC.
Despite significant progress in radiomics, the clinical translation of artificial intelligence (AI) tools has so far been limited, partially due to a lack of interpretability of models, the so-called "black box" problem [21]. Clinicians need interpretable models in order to understand how predictions are made. Post hoc interpretability methods such as Shapley Additive exPlanations (SHAP) can be used to gain insight into the decision-making process of complex classifiers in radiomics [22]. The SHAP interpretability method calculates the significance of each radiomics feature, which helps clinicians understand the model.
Thus, this study aimed to evaluate the feasibility of a radiomics model based on 2D-SWE for predicting symptomatic PHLF in patients undergoing liver resection for HCC. Furthermore, we studied the utility of SHAP for the interpretability of the radiomics model.

Patients
The protocol of this prospective study was approved by the Institutional Review Board of the First Affiliated Hospital of Sun Yat-sen University in China. Written informed consent was obtained from all patients before their enrollment. Patients who were candidates for curative liver resection for HCC between August 2018 and October 2022 were enrolled in this study. The diagnosis of HCC was determined according to the American Association for the Study of Liver Diseases (AASLD) Clinical Practice Guidelines for HCC (Edition 2018) [23], and the staging of HCC was determined in accordance with Barcelona Clinic Liver Cancer (BCLC) staging (Edition 2018) [24]. The inclusion criteria were as follows: (1) patients with resectable and treatment-naive HCC and (2) patients with an Eastern Cooperative Oncology Group performance status (PS) score of 0-1. The exclusion criteria were as follows: (1) patients who did not undergo liver resection; (2) patients with a pathological diagnosis of non-HCC; (3) failed liver stiffness measurement, defined as an elastography color map that was less than 75% filled or an interquartile range (IQR)/median ratio > 30%; (4) patients with evidence of immune-active chronic hepatitis characterized by an elevation of alanine aminotransferase (ALT) levels ≥ 2 × upper limit of normal (ULN); (5) patients with obstructive jaundice or intrahepatic bile duct dilation with a diameter of >3 mm; and (6) patients with hypoalbuminemia, hyperbilirubinemia, or coagulopathy not related to the liver.

Two-Dimensional SWE Data Acquisition
Patients underwent 2D-SWE examination within one week before surgery. A single radiologist (M.L.) with more than 10 years of experience in liver ultrasound examination and more than 3 years of experience in liver 2D-SWE examination performed the examination. The radiologist was blinded to the clinical status of each patient.
The 2D-SWE examination was performed using the SuperSonic Imagine Aixplorer™ ultrasound system with Real-time ShearWave™ Elastography (SWE™) technique using a convex broadband probe (SC6-1, 1-6 MHz). Firstly, a B-mode ultrasound scan was performed to identify a suitable liver area for 2D-SWE measurement, which was well visualized, free of large vessels, and located at least 5 cm away from any lesion. Areas in the right lobe of the liver were preferred if available. When an appropriate area was located, the B-mode ultrasound mode was switched to elasticity imaging mode. The scale was set at 40 kPa, and the depth was set at 4-6 cm. The 2D-SWE box was set to 4 × 3 cm in size and was positioned 1.5-2 cm beneath the liver capsule. Patients were asked to hold their breath for 4-5 s to obtain a series of 3-10 consecutive 2D-SWE images. All images were stored in the Digital Imaging and Communications in Medicine (DICOM) format. An acquisition was considered successful when color filling in the 2D-SWE box exceeded 75%. A circular region of interest (ROI, termed Q-box) of 2 cm in diameter was placed on the most homogeneous area, assessed visually, to derive the mean value of elasticity. Independent mean values were obtained from each elastography image for each patient, and the median and interquartile range (IQR) values were calculated. The 2D-SWE image quality criterion was set at IQR/median < 30% [10].
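The IQR/median quality criterion described above can be sketched in a few lines of Python. This is an illustration only: the function name `swe_quality`, the minimum of three images (taken from the 3-10 image series), and the quartile convention (`method="inclusive"`) are assumptions, since the text does not state how quartiles were computed.

```python
from statistics import median, quantiles


def swe_quality(mean_kpa_values, max_iqr_ratio=0.30):
    """Return (median, IQR, passes) for per-image mean elasticity values in kPa.

    Assumption: quartiles use the 'inclusive' convention; other conventions
    give slightly different IQR values for small samples.
    """
    if len(mean_kpa_values) < 3:
        raise ValueError("expected at least 3 elastography images")
    med = median(mean_kpa_values)
    q1, _, q3 = quantiles(mean_kpa_values, n=4, method="inclusive")
    iqr = q3 - q1
    return med, iqr, (iqr / med) < max_iqr_ratio


# Hypothetical Q-box means from five consecutive images of one patient.
print(swe_quality([8.0, 8.5, 9.0, 9.5, 10.0]))
```

A series that fails the check would correspond to exclusion criterion (3) in the Patients section.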

Clinical Data Collection
Preoperative patient characteristics; laboratory data; and radiological data, including upper abdominal computed tomography (CT) and magnetic resonance imaging (MRI), were collected within one week before surgery. Clinically significant portal hypertension (CSPH) was defined as the presence of esophageal varices (by CT/MRI) and/or platelet count <100 × 10⁹/L in association with splenomegaly [25]. Splenomegaly was defined as the longest diameter of the spleen greater than 12 cm measured on coronal and axial CT/MRI images in the portal venous phase [26]. The Child-Pugh score, ALBI score, and MELD score were calculated according to formulas presented in Supplementary Method S1. Total liver volume (TLV), resected liver volume (RLV), and future liver remnant volume (LRV) were assessed based on 3-dimensional reconstruction and simulation of the surgical resection plan on preoperative CT or MRI imaging. The LRV ratio was defined as liver remnant volume/total liver volume to represent the percentage of the remnant liver after resection.

Diagnosis and Staging of Symptomatic PHLF
The definition of PHLF followed the guideline proposed by the International Study Group of Liver Surgery (ISGLS), which defined it as an increased international normalized ratio (INR) and hyperbilirubinemia on or after postoperative day 5 [3]. The severity of liver failure was categorized based on its impact on clinical treatment. Patients with PHLF grade A required no change in clinical treatment. For patients with PHLF grade B, there was a deviation from the standard treatment, but invasive therapy was not necessary. Patients with PHLF grade C required invasive therapeutic interventions. The symptomatic PHLF group was defined as those with PHLF grade B or higher, while the non-symptomatic PHLF group included individuals with PHLF grade A or those without the presence of PHLF [27].

Construction of Radiomics Models
The workflow of the construction of radiomics models is presented in Figure 2.
(1) Image preprocessing: A four-step process was used for preprocessing the elasticity data. First, the 2D-SWE box was automatically extracted from the DICOM images, which combine elastographic and B-mode images. The original color elasticity image was obtained by subtracting 50% of the corresponding B-mode image from the combined image, and the color elasticity image was resized to 128 × 128 pixels. Second, the circular measurement marks in the 2D-SWE images indicating the location of the Q-box were detected and replaced with the mean value of the surrounding 4 × 4 pixels. Third, the hue-match method was used for converting RGB color elasticity images to gray images [28]. The raw elasticity data were encoded into color images according to the color bar displayed on the DICOM image, which had 220 pseudo-color levels from blue to red, representing elasticity modulus values from 0 to the maximum measurement (Figure 3a). The color bar was linearly subdivided into 2200 color levels, and the RGB value of the k-th level was denoted by $(R_k, G_k, B_k)$. The hue value of $(R_k, G_k, B_k)$ was computed as $H_k = \arctan\left(2R_k - G_k - B_k,\ \sqrt{3}\,(G_k - B_k)\right)$. For a particular pixel of the color elasticity image, its hue value was computed as $H_e = \arctan\left(2R - G - B,\ \sqrt{3}\,(G - B)\right)$, where $(R, G, B)$ was the RGB value of the pixel. We found the index $1 \le k \le 2200$ that minimized the difference $|H_k - H_e|$, and $k \times \text{maximum measurement}/2200$ was taken as the reconstructed elasticity value of this pixel. After pixel-by-pixel reconstruction, the color elasticity image in the RGB space (Figure 3b) was transformed into a gray image (Figure 3c) whose values varied from 0 to the maximum measurement. The hue-match method was compared with other reported methods of RGB-to-gray SWE image conversion, including distance match [18], RGB three-channel methods [19], and direct conversion from RGB to gray via a formula [19]. The hue-match method was chosen because of its superior performance when compared with the other methods in terms of the AUC. Fourth, an automated ROI selection of different size patches (32 × 32, 64 × 64, and 96 × 96) was performed by scanning 32 × 32, 64 × 64, and 96 × 96 pixel ROIs over the 2D-SWE image at 1-pixel spacing to produce numerous candidate ROIs. For each scale, the ROI with the smallest standard deviation (SD) of the pixel values among all candidate ROIs was selected for further analysis [29].
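The hue-match conversion and the smallest-SD patch selection can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several points are assumptions: the two-argument arctangent is read as the angle of the point (2R − G − B, √3(G − B)); a circular angle difference is used instead of a plain absolute difference to avoid wrap-around near blue; and `synthetic_color_bar` is a hypothetical stand-in for the color bar that would be read from the DICOM image. All function names are illustrative.

```python
import colorsys
import math
from statistics import pstdev


def hue(r, g, b):
    """Hue angle in radians, reading arctan(x, y) as the angle of the point (x, y)."""
    return math.atan2(math.sqrt(3.0) * (g - b), 2.0 * r - g - b)


def circular_diff(a, b):
    """Smallest absolute difference between two angles (an assumed robustness tweak)."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def synthetic_color_bar(n_levels=2200):
    """Hypothetical blue-to-red bar; a real bar would come from the DICOM image."""
    bar = []
    for k in range(1, n_levels + 1):
        h = (240.0 / 360.0) * (1.0 - (k - 1) / (n_levels - 1))
        bar.append(colorsys.hsv_to_rgb(h, 1.0, 1.0))
    return bar


def hue_match(pixel_rgb, bar_hues, max_kpa):
    """Map one RGB pixel to k * max_kpa / n, where k is the best-matching bar level."""
    h_e = hue(*pixel_rgb)
    n = len(bar_hues)
    k = min(range(1, n + 1), key=lambda i: circular_diff(bar_hues[i - 1], h_e))
    return k * max_kpa / n


def min_sd_patch(gray, size):
    """Slide a size x size window at 1-pixel spacing; return (top, left, sd) of the flattest patch."""
    rows, cols = len(gray), len(gray[0])
    best = None
    for top in range(rows - size + 1):
        for left in range(cols - size + 1):
            values = [gray[r][c]
                      for r in range(top, top + size)
                      for c in range(left, left + size)]
            sd = pstdev(values)
            if best is None or sd < best[2]:
                best = (top, left, sd)
    return best


bar = synthetic_color_bar()
bar_hues = [hue(*rgb) for rgb in bar]
mid_value = hue_match(bar[1099], bar_hues, max_kpa=40.0)  # pixel taken from level k = 1100

gray = [[0, 5, 1, 7], [3, 9, 2, 8], [4, 6, 5, 5], [1, 2, 5, 5]]
flattest = min_sd_patch(gray, size=2)
```

For full 128 × 128 images an array library would be far faster than nested Python loops; the loops are kept here only to make each step explicit.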
(2) The radiomics model based on 2D-SWE images: Radiomics features were automatically extracted from patches of different sizes (32 × 32, 64 × 64, 96 × 96, and 128 × 128) using PyRadiomics, version 3.0.1. A total of 93 features, including first-order features and texture features, were extracted from each patch. In total, 372 features (93 for each scale) were extracted from each image after conversion via hue match. Constant features were removed in the first step of feature selection. In the second step, feature pairs with an absolute Spearman's correlation coefficient |r| > 0.90 were deemed highly correlated, and the feature with the highest average correlation with all other features was removed. Recursive feature elimination (RFE) based on a random forest (RF) classifier was used as a final step for feature selection. A random forest classifier-based radiomics model was trained using the selected radiomics features to predict the probability of symptomatic PHLF in terms of the radiomics score. A five-fold cross-validation was used to fine-tune the hyperparameters. For the patient-level analysis, the median radiomics score of all the images from one patient was considered to be the radiomics score for that patient.
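The first two feature-selection steps and the patient-level aggregation can be sketched as follows. This is a hedged illustration, not the study code: the function names are hypothetical, the order in which correlated pairs are processed (most correlated pair first) is an assumption the text does not specify, and the RFE/random-forest step is omitted because it requires a machine-learning library.

```python
import math
from statistics import mean, median


def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            out[order[t]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def select_features(features, threshold=0.90):
    """features: dict of name -> list of values. Returns the names that are kept."""
    # Step 1: drop constant features.
    kept = [name for name, vals in features.items() if len(set(vals)) > 1]
    corr = {(a, b): abs(spearman(features[a], features[b]))
            for a in kept for b in kept if a != b}

    def avg_corr(name):
        return mean(corr[name, other] for other in kept if other != name)

    # Step 2: while a highly correlated pair remains, drop the member of the
    # most correlated pair that has the higher mean correlation with the rest.
    while True:
        pairs = [(corr[a, b], a, b)
                 for i, a in enumerate(kept) for b in kept[i + 1:]
                 if corr[a, b] > threshold]
        if not pairs:
            return kept
        _, a, b = max(pairs)
        kept.remove(a if avg_corr(a) >= avg_corr(b) else b)


def patient_score(image_scores):
    """Patient-level radiomics score: median of the per-image scores."""
    return median(image_scores)


toy = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 13],   # monotone in f1, so Spearman r = 1
    "f3": [3, 1, 6, 2, 5, 4],
    "const": [7, 7, 7, 7, 7, 7],
}
selected = select_features(toy)
```

In the toy data, `f1` and `f2` are perfectly rank-correlated, so exactly one of them survives alongside `f3`.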
(3) The clinical-radiomics model based on 2D-SWE images and clinical data: Univariate and multivariate logistic analyses were performed in the training cohort to identify independent clinical predictors of symptomatic PHLF. A clinical-radiomics logistic regression model based on the radiomics score and significant clinical variables was constructed for symptomatic PHLF prediction.

Shapley Additive exPlanations
SHAP is a post hoc interpretability method that is based on game theory, and it was used for understanding the predictions made by the radiomics model. It measures the importance of each feature and its effect on the model's predicted probability in terms of SHAP values [22]. SHAP summary plots provide global explanations by quantifying the impact of feature values on the model output and help in identifying the important features and their trends. SHAP dependence plots show how the model is affected by an individual feature. These dependence plots also show interaction effects between a pair of features and their resulting impact on the model output. SHAP local bar plots display the SHAP values for an individual test example, showing the impact of each feature on the model outcome.
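To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by enumerating all feature subsets for a toy model. It is not the SHAP library and not the study's random forest: the linear `toy_model`, its weights, and the single all-zero `baseline` that stands in for "feature absent" are assumptions chosen so the result can be checked by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input.

    A feature outside the coalition is replaced by its baseline value
    (an assumption; SHAP averages over a background data set instead).
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical risk score: two features raise it, one lowers it."""
    return 0.5 * z[0] + 0.3 * z[1] - 0.2 * z[2]


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

For a linear model each value reduces to weight × (feature value − baseline value), and the values always sum to the difference between the prediction for `x` and the prediction for the baseline. Enumeration scales exponentially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.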

Statistical Analysis
Statistical analyses were performed using SPSS, version 20.0. Student's t-test or the Mann-Whitney test, as appropriate, was used to compare the continuous variables in the training and test cohorts. The χ2 test was used to compare categorical variables. A two-sided p < 0.05 was considered statistically significant. A univariate logistic analysis was performed in the training cohort to detect significant predictors associated with symptomatic PHLF. These variables were entered into a stepwise multivariate logistic regression analysis to identify independent predictors for symptomatic PHLF. The clinical model was established based on independent predictors by logistic regression. Open-source Python v3.6.13 was used to implement the radiomics analysis. A detailed description of the packages and versions is given in Supplementary Table S1. The AUCs of different models were compared using the DeLong test. The threshold of each model was set at the highest Youden index in the training cohort. The patient-level performance metrics, including the accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), of the models were evaluated and reported. A nomogram was constructed based on the clinical-radiomics model. Calibration curves were plotted to analyze the calibration performance of the different models in the test set. A decision curve analysis was conducted in the test set to determine the clinical usefulness of the nomogram by quantifying the net benefits at different threshold probabilities.
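Threshold selection by the Youden index and the reported performance metrics can be sketched as below. The function names and the toy labels and scores are hypothetical, and classifying a case as positive when its score is greater than or equal to the threshold is an assumption about the convention used.

```python
def confusion(y_true, scores, threshold):
    """Counts (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        positive = s >= threshold
        if positive and y == 1:
            tp += 1
        elif positive and y == 0:
            fp += 1
        elif not positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def youden_threshold(y_true, scores):
    """Threshold maximizing J = sensitivity + specificity - 1 (first one on ties)."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(y_true, scores, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def metrics(y_true, scores, threshold):
    """Accuracy, sensitivity, specificity, PPV, and NPV at a fixed threshold."""
    tp, fp, tn, fn = confusion(y_true, scores, threshold)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical training labels (1 = symptomatic PHLF) and model scores.
y_train = [0, 0, 0, 1, 0, 1, 1, 0, 1]
s_train = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
threshold, j_index = youden_threshold(y_train, s_train)
train_metrics = metrics(y_train, s_train, threshold)
```

In the workflow described above, the threshold would be fixed on the training cohort and then applied unchanged to the test cohort.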

Baseline Characteristics
A total of 345 patients were enrolled, of whom 305 were males and 40 were females, with a median age of 55.0 (IQR 47.0-64.0) years (Figure 1). There were 265 patients in the training cohort and 80 patients in the test cohort.
The baseline characteristics of the training and test cohorts are summarized in Table 1. A total of 107 patients (31.0%) experienced symptomatic PHLF, including 97 patients with PHLF grade B and 10 patients with PHLF grade C. Six patients with PHLF grade C died of acute liver failure within 20 to 39 days after surgery. Symptomatic PHLF was observed in 80 (30.1%) patients and 27 (33.8%) patients in the training and test cohorts, respectively, showing no significant difference. There were significant differences in the prothrombin time (PT) level (p = 0.002), international normalized ratio (INR) level (p < 0.001), and MELD score (p = 0.012) between the training and test cohorts.

Performance of the Clinical Model
The multivariate logistic regression analysis showed that the INR, CSPH, and LRV ratio were significant independent predictors of symptomatic PHLF (all p < 0.05; Table 2). These three variables were included to establish the clinical model. The clinical model showed an AUC of 0.809 (95% CI: 0.715-0.902) and 0.684 in the five-fold cross-validation and the test cohort, respectively.

Performance of the Radiomics Model and the Clinical-Radiomics Model in the Test Set
In the test set, the AUC, accuracy, sensitivity, specificity, PPV, and NPV of the radiomics model were 0.784 (95% CI: 0.720-0.898), 0.725, 0.660, 0.754, 0.581, and 0.816, respectively (Table 4). The AUC, accuracy, sensitivity, specificity, PPV, and NPV of the clinical-radiomics model were 0.822 (95% CI: 0.720-0.898), 0.750, 0.704, 0.773, 0.612, and 0.836, respectively (Table 4). The clinical-radiomics model showed a significantly higher AUC than the clinical model (AUC: 0.684, p = 0.007), as well as some clinical indices related to symptomatic PHLF prediction, such as the MELD score (AUC: 0.529, p < 0.001) and ALBI score (AUC: 0.644, p = 0.016) (Figure 5b). The clinical-radiomics model showed a higher AUC than the radiomics model (AUC: 0.784), but the difference was not significant (p = 0.415). The nomogram of the clinical-radiomics model is shown in Figure 6a. Good calibration was achieved for the clinical-radiomics model in the test set (Figure 6b), and the decision curve showed a higher net benefit for the clinical-radiomics model than for the clinical model and the radiomics model when the threshold probability was between 0.10 and 0.58 (Figure 6c).
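The net benefit plotted in a decision curve can be computed with the standard formula: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The sketch below uses hypothetical labels and predicted probabilities, not study data, and the function names are illustrative.

```python
def net_benefit(y_true, probs, pt):
    """Net benefit of treating patients whose predicted probability is >= pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= pt and y == 0)
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(y_true, pt):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (pt / (1.0 - pt))


# Hypothetical test labels (1 = symptomatic PHLF) and predicted probabilities.
y_test = [0, 0, 0, 1, 0, 1, 1, 0, 1]
p_test = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
nb_model = net_benefit(y_test, p_test, pt=0.4)
nb_all = net_benefit_treat_all(y_test, pt=0.4)
```

A decision curve repeats this calculation over a range of threshold probabilities for each model and for the treat-all and treat-none (net benefit 0) strategies.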

Shapley Additive exPlanations
The global SHAP summary plot identified the first-order maximum 64 × 64, first-order 90th percentile 64 × 64, and first-order 10th percentile 32 × 32 as the most important features for symptomatic PHLF prediction. These features had a similar trend: a higher feature value resulted in a high positive SHAP value (Figure 7a), which corresponded with a higher predicted probability. The fourth most important feature was the gray-level co-occurrence matrix Informational Measure of Correlation (glcm_Imc1 96 × 96), which had a negative trend: a higher feature value resulted in a lower (negative) SHAP value. SHAP dependence plots of first-order maximum 64 × 64 and first-order 90th percentile 64 × 64 showed the relationship between the SHAP values and the feature values, as well as the interaction with another feature (Figure 7b,c). A higher value for the first-order 90th percentile 64 × 64 resulted in a higher SHAP value. However, when the glcm_Imc1 96 × 96 value was also high, the SHAP value was comparatively lower (Figure 7c). Figure 7d,e show the SHAP local bar plots for two test cases that had symptomatic PHLF. Figure 7d shows a case that was classified correctly, and the plot shows that all features except glcm_Imc1 96 × 96 made the correct contribution. Figure 7e shows a case that was classified incorrectly by the model, and the plot shows that only first-order maximum 64 × 64 and first-order minimum 32 × 32 made the correct contribution.

Discussion
In this study, we proposed an interpretable clinical-radiomics model based on liver 2D-SWE images and clinical variables for the prediction of symptomatic PHLF in HCC patients. The clinical-radiomics model achieved an AUC of 0.822 in the test cohort, which was higher than that of the clinical model and some clinical indices, including the ALBI score and MELD score. A nomogram of the clinical-radiomics model was established for clinical use. The SHAP analysis showed that first-order statistical features were most important for model prediction, which confirmed the reliability of the developed radiomics model and helped clinicians understand the model. The results showed that a radiomics analysis of 2D-SWE images may serve as a useful tool to stratify high-risk and low-risk patients for symptomatic PHLF and to assist surgeons in recognizing the best candidates for liver resection, determining the resection extent, and improving perioperative management.
Several studies have verified the utility of 2D-SWE for predicting symptomatic PHLF, with AUCs ranging from 0.72 to 0.76 [13,30]. In this study, the radiomics method enabled a comprehensive analysis of 2D-SWE images and showed a better performance. The performance of the clinical-radiomics model developed in our study was higher than that of other reported predictive models, whose AUCs ranged from 0.72 to 0.82 [31][32][33].
For the radiomics analysis, we proposed a multi-patch strategy that extracted coarse-to-fine radiomics features and resulted in better predictive accuracy than the single-patch strategy. This result was consistent with another study showing that a multi-patch texture feature analysis of ultrasound images led to better performance for liver fibrosis grading than the single-patch analysis [34]. In our study, five first-order features from patches of 32 × 32 pixels and 64 × 64 pixels and two texture features from patches of 96 × 96 pixels were selected. The first-order features from smaller patches might be more informative since they avoided artifacts and noise areas within 2D-SWE images; this result was consistent with another study showing that the automatic selection of the most homogeneous area for the ROI improved the accuracy of liver fibrosis staging [29]. Texture features from larger patches may be more informative because they capture global texture patterns. The clinical-radiomics model outperformed both the clinical model and radiomics model, suggesting that radiomics features and clinical features were complementary to each other.
Furthermore, a SHAP analysis was performed to understand the contribution of each radiomics feature to the radiomics signature. The global SHAP analysis identified first-order features as the most important features for symptomatic PHLF prediction. This finding is readily explainable: higher first-order statistical features corresponded with higher liver stiffness and therefore with a higher probability of symptomatic PHLF. The results were consistent with existing studies showing that higher liver stiffness was correlated with symptomatic PHLF [12,13,30], which confirmed the reliability of the developed radiomics score.
The strengths of the radiomics analysis applied in this study were as follows. Firstly, we applied a new multi-patch strategy for radiomics analysis, which could be an efficient method for a radiomics analysis of 2D-SWE images in future studies. Secondly, we effectively combined the high-throughput 2D-SWE features with low-dimensional clinical information, which improved the predictive performance for symptomatic PHLF. Thirdly, the SHAP analysis was used to improve the interpretability of the complex classifiers in radiomics, helping clinicians understand the models. Fourthly, compared with LSM, the developed multi-patch radiomics strategy fully leverages all the information contained within 2D-SWE images. Moreover, it effectively mitigates inter-observer variance, offering a more automatic, objective, and comprehensive approach.
This study has some limitations. The significant differences in the PT level, INR level, and MELD score between the training and test cohorts could potentially affect the predictive performance in the test cohort. It is a single-center study, so multicenter external validation is needed to verify its generalizability before taking steps towards clinical application. In addition, 94% of the patients enrolled were infected with hepatitis B. Therefore, the performance of the radiomics model in patients with other causes of underlying liver disease needs further study.

Conclusions
In conclusion, the clinical-radiomics model based on 2D-SWE images and clinical variables was useful for predicting symptomatic PHLF in HCC with high model interpretability. It may serve as a useful tool for therapeutic decision making to improve perioperative management. Further prospective multicenter studies that include patients with different etiologies are needed to validate and optimize the model.

Figure 3. (a) The color bar. (b) Color elasticity image before the transformation. (c) Grayscale image after the hue-match transformation of the color elasticity image.

Figure 4. (a) Receiver operating characteristic (ROC) curves of radiomics models of four different image preprocessing methods (distance match, hue match, RGB three channels, and direct conversion) based on the 64 × 64 pixel patch in five-fold cross-validation. (b) ROC curves of radiomics models of different patches (32 × 32, 64 × 64, 96 × 96, and 128 × 128 pixels, as well as the combination of multiple patches) based on hue-match preprocessing in five-fold cross-validation.

Figure 5. (a) Receiver operating characteristic curves for the radiomics model, clinical model, and clinical-radiomics model in five-fold cross-validation. (b) Receiver operating characteristic curves for the radiomics model, clinical model, clinical-radiomics model, and other clinical indices in the test cohort.

Table 3. Five-fold cross-validation results.

3.4. Performance of the Radiomics Model and the Clinical-Radiomics Model in the Test Set

Figure 6. (a) Nomogram for prediction of symptomatic PHLF. CSPH, clinically significant portal hypertension; INR, international normalized ratio; LRV ratio, ratio of future liver remnant volume; PHLF, post-hepatectomy liver failure. (b) Calibration curves of the radiomics model, the clinical model, and the clinical-radiomics model in the test set. (c) Decision curve analysis of the radiomics model, the clinical model, and the clinical-radiomics model in the test set.

Table 1. Baseline characteristics of enrolled patients.

Table 4. Test set results. AUC, area under the receiver operating characteristic curve; PPV, positive predictive value; NPV, negative predictive value.
