Validation of Pre-/Post-TACE-Predict Models among Patients with Hepatocellular Carcinoma Receiving Transarterial Chemoembolization

Simple Summary
Transarterial chemoembolization (TACE) is used to treat patients with intermediate stage hepatocellular carcinoma (HCC). However, models to accurately predict survival are lacking. The aim of our retrospective study was to validate the prognostic performance of the newly proposed Pre- and Post-TACE-Predict models in Korean patients. In our study of 187 patients with HCC who underwent TACE, there was no significant difference in predictive performance between the Pre- and Post-TACE-Predict models. Additionally, simple scoring prognosis prediction models performed similarly to or better than the Pre- and Post-TACE-Predict models in our study. Thus, simple scoring prognosis prediction models such as modified hepatoma arterial embolization prognostic (mHAP)-II and SNACOR may be preferable to the Pre- and Post-TACE-Predict models for assessing survival after TACE in patients with HCC.

Abstract
This study attempted to validate the prognostic performance of the proposed Pre- and Post-TACE (transarterial chemoembolization)-Predict models, in comparison with other models for prognostication. One-hundred-and-eighty-seven patients with HCC who underwent TACE were recruited. Regarding overall survival (OS), the predictive performance of the Pre-TACE-Predict model (one-year integrated area under the curve (iAUC) 0.685 (95% confidence interval (CI) 0.593–0.772)) was numerically better than that of the Post-TACE-Predict model (iAUC 0.659 (95% CI 0.580–0.742)). However, there was no statistically significant difference between the two models at any time point. For comparison between models using pre-treatment factors, the modified hepatoma arterial embolization prognostic (mHAP)-II model demonstrated significantly better predictive performance at one year (iAUC 0.767 (95% CI 0.683–0.847)) compared with Pre-TACE-Predict. For comparison between models using first TACE response, the SNACOR model was significantly more predictive than the Post-TACE-Predict model at one year (iAUC 0.778 (95% CI 0.687–0.866) vs. 0.659 (95% CI 0.580–0.742), respectively) and three years (iAUC 0.707 (95% CI 0.646–0.770) vs. 0.624 (95% CI 0.564–0.688), respectively). mHAP-II and SNACOR may be preferred over the Pre- and Post-TACE-Predict models, respectively, considering their similar or better performance and the ease of application.


Introduction
Hepatocellular carcinoma (HCC) is the third most common cause of cancer-related death [1]. In cases of early-stage diagnosis, curative treatments for HCC, including surgical resection, orthotopic liver transplant, or local ablation, have been shown to be feasible. In contrast, patients diagnosed with intermediate or advanced HCC are generally treated with palliative modalities, including transarterial chemoembolization (TACE) or systemic therapy, such as sorafenib, regorafenib, and atezolizumab/bevacizumab [2,3]. Among these, TACE, which is based on induction of focal ischemia with local delivery of chemotherapeutic agents [4], has been shown to improve survival more than supportive care in randomized controlled trials [5,6]. Thus, current guidelines recommend TACE as the standard treatment for patients with intermediate stage (Barcelona Clinic Liver Cancer (BCLC) stage B), multi-nodular HCC who are not candidates for curative treatment [7,8].
However, several unresolved issues remain. First, TACE is widely used outside of recommended guidelines, more specifically in early-stage disease or in advanced-stage disease with portal vein tumor invasion, which hinders an accurate assessment of long-term clinical outcomes after TACE [9,10]. Additionally, due to the heterogeneity of patients with BCLC stage B HCC, the decision-making process regarding treatment modalities tends to depend on the physician's discretion rather than the simplified algorithm based on practice guidelines [11]. As such, many studies have attempted to further sub-classify such patients to guide the optimal treatment strategy and prognostication after TACE treatment [12,13]. Among the prognosis prediction models developed to date, the hepatoma arterial embolization prognostic (HAP) score consists of a point system according to tumor size, alpha-fetoprotein (AFP), bilirubin, and albumin [14]. This was further improved by Park et al., with the addition of tumor number, as the modified HAP-II (mHAP-II) [15]. Although both predictors were effective and easy to use, one limitation was that they did not address the treatment response after TACE, which may substantially alter the final outcome of patients with HCC. Therefore, many studies have attempted to develop models that include HCC response to TACE. For example, "SNACOR," developed by Kim et al., is a prognosis prediction model that accounts for radiological response after the first TACE session [16]. More recently, Han et al. [17] proposed individualized TACE-specific prognosis prediction models using widely available clinical features and response to first TACE therapy (the Pre-TACE-Predict and Post-TACE-Predict models), both of which demonstrated superior predictive performance compared with HAP and mHAP-II scores.
In the present study, we attempted to externally validate the prognostic performance of the newly proposed Pre-TACE-Predict and Post-TACE-Predict models compared with other prognosis prediction models in an independent cohort of Korean patients with HCC undergoing TACE.

Patients
Data from patients with treatment-naïve HCC who underwent TACE between 2003 and 2015 at Severance Hospital, Yonsei University College of Medicine (Seoul, Korea), were included in the present study. Exclusion criteria are reported in Figure 1. HCC was diagnosed using histological or radiological methods according to current practice guidelines [4,5]. According to the etiology of HCC, patients were categorized as follows: hepatitis B virus (HBV); hepatitis C virus (HCV); and other, which covered HCC arising from chronic liver diseases other than HBV and/or HCV.

Figure 1. […] were initially included in our study. After the exclusion of 135 patients due to our exclusion criteria, a total of 187 patients were ultimately included for the statistical analysis. TACE, transarterial chemoembolization.

TACE Procedure and Assessment of Treatment Responses
Angiography of the hepatic artery and superior mesenteric artery was performed to confirm portal vein patency, vascular anatomy, and tumor vascularity. TACE was performed using 5 mL of iodized oil contrast medium (lipiodol), and 50 mg of either adriamycin or cisplatin (2 mg/kg body weight). The mixture was infused selectively at the subsegmental or segmental branch of the feeding arteries, followed by embolization using gelatin sponge particles. Dynamic liver computed tomography (CT) was used to detect residual tumors and, if detected, sequential TACE procedures were scheduled at 6-8 week intervals when extrahepatic metastases, critical portal vein invasion, and deterioration in clinical status or laboratory values were not detected.
Treatment response was radiologically assessed 4 weeks after the initial TACE procedure according to the modified Response Evaluation Criteria in Solid Tumors (mRECIST), as described by Lencioni et al. [18]. The four mRECIST categories included complete response (CR), partial response (PR), stable disease (SD), and progressive disease (PD). Viable tumors were assessed using the uptake of contrast materials during the arterial phase of dynamic CT or magnetic resonance imaging. Lesions showing retention of iodized oil and necrotic lesions without intra-tumoral arterial enhancement were regarded as necrotized tumor foci. CR was defined as complete disappearance of measurable lesions, whereas PR was defined as a decrease of at least 30% from baseline. PD was defined as an increase of at least 20% from baseline, and SD was defined as any change between PD and PR. Two experienced radiologists read the examinations, each blinded to the other's results and to the clinical data. The final classifications, made by consensus between the two observers, were adopted for analysis.
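The response categories described above can be expressed as a simple decision rule. The following is a minimal sketch, not the readers' actual workflow: the function name and its inputs (baseline and follow-up sums of viable tumor diameters, in mm) are assumptions introduced for illustration, and the thresholds are the ones stated in the text, applied relative to baseline.

```python
def classify_response(baseline_mm: float, followup_mm: float) -> str:
    """Map a change in viable tumor measurement to CR, PR, SD, or PD.

    Hypothetical helper: both arguments are assumed to be sums of
    viable (arterially enhancing) tumor diameters in millimetres.
    """
    if baseline_mm <= 0:
        raise ValueError("baseline measurement must be positive")
    if followup_mm < 0:
        raise ValueError("follow-up measurement cannot be negative")
    if followup_mm == 0:
        return "CR"  # complete disappearance of measurable viable lesions
    change = (followup_mm - baseline_mm) / baseline_mm
    if change <= -0.30:
        return "PR"  # decrease of at least 30% from baseline
    if change >= 0.20:
        return "PD"  # increase of at least 20% from baseline
    return "SD"      # anything between the PR and PD thresholds


if __name__ == "__main__":
    for follow_up in (0, 30, 48, 65):
        print(follow_up, classify_response(50, follow_up))
```

In the study itself, the category for each patient was assigned by consensus between two radiologists; a rule such as this only makes the numeric thresholds explicit.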

Calculation of Prognosis Prediction Models after TACE
The HAP, mHAP-II, and SNACOR models were used to evaluate the prognosis among patients with HCC after TACE. Detailed scoring and classification are summarized in Table 1. Calculations and cutoffs for the Pre-TACE-Predict and Post-TACE-Predict models are described by Han et al. [17] and are as follows:

Pre-TACE-Predict linear predictor = 0.313 × tumor number (0 = solitary, 1 = multifocal) + 1.252 × log10 tumor size (cm) + 0.230 × baseline log10 AFP (ng/mL) − 0.0176 × baseline albumin (g/L) + 0.458 × baseline log10 bilirubin (μmol/L) + 0.437 × VI (0 = no, 1 = yes) + 0.149 × HBV (0 = no, 1 = yes) + 0.333 […]

The study was conducted according to the guidelines of the Declaration of Helsinki. Given the retrospective nature of the study and the use of anonymized patient data, requirements for informed consent were waived.
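As a hedged illustration, the terms of the Pre-TACE-Predict linear predictor that are printed in this section can be evaluated as shown below. The function and parameter names are hypothetical, and the sketch is deliberately partial: the trailing term that begins with 0.333 is cut off in the text and is not reproduced, so the returned value is not the complete score of Han et al. [17] and should not be compared against their risk-category cutoffs.

```python
import math


def pre_tace_partial_lp(
    multifocal: bool,
    tumor_size_cm: float,
    afp_ng_ml: float,
    albumin_g_l: float,
    bilirubin_umol_l: float,
    vascular_invasion: bool,
    hbv: bool,
) -> float:
    """Sum of the Pre-TACE-Predict terms that appear in the text.

    Assumption: the truncated trailing term (coefficient 0.333) is
    omitted, so this is a partial linear predictor only.
    Size, AFP, and bilirubin must be positive because of the log10.
    """
    if min(tumor_size_cm, afp_ng_ml, bilirubin_umol_l) <= 0:
        raise ValueError("size, AFP, and bilirubin must be positive")
    return (
        0.313 * int(multifocal)                 # 0 = solitary, 1 = multifocal
        + 1.252 * math.log10(tumor_size_cm)     # tumor size in cm
        + 0.230 * math.log10(afp_ng_ml)         # baseline AFP in ng/mL
        - 0.0176 * albumin_g_l                  # baseline albumin in g/L
        + 0.458 * math.log10(bilirubin_umol_l)  # baseline bilirubin in umol/L
        + 0.437 * int(vascular_invasion)        # VI: 0 = no, 1 = yes
        + 0.149 * int(hbv)                      # HBV: 0 = no, 1 = yes
    )


if __name__ == "__main__":
    # Made-up patient values chosen only to exercise the function.
    print(round(pre_tace_partial_lp(True, 10.0, 100.0, 40.0, 10.0, False, True), 3))
```

The example values are invented for demonstration and do not come from the study cohort.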

Statistical Analysis
Categorical variables are expressed as number and percentage (n [%]), and continuous variables are expressed as median (interquartile range [IQR]). Overall survival (OS) was calculated as the interval between the date of TACE treatment and the date of death or the last follow-up. If the median OS was not reached, values for the 95% confidence interval (CI) were labeled as not applicable (NA). The Shapiro-Wilk test was used to assess normality of distribution. Survival was estimated using the Kaplan-Meier method. The primary endpoint of this study was the predictive performance of the prognosis prediction models for OS, which was calculated using Heagerty's integrated area under the curve (iAUC) from 1000 bootstrap resamples. For comparison of the primary outcome, if the 95% CI for the difference between models at a given time point included zero, the performances of the models were considered not to differ significantly. In addition, to characterize the Pre- and Post-TACE-Predict scores for prognostication in more detail, we calculated Harrell's C index, Gönen and Heller's K, Royston-Sauerbrei's R2D, Akaike information criterion, homogeneity measured using the likelihood ratio chi-squared test, and discriminatory ability measured using the linear trend chi-squared test [19,20]. Statistical analysis was performed using SPSS version 25.0 (IBM Corporation, Armonk, NY, USA) and R (V.4.0.1, http://cran.r-project.org/; accessed on 15 September 2021). Differences with p < 0.05 were considered to be statistically significant.
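The comparison rule described above (a bootstrap 95% CI for the difference between two models, read as significant only when it excludes zero) can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually run: the study used Heagerty's iAUC in R, whereas this dependency-free sketch swaps in Harrell's C index as the performance measure, and all function names are hypothetical.

```python
import random


def harrell_c(times, events, risks):
    """Harrell's C: among comparable pairs, the share in which the
    patient who died earlier had the higher risk score (ties count 0.5)."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # the earlier time in a pair must be an observed death
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def bootstrap_c_difference(times, events, risks_a, risks_b, n_boot=1000, seed=0):
    """Percentile 95% CI for C(model A) - C(model B), resampling patients."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        c_a = harrell_c(t, e, [risks_a[k] for k in idx])
        c_b = harrell_c(t, e, [risks_b[k] for k in idx])
        if c_a == c_a and c_b == c_b:  # skip resamples with no comparable pair (NaN)
            diffs.append(c_a - c_b)
    diffs.sort()
    lower = diffs[int(0.025 * (len(diffs) - 1))]
    upper = diffs[int(0.975 * (len(diffs) - 1))]
    return lower, upper


def differs_significantly(ci):
    """True when the confidence interval excludes zero."""
    lower, upper = ci
    return lower > 0 or upper < 0
```

Both models are evaluated on the same resampled patients in each iteration, which keeps the comparison paired. Applied to an interval such as (−0.170, −0.003), `differs_significantly` returns `True` because the interval lies entirely below zero.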

Baseline Characteristics
After the exclusion of 135 patients, 187 HCC patients who underwent TACE treatment were included in the present study (Figure 1). Baseline demographics of the included patients are summarized in Table 2. The median age of the study population was 59 years, with 136 (72.7%) males and 51 (27.3%) females. The median OS of the entire cohort was 33.4 (IQR 11.7-81.7) months. The main causes of HCC in this patient population included HBV, HCV, and others, accounting for 127 (67.9%), 31 (16.6%), and 29 (15.5%) patients, respectively.
Among the overall population, the median time to progression was 8.7 months (95% CI 7.253-10.147). Among patients who achieved CR after the first TACE, the treatment response was maintained for a median duration of 16.8 months (IQR 7.1-32.1).

OS According to the Pre- and Post-TACE-Predict Models
Similarly, patients with a lower risk when stratified according to the Pre-TACE-Predict and Post-TACE-Predict models demonstrated a better median OS: 130.0 (95% CI 128. […]

[Table: time point; deaths/number of patients analyzed; iAUC (95% CI) for Pre-TACE-Predict and Post-TACE-Predict; difference in iAUCs.]

Comparison of the Predictive Performance among Models with and without Post-TACE Parameters
Among the models using only pre-treatment parameters, mHAP-II demonstrated the highest predictive performance for one-year iAUC (0.767 (95% CI 0.683-0.847)) compared to the Pre-TACE-Predict (0.685 (95% CI 0.593-0.772)) and HAP (0.718 (95% CI 0.627-0.809)) models. There was a significant difference between the performance of the Pre-TACE-Predict and mHAP-II models at the one-year time point (iAUC difference −0.082 (95% CI −0.170 to −0.003)). In addition, there was no significant difference between the Pre-TACE-Predict and HAP models at any time point (Table 6).
Among the models using the first TACE response, the SNACOR model demonstrated better predictive performance at the one-year iAUC (0.778 (95% CI 0.687-0.866)) compared to the Post-TACE-Predict model (0.659 (95% CI 0.580-0.742)). The SNACOR model was significantly more predictive at the one-year (iAUC difference, 0.119 (95% CI 0.008-0.223)) and three-year (iAUC difference, 0.084 (95% CI 0.001-0.161)) time points (Table 6).

Table 6. Predictive performance of prognosis prediction models from the baseline and after the 1st TACE.

Discussion
To the best of our knowledge, the present study was the first to externally validate the prognostic performance of the Pre-TACE-Predict and Post-TACE-Predict models compared with other TACE prognosis prediction models in an independent South Korean patient cohort with HCC. In the present study, the prognostic performances of the Pre-TACE-Predict and Post-TACE-Predict models were suboptimal, with one-year iAUCs of 0.685 and 0.659, respectively. When comparing model performances, Pre-TACE-Predict outperformed the Post-TACE-Predict model in multiple analyses, including Harrell's C index (0.6495 vs. 0.6314, respectively) and Gönen and Heller's K (0.6273 vs. 0.6236, respectively). In addition, the simpler scoring models, such as the mHAP-II and SNACOR models, demonstrated comparable or even better performances, with one-year iAUCs of 0.767 and 0.778, respectively, when compared to the somewhat complex equations of the Pre-TACE-Predict and Post-TACE-Predict models. This is quite different from the findings of Han et al., in whose study cohort the Pre- and Post-TACE-Predict models performed significantly better than other conventional models.
Although the initial goal of our study was to validate the Pre- and Post-TACE-Predict models, the prediction performances of the models were not satisfactory. When compared to one another, the performance of the Pre-TACE-Predict model appeared to be greater than that of the Post-TACE-Predict model. This may be due to underlying differences in the model equations because the Pre-TACE-Predict model includes variables for etiology, which may have influenced the results in our HBV-dominant cohort. Additionally, both Pre- and Post-TACE-Predict models performed similarly or worse when compared to well-validated, simple prognosis prediction models. Although numerous variables are used in the Pre- and Post-TACE-Predict models, it appears that the conventional tumor factors used in the simple scoring models are ultimately more effective in predicting survival. This finding is further supported by the fact that the performances of the Post-TACE-Predict and the SNACOR models were similar to those of models that did not incorporate the response after first TACE, indicating the significance of the conventional tumor factors used in the simple scoring models. Accordingly, it may be more clinically applicable for physicians to use simpler models, rather than the complex Pre- and Post-TACE-Predict models.
Our study had several strengths. First, the median OS of our cohort was 33.4 months, which was comparable to historical data from the international practice guidelines (i.e., approximately 40 months for patients with BCLC stage B) [21]. The similarity of survival with internationally accepted guidelines strengthens the generalizability of our results. In contrast, the cohort in which Han et al. established the Pre-TACE-Predict and Post-TACE-Predict models had a median OS of 19.9 months. A significant difference in tumor characteristics between the two studies may explain the different survival rates. In fact, the median tumor size in our cohort was 2.9 cm, which was smaller than that in the cohort studied by Han et al., which ranged from 3.0 to 8.5 cm, depending on the subgroup. At our institute, intermediate stage or advanced HCC patients with vascular invasion, significant tumor burden of more than 10 cm, or infiltrative HCC were more likely to be treated with transarterial radioembolization [22] or liver-directed concurrent chemoradiotherapy [23]. Consequently, TACE has been frequently applied to patients with nodular HCC presenting with a relatively low tumor burden at our institute. Our therapeutic strategy may have affected the prognostic accuracy of the Pre- and Post-TACE-Predict models, which rely on variables such as vascular invasion and bilirubin, which are usually more strongly associated with more advanced HCC.
Furthermore, our study highlights the need for updated guidelines to guide TACE treatment. The patients in our study, classified as BCLC stage B, were highly heterogeneous. Previous studies have proposed sub-classification of BCLC stage B HCC for detailed prognostication and corresponding treatment [24]. Bolondi et al. [25] proposed four sub-stages (B1-B4) with the addition of portal vein thrombosis and Child-Pugh score. Validation studies investigating the sub-classification reported statistically significant median survival differences between sub-stages, strengthening the need for updated guidelines and therapies for BCLC stage B. Although more evidence is required, patients determined as being early stage B should be downstaged and recommended curative treatment if possible. Furthermore, patients determined to have high-risk BCLC stage B and poor responders to TACE should be switched to targeted systemic therapies [26,27]. Additionally, although patients in our cohort had a smaller tumor burden, only 2% were determined to be stage 1 in the Pre-TACE-Predict model, which could indicate that the Pre-TACE-Predict model is not adequate for identifying low-risk HCC patients. Conversely, risk models, such as HAP and mHAP-II, exhibited a more even distribution between low- and high-risk patients in our study, which may support the potential benefit of these prognosis prediction models in determining treatment strategies for low-risk patients.
We also acknowledge several issues that remain unresolved. First, due to the retrospective nature of our study, it was not possible to restrict TACE to patients who strictly met guideline criteria. In fact, in our study, patients who were diagnosed with BCLC stage C underwent TACE treatment. Thus, due to the tendency to use TACE outside of the BCLC stage B criteria, we believe that a large-scale validation and comparison study of the Pre- and Post-TACE-Predict models with BCLC stage sub-analysis is necessary. Second, because all study participants were recruited from one institute, further studies are necessary to determine the applicability of the models to other institutes or countries. Third, the results could be strengthened by analysis of additional radiological or serological biomarkers that were not collected in our study. For example, additional analysis using the lectin Lens culinaris agglutinin binding glycoform of AFP (AFP-L3), which is derived only from cancer cells and has been considered specific to HCC [28,29], or diffusion-weighted imaging, which is effective in tumor response assessment after TACE [30-32], may provide new insights into the outcomes and response to TACE. Finally, the overall sample size of our study was not large. Thus, potential statistical errors may have impacted the results. In addition, the number of patients in the Pre-TACE-Predict model group 1 was low. This may have been due to the high percentage of patients with underlying HBV infection as well as high initial AFP levels. As such, further large-scale studies are necessary to validate our findings.

Conclusions
There was no significant difference in predictive performance between the Pre- and Post-TACE-Predict models in HCC patients. Additionally, simple scoring prognosis prediction models, such as mHAP-II and SNACOR, performed similarly to or better than the Pre- and Post-TACE-Predict models in our study. When considering the ease of application and better performance, models using a point system may be preferred over the Pre- and Post-TACE-Predict models in patients with HCC.