Comparative Analysis of Subclassification Systems in Patients with Intermediate-Stage Hepatocellular Carcinoma (Barcelona Clinic Liver Classification B) Receiving Systemic Therapy

Background: Intermediate-stage hepatocellular carcinoma (BCLC B HCC) occurs in a heterogeneous group of patients and can be addressed with a wide spectrum of treatments. Consequently, survival varies significantly among patients. In recent years, several subclassification systems have been proposed to stratify patients’ prognosis. We analyzed and compared these systems (Bolondi, Yamakado, Kinki, Wang, Lee, and Kim criteria) in patients undergoing systemic therapy. Methods: We considered 171 patients with BCLC B HCC treated with sorafenib as first-line systemic therapy in six Italian centers from 2010 to 2021 and retrospectively applied the criteria of six different subclassification systems. Results: Except for the Yamakado criteria, all the subclassification systems showed a statistically significant correlation with overall survival (OS). In the postestimation analysis, the Bolondi criteria (OS of subgroups 22.5, 11.9, and 6.6 mo, respectively; C-index 0.586; AIC 1338; BIC 1344) and the Wang criteria (OS of subgroups 20.6, 11.9, and 7.0 mo, respectively; C-index 0.607; AIC 1337; BIC 1344) presented the best accuracy. Further analyses of these two subclassification systems, modified to incorporate the prognostic factor of alpha-fetoprotein (AFP) > 400 ng/mL, showed an increase in accuracy for both systems (C-index 0.599 and 0.624, respectively). Conclusions: Intermediate-stage subclassification systems retain their predictive value in the setting of systemic therapy. The Bolondi and Wang criteria showed the highest accuracy. AFP > 400 ng/mL enhances the performance of these systems.


Introduction
Primary liver tumors are the third leading cause of cancer-related deaths worldwide. They represent the sixth most common cancer; the most frequent histological type is hepatocellular carcinoma (HCC) [1].
In Western countries, the most used staging system is the Barcelona Clinic Liver Cancer (BCLC), which provides an estimation of prognosis and treatment of choice [2].
According to this staging system, patients are divided into "very-early" and "early stage" (BCLC 0-A), who are eligible for curative treatment (such as surgery, transplantation, and percutaneous treatments); "intermediate stage" (BCLC B), who should undergo transarterial procedures; "advanced stage" (BCLC C), who are recommended to receive systemic therapy; and "terminal stage" (BCLC D), who should be managed with only supportive care.
In this system, the "intermediate stage" (BCLC B) is characterized by multinodular disease beyond the Milan criteria (single nodule ≤ 5 cm or up to three nodules all ≤ 3 cm), without radiological signs of macrovascular invasion or extra-hepatic spread. In addition, patients should present preserved liver function, based on the Child-Pugh score, and good general conditions, based on the Eastern Cooperative Oncology Group Performance Status (ECOG-PS 0) [2].
For these patients, the BCLC algorithm suggests transarterial chemoembolization (TACE) as the standard of care, but according to the concept of treatment stage migration, patients who fulfill the transplant criteria, or who do so after successful downstaging, may be eligible for surgery or transplantation. The candidacy for transplantation for HCC has been further extended with the validation of the Up-to-7 criteria. This new model, proposed by the same authors of the Milan criteria, showed a good 5-year overall survival if the sum of the number of tumor nodules and the size of the largest tumor was ≤7 at the time of transplantation [3].
On the other hand, BCLC B patients who are not amenable to, or are refractory to, locoregional treatments are referred for systemic therapy [2].
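As an illustration of the two tumor burden rules quoted above, the following minimal sketch encodes the Milan and Up-to-7 criteria exactly as they are worded in this text. The function names and the choice of centimeters as the unit for the largest diameter are assumptions made for the example; they are not taken from the cited sources.

```python
def within_milan(n_nodules: int, largest_cm: float) -> bool:
    """Milan criteria as quoted in the text: a single nodule <= 5 cm,
    or up to three nodules, all <= 3 cm (so the largest must be <= 3 cm)."""
    if n_nodules < 1 or largest_cm <= 0:
        raise ValueError("need at least one nodule with a positive diameter")
    if n_nodules == 1:
        return largest_cm <= 5
    return n_nodules <= 3 and largest_cm <= 3


def within_up_to_seven(n_nodules: int, largest_cm: float) -> bool:
    """Up-to-7 criteria as quoted in the text: number of nodules plus
    size of the largest tumor (assumed here to be in cm) is <= 7."""
    if n_nodules < 1 or largest_cm <= 0:
        raise ValueError("need at least one nodule with a positive diameter")
    return n_nodules + largest_cm <= 7


# Hypothetical patient: 3 nodules, largest 4 cm -> beyond Milan, within Up-to-7
print(within_milan(3, 4.0), within_up_to_seven(3, 4.0))
```

The hypothetical patient in the last line shows why the two rules differ: three nodules with a 4 cm dominant lesion exceed the Milan limits but still sum to 7.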
Based on its definition and the wide treatment possibilities, intermediate-stage HCC represents a very heterogeneous disease, and choosing the best treatment option could be challenging. For this reason, several subclassification systems have been proposed (Table 1) [4]. First, Bolondi et al. proposed subdividing intermediate HCC into four subgroups: stage B1 comprises patients within the Up-to-7 criteria, with preserved liver function (Child-Pugh 5-7) and ECOG-PS 0; stage B2 comprises patients beyond the Up-to-7 criteria, with Child-Pugh A5-6 and ECOG-PS 0; stage B3 comprises patients beyond the Up-to-7 criteria, with Child-Pugh B7 and ECOG-PS 0; stage B4 comprises patients with decompensated liver function (Child-Pugh B8-9) and/or mild impairment of cancer-related general condition (ECOG-PS 1) [5]. These subclassification criteria have been further investigated by several authors, with controversial results [6,7].
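The Bolondi substages described above can be sketched as a small decision function. This is only an illustrative reading of the definitions given in this paragraph, not the original authors' implementation; the function name, argument names, and the use of the numeric Child-Pugh score (5 to 9) are assumptions for the example.

```python
def bolondi_substage(within_up_to_7: bool, child_pugh_points: int, ecog_ps: int) -> str:
    """Assign a Bolondi substage (B1-B4) from the definitions in the text.

    within_up_to_7    -- True if the tumor burden is within the Up-to-7 criteria
    child_pugh_points -- numeric Child-Pugh score, 5 to 9 (A5-A6, B7-B9)
    ecog_ps           -- ECOG performance status, 0 or 1
    """
    if not 5 <= child_pugh_points <= 9:
        raise ValueError("Child-Pugh score outside the 5-9 range covered here")
    if ecog_ps not in (0, 1):
        raise ValueError("ECOG-PS outside the 0-1 range covered here")

    # B4: decompensated liver function (B8-9) and/or ECOG-PS 1
    if child_pugh_points >= 8 or ecog_ps == 1:
        return "B4"
    # B1: within Up-to-7 with Child-Pugh 5-7 and ECOG-PS 0
    if within_up_to_7:
        return "B1"
    # Beyond Up-to-7: B2 if Child-Pugh A5-6, B3 if Child-Pugh B7
    return "B2" if child_pugh_points <= 6 else "B3"


print(bolondi_substage(False, 7, 0))  # beyond Up-to-7, Child-Pugh B7, ECOG-PS 0
```

Checking the B4 conditions first keeps the remaining branches simple, since every other substage requires ECOG-PS 0 and a Child-Pugh score of 7 or lower.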
In the following years, novel subclassification systems for intermediate HCC have been proposed.
Yamakado et al. subdivided BCLC B HCC according to the number of lesions (up to four nodules), size of the largest nodule (up to 7 cm), and liver function (Child-Pugh A vs. B). Based on intra-hepatic tumor burden and liver function, patients were divided into four substages. Although the B1 stage had better survival than the later stages, no significant difference was observed among the consecutive stages [8]. Kudo et al. proposed the Kinki criteria, a simplified version of the Bolondi criteria (the Bolondi B2 and B3 stages are unified in the Kinki B2 stage). This subclassification provides more therapeutic strategies and recommends radical treatments as the first option for selected patients [9]. In the validation study, conducted by the same group of authors, a significant difference among consecutive subgroups was confirmed, but no significant difference was observed between BCLC A vs. B1 and BCLC B3 vs. C [10].
Wang et al. validated the Bolondi criteria and proposed a novel subclassification system, adding serum alpha-fetoprotein (AFP) levels as a prognostic factor. AFP > 200 ng/mL was considered as negatively related to survival. A significant difference in survival was reported among consecutive substages after the application of these modified criteria [11].
Lee et al. proposed a subclassification similar to that of Yamakado, based on tumor burden and liver function. This simplified version prioritizes tumor size (up to 5 cm of the largest nodule), dividing patients into three subgroups, with a significant difference among consecutive substages [12].
Kim et al. proposed a modification of the Bolondi subclassification system by using the Up-to-11 criteria instead of the Up-to-7 criteria for the tumor burden measurement. With this new substaging system, they achieved a significant difference in survival among consecutive substages following TACE [13].
Lastly, Kimura et al. proposed a novel subclassification system, dividing patients into three subgroups according to the Up-to-7 criteria and the combination of serum levels of AFP and des-γ-carboxy prothrombin (DCP) [14].
All of the previously cited subclassification systems proposed a treatment of choice for each intermediate substage. To the best of our knowledge, there are no studies investigating the accuracy of these subclassification criteria in predicting survival in patients with intermediate-stage HCC undergoing systemic therapy.
The aim of this study is to compare the prognostic accuracy of these subclassification systems in a large cohort of patients treated with systemic therapy for intermediate HCC.

Design of the Study
This study is a retrospective analysis, performed using medical records from a prospective multicenter registry concerning unresectable HCC patients treated with sorafenib as first-line systemic therapy. This database includes patients from six Italian centers (IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna; Ospedale degli Infermi, Faenza; Cardarelli Hospital, Naples; Papa Giovanni XXIII Hospital, Bergamo; Azienda Ospedaliero-Universitaria Pisana, Pisa; Humanitas Clinical and Research Center, Milan). Co-investigators from each participating center entered and updated data every 3-6 months. The coordinating center checked data for internal consistency.
For this study, we considered patients with intermediate-stage HCC (BCLC B) who started sorafenib from January 2010 to December 2021. The closing follow-up date was 31 July 2023, allowing an adequate follow-up period.
The decision to consider only a single drug (sorafenib) was made in order to obtain data from a homogeneous study population. Also, selecting sorafenib offered the dual advantage of recruiting a particularly large number of treated patients and having a long follow-up available (since sorafenib was licensed more than ten years ago). While these aspects might seem of marginal importance at first, especially when dealing with drugs which have been associated with a short survival, there are two elements which strongly supported our decision. First, intermediate-stage HCC patients represent a minority of the whole category of patients receiving systemic therapies, both in clinical trials and in real-life populations. Therefore, very large populations of patients who underwent systemic treatments are needed to obtain a fair number of intermediate-stage HCC patients. Second, BCLC B stage is a known favorable prognostic factor for patients receiving a systemic treatment. Both "ECOG-PS 0" and the composite variable "macrovascular invasion and/or extrahepatic spread" (conditions discriminating intermediate from advanced stage) are commonly used in clinical trials as stratification factors. Therefore, intermediate-stage HCC patients receiving systemic therapies usually experience prolonged survival compared to their advanced-stage counterparts. Consequently, longer follow-up periods are needed to fully explore factors associated with overall survival in this population.

Baseline, Subclassification, and Re-Evaluation
Baseline characteristics including sex, age, ECOG-PS, laboratory findings (including full blood cell count, coagulative parameters, serum creatinine, aspartate aminotransferase, alanine aminotransferase, total bilirubin, albumin, and AFP), and liver disease characteristics (etiology of the underlying liver disease, presence or absence of ascites, and hepatic encephalopathy) were available for all patients. A Child-Pugh score was calculated for each patient.
In all patients, a baseline contrast-enhanced CT scan of the thorax and abdomen was performed within 30 days before the start of sorafenib. Variables considered to describe tumor burden included: number of nodules, maximum tumor diameter, distribution of the nodules (unilobar vs. bilobar), presence or absence of biliary invasion, macrovascular invasion, and extrahepatic spread. All of this information was available for each patient.
Patients were subclassified according to the Bolondi, Yamakado, Kinki, Wang, Lee, and Kim criteria. Since serum DCP measurement was not part of our routine clinical practice, it was not possible to apply the Kimura subclassification.
Of note, patients with decompensated liver function (i.e., Child-Pugh ≥ B8) are not eligible for sorafenib prescription in Italy; consequently, no patient was classified in the Bolondi B4 or Kinki B3 substages.
Radiological re-evaluation for tumor response assessment was performed every 12 weeks. Treatment response was evaluated according to the Response Evaluation Criteria In Solid Tumours (RECIST) v1.1 [15].

Management of Sorafenib
Sorafenib was generally started at the usual dosage of 400 mg bid. Dose reduction or temporary discontinuation of treatment was allowed in case of intolerable adverse events. In case of (i) clinical and radiological progression of disease, (ii) severe toxicity, or (iii) significant liver function deterioration, sorafenib was permanently discontinued.

Statistical Analysis
Categorical and continuous variables were expressed as absolute and relative frequencies and as mean and standard deviation, respectively. The chi-squared test and the Student's t test were used for comparison between groups for categorical and continuous variables, respectively.
Overall survival (OS) was measured from the start of sorafenib treatment until patient death, the last follow-up visit, or the end of the follow-up period (whichever occurred first). The Kaplan-Meier method was used to estimate survival curves.
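For readers less familiar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of standard-library Python. This is a didactic sketch on invented data, not the software used for the analyses in this study; the function names and the toy follow-up times are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient (e.g., months from treatment start)
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Invented toy data: five patients, two of them censored
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(curve, median_survival(curve))
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why the toy median is reached only at the third death.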
Variables presenting a statistically significant correlation (p < 0.05) with OS in the univariate Cox analysis were included in a time-dependent covariate Log-rank test, in order to define the variables independently correlated with survival.
For each prognostic model, we tested both the discriminatory performances (i.e., the differences in survival across different stages) and the gradient monotonicity (i.e., the decreases in survival from the best to the worst stage).
The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were used to assess the discriminatory abilities. Lower AIC and BIC scores indicated a better goodness of fit of the score.
Harrell's concordance index (C-index) was used both as a further test for discriminatory ability and to evaluate the gradient monotonicity of the scores. Higher C-index scores indicated a better performance, with 0.7 being used as a threshold to define a good performance of the model.
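The two information criteria follow the standard textbook definitions, AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The sketch below simply evaluates these formulas; for Cox models the partial likelihood is normally used, and whether n counts patients or events differs between software packages, so the inputs shown are illustrative assumptions rather than values from this study.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L.

    n is the effective sample size; its definition for survival models
    (patients vs. observed events) is software-dependent.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return k * math.log(n) - 2 * log_likelihood


# Invented example: log-likelihood of -100 with 2 parameters and n = 100
print(aic(-100.0, 2), round(bic(-100.0, 2, 100), 2))
```

Because both criteria share the −2 ln L term, BIC exceeds AIC by k(ln n − 2), so it penalizes additional substages more heavily whenever n is larger than about 7.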
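To make the meaning of the C-index concrete, the following sketch computes Harrell's concordance for right-censored data in plain Python. It is a simplified illustration (pairs with tied survival times are skipped, and ties in the risk score count as one half), not the routine used for the analyses reported here; the function name and the toy data are assumptions.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i died and did so before the
    follow-up time of patient j. The pair is concordant when the patient
    who died earlier has the higher risk score (e.g., a more advanced
    substage). Pairs with identical times are ignored in this sketch.
    """
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("all inputs must have the same length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data: substage used as the risk score (3 = worst prognosis)
print(harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [3, 3, 2, 1]))
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering. Staging systems with only two or three categories produce many tied risk scores, each contributing one half, which pulls the index toward 0.5.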

Study Population
Out of the 741 patients included in the database, for this study we considered 171 patients (23.0%) with intermediate-stage HCC. Most patients were males (80.1%) and had underlying cirrhosis (95.9%). The mean age at the beginning of systemic therapy was 69.0 ± 9.1 years, and chronic viral infection was the etiology of the underlying liver disease in 70.8% of cases. The majority of patients (92.3%) presented preserved liver function (i.e., Child-Pugh A), while the remaining patients were all in the Child-Pugh B7 class. The baseline characteristics of the study population, focusing on variables of the subclassification systems, are summarized in Table 2.

Survival Analysis and Stratification According to Subclassification Systems
The univariate analysis of OS showed that all of the considered parameters concerning the intra-hepatic tumor burden were associated with worse prognosis (Table 3). The Up-to-7 criteria, Up-to-11 criteria, largest size nodule > 5 cm, and N4-S7 criterion were significantly related to OS, with HRs of 2.069, 1.422, 1.526, and 1.465, respectively. During sorafenib therapy, concomitant or sequential treatments were allowed. Among the study population, seven patients underwent TACE as palliative treatment to reduce tumor burden (all these patients experienced progression of the disease, but second-line systemic therapy was not available at that time); only one patient presented an objective response leading to conversion to liver transplantation.
After the application of the different subclassification systems, survival analyses were performed for each substage (Table 4 and Figure 1).

Postestimation Analysis of Subclassification Systems
The postestimation analysis for the accuracy of subclassification systems in predicting survival showed that Harrell's C-index ranged from 0.560 to 0.607. All the subclassification systems presented a similar C-index of 0.563 ± 0.003, with the exception of the Bolondi and Wang criteria, showing values of 0.586 and 0.607, respectively (Table 5).
The AIC analysis confirmed the higher performance of the Bolondi and Wang criteria (1338 and 1337, respectively); the BIC analysis further confirmed the superiority of these two systems, without differences in prognostic performance (1344 for both criteria).

Evaluation of Subclassification Systems According to Alpha-Fetoprotein
Based on the aforementioned results showing the superiority of the Bolondi and Wang subclassification systems, we further scrutinized these two systems. According to the literature on prognostic factors for HCC undergoing systemic therapy, we increased the AFP cut-off to 400 ng/mL. This threshold (n = 46, 26.9% of the study population) confirmed a statistically significant correlation with survival (OS 7.1 vs. 17.4 mo, HR 1.898, p < 0.001). Hence, we stratified patients adopting this AFP value for both the Wang and Bolondi criteria (Table 6 and Figure 2). For the Wang subclassification system, after this modification, the median overall survival of the mB1, mB2, and mB3 substages was 22.5, 15.2, and 6.6 months, respectively, and the statistical significance of the subclassification system was maintained. Moreover, the postestimation analysis showed that the modified Wang criteria had a better performance than the original ones (C-index 0.624; AIC 1331; BIC 1337).
For the Bolondi subclassification system, we first stratified each substage according to AFP (Supplementary Table S1 and Supplementary Figure S1). Following these preliminary analyses, we divided patients into two groups: mB1 (Up-to-7 in or Up-to-7 out and Child-Pugh A and AFP < 400 ng/mL) and mB2 (Up-to-7 out and Child-Pugh B and/or AFP > 400 ng/mL). The median overall survival of the mB1 and mB2 substages was 19.5 and 6.6 months, respectively, with a maintained statistical significance. In the postestimation analysis, the modified Bolondi criteria showed a better performance than the original version (C-index 0.599; AIC 1335; BIC 1338).

Discussion
Among the stages proposed by the BCLC system, the intermediate stage suffers from the highest heterogeneity. This limitation has recently been acknowledged by the BCLC creators themselves, and the latest update of this system proposed different treatments for the intermediate stage, ranging from liver transplantation to systemic therapy [2].
Consequently, according to the tumor burden, liver function, and treatment choice, a patient's prognosis could range from a few months to several years. For this reason, subclassification systems have been proposed in order to better predict prognosis and to define treatment proposals tailored to the characteristics of these patients.
As mentioned above, BCLC B patients could undergo systemic therapy if they are considered not suitable for locoregional treatments. As a group, these patients have a longer overall survival compared to advanced patients (BCLC C) [16], but the individual prognosis can vary remarkably. Although several prognostic systems for advanced-stage HCC undergoing systemic therapy have been proposed [17], to our knowledge, no studies have investigated this topic in BCLC B patients.
With the exception of the Yamakado criteria, all of the available subclassification systems showed a significant difference in survival among the groups, confirming their predictive value in the setting of intermediate-stage HCC treated with systemic therapy. However, all of these systems showed a low level of accuracy.
The systems all consider different variables, but they generally concern (i) tumor burden, (ii) liver function, and (iii) serum tumor markers. These choices are consistent with the univariate analysis results.
In the Log-Rank analysis, the Up-to-7 criteria showed the best correlation with survival among the tumor burden variables. Liver function, assessed with the Child-Pugh score, did not reach statistical significance in our study, probably due to the small sample size of the Child-Pugh B group (only 13 patients). AFP, especially after adopting the cut-off of 400 ng/mL, also represented a statistically significant predictor of worse survival. In fact, in the postestimation analysis of the two subclassification systems with the best prognostic accuracy (i.e., the Bolondi and Wang criteria), the use of this new AFP threshold improved their prognostic accuracy.
Despite our data coming from a large multicenter prospective database, the sample size is still limited and the analyses are retrospective. Moreover, the small number of patients in the more advanced substages (generally characterized by compromised baseline liver function, which often makes patients unsuitable for systemic therapies) could be a statistical issue.
Lastly, considering the period of patients' enrollment, our data depict a scenario in which second/third-line therapies did not concur in defining the final outcome of sorafenib therapy. The subsequent advent of different first-line therapies [18], further-line therapies [19][20][21], and immunotherapy [22] has deeply changed the management of HCC patients.
In recent years, authors have stressed the concept of TACE failure/refractoriness and TACE unsuitability [23,24]. In both cases, the general consensus is an early switch to systemic therapy. Several ongoing trials are exploring the role of systemic therapy with TACE as sequential therapy, combination therapy, or conversion therapy for intermediate-stage HCC patients [25][26][27]. Thus, the treatment strategy for BCLC B HCC is rapidly evolving and patients' survival will probably be further improved.
Moreover, more recent systemic treatments such as lenvatinib and the atezolizumab/bevacizumab combination may alter the scenario of intermediate-stage HCC. Compared with sorafenib, they have a higher objective response rate according to RECIST 1.1 (27% for atezolizumab/bevacizumab and 21% for lenvatinib) [18,22]. Objective response in intermediate-stage patients could lead to an inverse-stage migration from systemic to locoregional treatments or even to surgical resection or liver transplantation in the case of deep responses. These conversion strategies are currently a hot topic in hepatic oncology [28,29] and can drastically improve the survival chances of intermediate-stage HCC patients receiving systemic drugs.
Therefore, the prognostic accuracy of the subclassification systems of the BCLC B stage needs to also be assessed in this new treatment scenario in order to give clinicians a benchmark for the prognostic stratification of BCLC B patients before starting systemic therapy.

Conclusions
The available subclassification systems for intermediate-stage HCC retain their ability to predict survival in the setting of systemic therapy. Among the analyzed systems, the Bolondi and Wang criteria showed the highest level of performance in postestimation analyses, and their prognostic accuracy was improved when adopting an AFP cut-off value of 400 ng/mL instead of 200 ng/mL.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/curroncol31010038/s1, Table S1: Stratification of overall survival according to the subclassification systems; Figure S1: Kaplan-Meier curves of overall survival according to Bolondi criteria stratified for alpha-fetoprotein.

Figure 1. Kaplan-Meier curves of overall survival according to intermediate-stage hepatocellular carcinoma subclassification systems.

Figure 2. Kaplan-Meier curves of overall survival according to the modified Bolondi and Wang subclassification systems.

Table 1. Proposed subclassification systems for intermediate hepatocellular carcinoma.

Table 2. Baseline characteristics of patients with intermediate-stage HCC (BCLC B).

Table 3. Baseline characteristics; univariate Cox regression analysis for overall survival.

Table 4. Stratification of overall survival according to the subclassification systems.

Table 6. Stratification of overall survival according to the modified Bolondi and Wang subclassification systems.
