Suboptimal Performance of Hepatocellular Carcinoma Prediction Models in Patients with Hepatitis B Virus-Related Cirrhosis

This study aimed to evaluate the predictive performance of pre-existing, well-validated hepatocellular carcinoma (HCC) prediction models in patients with HBV-related cirrhosis who started potent antiviral therapy (AVT). We retrospectively reviewed the cases of 1339 treatment-naïve patients with HBV-related cirrhosis who started AVT (median follow-up period, 56.8 months). The scores of the pre-existing HCC risk prediction models were calculated at the time of AVT initiation. HCC developed in 211 patients (15.1%), and the cumulative probability of HCC development at 5 years was 14.6%. Multivariate Cox regression analysis revealed that older age (adjusted hazard ratio [aHR], 1.023), lower platelet count (aHR, 0.997), lower serum albumin level (aHR, 0.578), and higher liver stiffness (LS) value (aHR, 1.012) were associated with HCC development. Harrell’s c-indices of the PAGE-B, modified PAGE-B, modified REACH-B, CAMD, aMAP, HCC-RESCUE, AASL-HCC, Toronto HCC Risk Index, PLAN-B, APA-B, CAGE-B, and SAGE-B models were suboptimal in patients with HBV-related cirrhosis, ranging from 0.565 to 0.667. Nevertheless, almost all patients were well stratified into low-, intermediate-, or high-risk groups according to each model (all log-rank p < 0.05), except for HCC-RESCUE (p = 0.080). Since all low-risk patients had cirrhosis at baseline, they had a non-negligible cumulative incidence of HCC development (5-year incidence, 4.9–7.5%). Pre-existing risk prediction models for patients with chronic hepatitis B showed suboptimal predictive performance for the assessment of HCC development in patients with HBV-related cirrhosis.

Although patients with cirrhosis are up to more than 10 times as likely to develop HCC as those without cirrhosis, there remains an unmet need for optimized models that allow earlier intervention in this population. Moreover, no study has assessed the performance of recently validated HCC risk prediction models in such a population. HCC prediction models have so far generally incorporated cirrhosis itself, or surrogate markers suggestive of cirrhosis (mostly based on routine ultrasonography, clinical parameters, and non-invasive fibrosis measurements), as major integral components. It therefore remains undetermined whether their reliable predictive performance is maintained in a population with HBV-related cirrhosis.
Therefore, using a cohort with HBV-related cirrhosis, we aimed to evaluate the predictive performance of pre-existing well-validated HCC prediction models established in the era of potent AVT.

Study Design and Participants
Patients with cirrhosis who initiated ETV or TDF as the first-line AVT for treatment-naïve CHB between 2007 and 2018 at Yonsei University Severance Hospital, Gangnam Severance Hospital, and Yongin Severance Hospital were retrospectively reviewed. The inclusion criteria were as follows: (1) adult patients aged ≥ 19 years, (2) who were AVT-naïve, and (3) who had a reliable baseline liver stiffness (LS) value measured using transient elastography (TE). The exclusion criteria were as follows: (1) absence of cirrhosis, (2) history of HCC at enrollment, (3) decompensated cirrhosis with Child-Pugh class C at enrollment, (4) co-infection with other hepatitis viruses or human immunodeficiency virus, (5) history of organ transplant, (6) HCC development within 6 months of AVT initiation, and (7) other significant comorbidities (e.g., end-stage kidney disease, uncontrolled heart failure, pulmonary hypertension, and life-threatening autoimmune disease) (Figure S1). AVT was initiated according to the practice guidelines of the Korean Association for the Study of the Liver and the reimbursement guidelines of the National Health Insurance Service of the Republic of Korea (ROK). Cirrhosis was diagnosed histologically or clinically as follows: (1) a platelet count <150,000/µL and ultrasonographic findings suggestive of cirrhosis, including a blunted, nodular liver edge accompanied by splenomegaly (length > 12 cm), or (2) clinical signs of portal hypertension such as gastroesophageal varices [29].
The study protocol was in accordance with the ethical guidelines of the 1975 Declaration of Helsinki and was approved by the institutional review board in each medical center.

HCC Surveillance
Patients underwent routine laboratory tests, including assays of serum HBV-DNA levels, as well as liver imaging studies (e.g., ultrasonography or computed tomography) at approximately 6-month intervals after initiating AVT to screen for HCC and portal hypertension-related complications. LS was measured using TE (FibroScan®, EchoSens, Paris, France), and was considered to be reliable when the procedure was performed in a standard manner with at least 10 valid measurements, a success rate of at least 60%, and an interquartile range (IQR)-to-median ratio of <30% [30].
The primary outcome was the development of HCC. HCC was diagnosed based on histological evidence or dynamic computed tomography and/or magnetic resonance imaging findings (nodules > 1 cm with arterial hypervascularity and portal-/delayed-phase washout) [31].
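The three TE reliability criteria described above can be expressed as a simple check. The following is a minimal sketch, not the device's own algorithm: the function name `te_is_reliable` and its arguments are hypothetical, and the quartile interpolation rule used for the IQR is an assumption, since the text does not state how the IQR is computed.

```python
from statistics import median, quantiles


def te_is_reliable(valid_kpa, total_attempts):
    """Check the three reliability criteria for one TE examination.

    valid_kpa      -- list of valid LS measurements in kPa
    total_attempts -- number of attempted measurements (valid + invalid)
    """
    if total_attempts < len(valid_kpa):
        raise ValueError("total_attempts cannot be smaller than the valid count")

    # Criterion 1: at least 10 valid measurements.
    if len(valid_kpa) < 10:
        return False

    # Criterion 2: success rate (valid / attempted) of at least 60%.
    success_rate = len(valid_kpa) / total_attempts

    # Criterion 3: IQR-to-median ratio below 30%.
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, _, q3 = quantiles(valid_kpa, n=4, method="inclusive")
    iqr_to_median = (q3 - q1) / median(valid_kpa)

    return success_rate >= 0.60 and iqr_to_median < 0.30
```

For example, ten tightly clustered readings around 10 kPa obtained in 12 attempts pass all three criteria, whereas the same readings obtained in 20 attempts fail on the success rate.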

Calculation of HCC Risk Scores from Prediction Models
The scores of pre-existing HCC risk prediction models were calculated at the time of AVT initiation to predict HCC development after 6 months of AVT use. These models included PAGE-B [16], modified PAGE-B [17], modified REACH-B [18], CAMD [19], aMAP [32], Toronto HCC Risk Index (THRI) [33], AASL-HCC [14], HCC-RESCUE [34], PLAN-B [35], and APA-B (in patients with alpha-fetoprotein [AFP] results) [36]. In general, CAGE-B and SAGE-B are calculated using the LS value, stabilized after 5 years of AVT [20,21]. However, considering that the LS value significantly improves after 1 year of AVT [37], CAGE-B and SAGE-B scores were also calculated using the LS value measured after 1 year of AVT in the patient group with follow-up TE results, and their performances were compared with those of the other models. Therefore, CAGE-B and SAGE-B were calculated at the time of AVT initiation to predict HCC development after 18 months of AVT use. The list of these models and the risk stratification are summarized in Table S1. Patients were stratified into the low-, intermediate-, and high-risk groups according to the previous studies that introduced each prediction model [14,16–21,32–36].
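As an illustration of how such a score and its risk strata are computed, the sketch below implements PAGE-B. The point values and cut-offs are not stated in this article; they are taken here as assumptions from the original PAGE-B report [16] and should be verified against Table S1 before any use. The function names are hypothetical.

```python
def page_b_score(age_years, is_male, platelets_per_ul):
    """PAGE-B score from age, sex, and platelet count (assumed point values)."""
    if age_years < 16:
        raise ValueError("PAGE-B is assumed to apply to patients aged >= 16 years")

    # Age points: 0 for 16-29 years, then +2 per decade, capped at 10 for >= 70.
    if age_years < 30:
        age_points = 0
    elif age_years < 40:
        age_points = 2
    elif age_years < 50:
        age_points = 4
    elif age_years < 60:
        age_points = 6
    elif age_years < 70:
        age_points = 8
    else:
        age_points = 10

    # Sex points: 6 for male, 0 for female.
    sex_points = 6 if is_male else 0

    # Platelet points: 0 for >= 200,000/uL, 6 for 100,000-199,999/uL, 9 below.
    if platelets_per_ul >= 200_000:
        platelet_points = 0
    elif platelets_per_ul >= 100_000:
        platelet_points = 6
    else:
        platelet_points = 9

    return age_points + sex_points + platelet_points


def page_b_risk_group(score):
    """Risk stratum (assumed cut-offs: <= 9 low, 10-17 intermediate, >= 18 high)."""
    if score <= 9:
        return "low"
    if score <= 17:
        return "intermediate"
    return "high"
```

Under these assumed values, a patient aged 50 to 59 years with a platelet count of 100,000 to 199,999/µL already scores 12 points before sex is considered, which places the patient in at least the intermediate-risk group.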

Statistical Analysis
Continuous variables were expressed as medians (IQRs), and categorical variables were expressed as numbers (percentages). The statistical differences between the two groups were evaluated using Student's t test or the Mann-Whitney U test for continuous variables, depending on their distribution, and using the chi-squared test or Fisher's exact probability test for categorical variables. The cumulative risk of HCC development was assessed by the Kaplan-Meier method. Patients were censored when they ended the follow-up, died without developing HCC, or developed malignant diseases other than HCC. Univariate and subsequent multivariate Cox regression analyses assessed the potential risk factors and their independent associations with HCC development, respectively, by calculating the hazard ratio (HR) and 95% confidence interval (CI).
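The Kaplan-Meier estimate of cumulative HCC incidence at a given time point can be sketched as follows. This is a minimal product-limit implementation for illustration only, not the R code used in the study; the function name and the toy follow-up data in the usage note are hypothetical. Censored patients (end of follow-up, death without HCC, or other malignancy) are coded as 0 in `events`.

```python
from collections import Counter


def km_cumulative_incidence(times, events, horizon):
    """Return 1 - S(horizon) using the Kaplan-Meier product-limit estimator.

    times   -- follow-up time of each patient (e.g., months)
    events  -- 1 if HCC developed at that time, 0 if the patient was censored
    horizon -- time point at which cumulative incidence is evaluated
    """
    hcc_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # events and censorings both leave the risk set

    at_risk = len(times)
    survival = 1.0
    for t in sorted(leaving_at):
        if t > horizon:
            break
        if hcc_at[t]:
            # Multiply by the conditional probability of remaining HCC-free at t.
            survival *= 1.0 - hcc_at[t] / at_risk
        at_risk -= leaving_at[t]
    return 1.0 - survival
```

With six hypothetical patients followed for 12, 24, 30, 36, 60, and 60 months, of whom the first, third, and fifth developed HCC, the estimate at 60 months is 1 − (5/6)(3/4)(1/2) = 0.6875.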
The predictive performance of the risk scoring models for HCC development was assessed using Harrell's c-indices, time-dependent areas under the receiver operating characteristic curve (TDAUCs) at 3, 5, and 8 years from the date of AVT initiation, and the integrated area under the receiver operating characteristic curve (iAUC) over 8 years. These time points were chosen because few patients were followed up for >8 years after initiating AVT. Statistical differences in the parameters for predictive performance between the model with the highest iAUC and the other HCC risk prediction models were evaluated using the bootstrap method, with re-sampling done 1000 times. If the 95% CI of the difference contains zero, there is no significant difference in the parameter for predictive performance between the two models.
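Harrell's c-index and a bootstrap comparison of two models can be sketched as below. This is an illustrative implementation under stated assumptions, not the procedure used in the study: pairs tied in follow-up time are ignored, the CI is a simple percentile interval, and the comparison is shown for the c-index rather than the iAUC. All function names are hypothetical.

```python
import random


def harrell_c_index(times, events, scores):
    """Fraction of comparable pairs in which the higher score had the earlier event.

    A pair (i, j) is comparable when patient i had an event and a strictly
    shorter follow-up time than patient j. Tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index_difference(times, events, scores_a, scores_b,
                                 n_boot=1000, seed=0):
    """Percentile 95% CI of c-index(A) - c-index(B) over bootstrap resamples."""
    harrell_c_index(times, events, scores_a)  # fail early if data are unusable
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        a = [scores_a[k] for k in idx]
        b = [scores_b[k] for k in idx]
        try:
            diffs.append(harrell_c_index(t, e, a) - harrell_c_index(t, e, b))
        except ValueError:
            continue  # resample without comparable pairs; draw again
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper
```

If the interval returned by `bootstrap_c_index_difference` contains zero, the two models would be regarded as not significantly different under this sketch, mirroring the decision rule described above.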
To calculate the PLAN-B model, we used the Python programming language (version 3.11; Python Software Foundation, Wilmington, DE, USA) and accessed the shared source code that is available online at https://github.com/vitaldb/planb/blob/main/predict.ipynb (accessed on 25 November 2022) [35]. All statistical analyses were conducted using R software (version 4.2.1; http://cran.r-project.org/, accessed on 15 August 2022). Two-sided p values < 0.05 were considered to be statistically significant.
During a median follow-up period of 56.8 (IQR 35.6-75.3) months, HCC developed in 211 (15.1%) patients (3.41 per 100 patient-years) and the cumulative 3-, 5-, and 8-year probabilities of HCC development were 7.4%, 14.6%, and 31.7%, respectively. Patients who developed HCC were significantly older (55 vs. 53 years) and had higher HBeAg positivity (47.4% vs. 36.7%), lower platelet count and serum albumin level, and higher baseline and follow-up LS values (14.3 vs. 10.3 kPa and 11.8 vs. 8.7 kPa, respectively) than those without HCC (Table 2). The median scores for the pre-existing predictive models for HCC development were significantly higher in patients who developed HCC than in those who did not (Table 2).
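The incidence rate quoted above is the number of HCC events divided by the total observed follow-up time. The sketch below shows the arithmetic; the function names are hypothetical, and the total follow-up time is not reported in the text, so the value obtained here is only back-calculated from the reported figures.

```python
def incidence_per_100_patient_years(n_events, patient_years):
    """Incidence rate expressed per 100 patient-years of follow-up."""
    return 100.0 * n_events / patient_years


def implied_patient_years(n_events, rate_per_100_py):
    """Total follow-up time implied by an event count and a reported rate."""
    return 100.0 * n_events / rate_per_100_py


# Back-calculation from the reported 211 events at 3.41 per 100 patient-years.
follow_up_py = implied_patient_years(211, 3.41)
```

Under this back-calculation, 211 events at 3.41 per 100 patient-years correspond to roughly 6,190 patient-years of observation.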

Risk Stratification in Cirrhotic Patients with CHB
Patients were stratified into low-, intermediate-, and high-risk groups according to the models, which showed that the risk of HCC development increased in the high-risk group of each model (all log-rank p < 0.05) (Figure 1). More than 10% of patients were stratified into the low-risk group according to the modified REACH-B, PLAN-B, APA-B, and SAGE-B (13.8-24.5%), and their risk was significantly lower, or tended to be lower, than that in the intermediate- and high-risk groups (all log-rank p < 0.05, except for APA-B [p = 0.050]). However, these patients also showed a high cumulative incidence of HCC (5-year incidence, 4.9%-7.5%), even when stratified into the low-risk group (Table 5).

On-Treatment LS Value in Cirrhotic Patients with CHB
The baseline characteristics of patients who had TE data after 1 year of AVT and did not develop HCC within 18 months after AVT (n = 808) are summarized in Table S4. The median value of on-treatment LS was 8.8 kPa. Patients with an on-treatment LS value ≥8.8 kPa had a higher risk of HCC development than the others (unadjusted hazard ratio = 2.252, 95% CI, 1.500-3.383, p < 0.001). The 2-, 3-, 5-, and 8-year cumulative incidences of HCC development were 1.6%, 3.6%, 11.0%, and 23.7% in patients with on-treatment LS value <8.8 kPa, respectively, and 3.6%, 10.5%, 19.9%, and 56.0% in patients with on-treatment LS value ≥8.8 kPa, respectively (log-rank p < 0.001).

Discussion
To date, several risk-scoring systems have been proposed to predict the development of HCC in patients with CHB. In the current era of potent AVT, in which viral replication can be easily suppressed, most of the recently established systems adopted the presence of baseline cirrhosis or fibrotic burden, and generally demonstrated high negative predictive values to exclude HCC development within about 10 years [38]. However, because cirrhosis itself is a strong predictor [39], the predictive power of the proposed models is expected to decrease somewhat in patients with cirrhosis, who share a common fibrotic burden [40].
In the present study, age, platelet count, serum albumin level, HBeAg positivity, and LS value were independently associated, or tended to be associated, with HCC development in patients with HBV-related cirrhosis. However, regardless of the presence of cirrhosis as a component in the scoring system, several of the models introduced, partially based on these factors, showed attenuated predictive performance for HCC development in the subgroup with HBV cirrhosis (all Harrell's c-index and iAUC < 0.7). These findings are similar to those of previous studies that have attempted to develop predictive models for patients with HBV cirrhosis. Cheng et al. [41] reported that the predictive performance of CU-HCC, PAGE-B, modified PAGE-B, and their suggested HCC-nomogram using albumin-bilirubin score at 1 year of AVT in 277 treatment-naïve patients with HBV cirrhosis was very limited (0.505-0.611). Nam et al. [42] also reported that the PAGE-B, CU-HCC, HCC-RESCUE, ADRESS-HCC, mPAGE-B, and THRI models showed very poor performance (c-index of all models < 0.6) in 424 patients, compared to that of their suggested deep neural network model (c-index: 0.782). In contrast, Huang et al. [43] demonstrated that the GAG-HCC, REACH-B, and TW1 models showed acceptable AUCs (0.747-0.797) by 5 years after AVT; however, that study might be insufficient to reflect the realities of the current era because of the relatively small number of participants (n = 226), who were treated with lamivudine or adefovir.
Patients with HBV cirrhosis have a higher risk of HCC than those without cirrhosis [39]. Since most of the patients in our study were clinically diagnosed with cirrhosis using ultrasonography and clinical parameters, cirrhosis may have been over-estimated when compared to diagnosis by non-invasive fibrosis tests, such as TE, Fibrotest, or the enhanced fibrosis test [44]. However, most participants were stratified into moderate- or high-risk groups by most scoring systems. Consistent with this, the observed annual incidence of HCC (3.41 per 100 patient-years) was higher than the recommended criterion for the biannual HCC surveillance strategy (≥1.5% in cirrhosis) [45]. Moreover, even though a sizeable proportion of patients (>10% of the total) were classified as low-risk by the models that did not have cirrhosis components in their equations (e.g., modified REACH-B, APA-B, and SAGE-B), they showed a non-negligible 5-year cumulative incidence of HCC (6.7-7.5%). This was quite different from the previously reported 5-year cumulative HCC incidences (<1.0%) in low-risk patients with CHB, regardless of the presence of cirrhosis. Even patients with an LS value that improved to less than 8.8 kPa after 1 year of AVT showed a high 5-year cumulative incidence rate (11.0%). These findings indicate that the candidates needing HCC surveillance, along with the optimal methods in terms of diagnostic modalities and/or interval among the so-called "at-risk" population, should not be determined solely based upon HCC prediction models.
In the present study, modified REACH-B, which uses the LS component, showed a significantly higher, or a tendency toward a higher, c-index and iAUC than the other models. However, the models using the LS value (modified REACH-B, CAGE-B, and SAGE-B) did not consistently show higher TDAUCs at 1, 2, and 3 years after AVT initiation. Considering that patients with liver cirrhosis are at risk of developing HCC even within a relatively short period of follow-up, the superiority of a model cannot be readily determined, even if the c-index or integrated AUC of the modified REACH-B is high.
Notably, multivariate Cox regression analyses revealed that the known risk factors for HCC development in patients with CHB on AVT, such as old age, low platelet count, low serum albumin level, and high LS value by TE [22], were still independently associated with HCC development in patients with HBV cirrhosis. Moreover, patients who showed a very high cumulative incidence of HCC development were classified as a high-risk group by the models containing all or some of these risk factors, such as modified PAGE-B, modified REACH-B, CAGE-B, and SAGE-B (5-year: 15.5-24.0%, and 8-year: 42.4-52.8%). Therefore, even though cirrhosis itself can degrade the discriminating power of the variables constituting the existing predictive models in the HBV-related cirrhosis group, patients with cirrhosis who are older, have low platelet counts, or show high LS values should undergo stricter surveillance for HCC development, compared to those without cirrhosis.
The present study has several limitations. First, the findings were potentially subject to selection bias owing to the retrospective nature of the study. To mitigate this limitation, the study was conducted using three tertiary referral hospital-based cohorts with a statistically reliable sample size and follow-up duration. Second, since we primarily adopted diagnostic criteria for cirrhosis based upon ultrasonography findings and platelet count, a significant number of mild cases may have been missed. Conversely, some of the enrolled patients had a low LS value, despite being diagnosed using the above criteria. Thus, another kind of potential selection bias might have occurred. Further studies, based upon more accurate diagnostic modalities, are required to overcome this issue. Third, this study did not suggest a novel risk model for HCC development in patients with HBV cirrhosis. A recently proposed deep learning model, using previously known risk factors, showed acceptable predictive power for HCC development in patients with HBV cirrhosis (c-index, 0.719-0.782); however, it did not represent an intuitive formula [42]. Fourth, the evaluation of new biomarkers for chronic HBV infection (e.g., quantitative HBV surface antigen, serum HBV RNA, hepatitis B core-related antigen, or specific HBV mutants) was limited because of the retrospective nature of our study [46–48]. Likewise, the role of other metabolic factors should be assessed in further studies [11,12,49,50]. Finally, the present study cannot clarify whether this phenomenon is specific to HBV or also present in other etiologies. However, theoretically, since "cirrhosis" itself has been emphasized as one of the most important prognostic factors in most HCC prediction models so far, and its discriminatory ability must be statistically offset in a cohort with cirrhosis, we cautiously speculate that a similar phenomenon might be observed in patients with other chronic liver diseases. Further studies are required to address this issue.

Conclusions
In conclusion, the existing risk prediction models for patients with CHB showed suboptimal predictive performances for assessing HCC development in patients with HBV cirrhosis. These cirrhotic patients with CHB should undergo strict HCC surveillance, regardless of whether they have known risk factors for HCC development.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics13010003/s1, Figure S1: Flowchart of the patients' selection. Table S1: Summary of HCC prediction models [14,16,17,19,20,32–36,51]. Table S2: Univariate Cox regression analysis for the development of hepatocellular carcinoma. Table S3: Comparison of predictive performance between the modified REACH-B and other HCC risk prediction models. Table S4: Baseline clinical characteristics of the study population who underwent transient elastography after 1 year of antiviral therapy and did not develop HCC within 18 months after antiviral therapy.

Informed Consent Statement:
Patient consent was waived due to the retrospective nature of this study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to patient privacy concerns.

Conflicts of Interest:
The authors declare no conflict of interest.