A Statistical Approach to the Diagnosis and Prediction of HCC Using CK19 and Glypican 3 Biomarkers

Various statistical models predict the probability of developing hepatocellular carcinoma (HCC) in patients with cirrhosis, with GALAD being one of the most extensively studied scores. Biomarkers like alpha-fetoprotein (AFP), AFP-L3, and des-γ-carboxyprothrombin (DCP) are widely used alone or in conjunction with ultrasound to screen for HCC. Our study aimed to compare the effectiveness of Cytokeratin 19 (CK19) and Glypican-3 (GPC3) as standalone biomarkers and in a statistical model to predict the likelihood of HCC. We conducted a monocentric prospective study involving 154 participants with previously diagnosed liver cirrhosis, divided into two groups: 95 patients with confirmed HCC based on clinical, biological, and imaging features and 59 patients without HCC. We measured the levels of AFP, AFP-L3, DCP, GPC3, and CK19 in both groups. We used univariate and multivariate statistical analyses to evaluate the ability of GPC3 and CK19 to predict the presence of HCC and incorporated them into a statistical model, the GALKA score, which was then compared to the GALAD score. AFP performed better than AFP-L3, DCP, GPC3, and CK19 in predicting the presence of HCC in our cohort. Additionally, GPC3 outperformed CK19. Multivariate analysis was used to formulate the GALKA score from the retained predictors: 0.005 × AFP-L3 + 0.00069 × AFP + 0.000066 × GPC3 + 0.01 × CK19 + 0.235 × serum albumin − 0.277. The optimal cutoff was >0.32 (AUROC = 0.98, sensitivity 96.8%, specificity 93%, positive predictive value 95.8%, negative predictive value 94.8%). The GALKA score had a similar predictive value to the GALAD score for the presence of HCC. In conclusion, AFP, AFP-L3, and DCP were the best biomarkers for predicting the likelihood of HCC. Our score performed well overall and was comparable to the GALAD score.


Introduction
Liver cancer is a significant global health concern, causing 830,000 deaths annually [1]. It is the fifth most common cancer, with 854,000 new cases yearly [2]. Hepatocellular carcinoma (HCC) represents over 90% of primary liver malignant tumors. Approximately 75% of all worldwide diagnosed malignant liver tumors occur in Asia, with China accounting for 50% of the global total [3]. The etiology is different around the world; in eastern Europe, the leading causes are alcohol (53%) and hepatitis C virus (HCV) (24%) [4].
Cirrhosis is a significant risk factor for HCC [4]; the risk of developing HCC for patients with cirrhosis ranges from 1 to 8% per year [5], and approximately one-third of cirrhotic patients will develop HCC during their lifetime [6]. One-quarter of cirrhosis patients die of HCC because tumors are frequently detected at an advanced stage when treatment options are limited [7]. In addition to prevention, early detection could be [...]
The presence of HCC in the control group was ruled out using ultrasound performed by an expert in the field. In cases where there was a suspicion of HCC, further diagnostic tests such as contrast-enhanced ultrasound (CEUS), contrast-enhanced computed tomography (CE-CT), or contrast-enhanced magnetic resonance imaging (CE-MRI) were performed.
The inclusion criteria for both the HCC and control groups were a positive diagnosis of cirrhosis, signed informed consent, and age over 18 years. In addition, patients with HCC had to be newly diagnosed, whether at a palliative or curative stage, and had to have received no specific treatment for HCC. The use of oral anticoagulants was an exclusion criterion for both groups.
The diagnosis of liver cirrhosis was established based on clinical signs, liver elastography (FibroScan®, Echosens™, Paris, France), and/or biological scores (FibroTest-ActiTest, FibroMax). HCC diagnosis was established based on the European Association for the Study of the Liver (EASL) guideline [4]: a focal liver lesion showing hypervascularity in the arterial phase and washout on the portal venous and/or delayed phases using CEUS, CE-CT, or CE-MRI [9]. The Barcelona Clinic Liver Cancer (BCLC) staging method was used to classify HCC [9].

Biomarkers Assessment
Serum samples were collected by obtaining 10 mL of whole blood using either EDTA or heparin as an anticoagulant. After centrifugation for 15 min at 1000 × g at 2–8 °C within 30 min, serum samples were collected. To prevent repeated freeze-thaw cycles, serum samples were stored at −80 °C until biomarkers were assessed.

Statistical Method
MedCalc Version 19.4 (MedCalc Software Corp., Brunswick, ME, USA) and Microsoft Office Excel 2019 (Microsoft for Windows, Redmond, WA, USA) were used for the statistical analysis. Descriptive statistics were used for the demographic, anthropometric, and clinical data of the patients. The distribution of numerical variables was evaluated using the Kolmogorov-Smirnov test. Continuous variables with normal distributions were presented as means with standard deviations (SDs), variables with non-normal distributions as medians with interquartile ranges (IQRs), and categorical variables as frequencies and percentages. Student's t-test was used for group comparisons of continuous variables with normal distributions, and the Mann-Whitney U-test for variables with non-normal distributions. A p-value lower than 0.05 was considered significant for all statistical analyses.
Cutoffs found in papers with similar objectives were utilized to evaluate the biomarkers. For AFP, a cutoff of 20 ng/mL was applied, as it had been found to strike a balance between sensitivity and specificity in previous studies [33,34]. For AFP-L3, a cutoff value of 7% was used. Tamura et al. found that this cutoff value best distinguishes HCC from benign liver disease [35], which was later confirmed in another paper [36]. A 40 ng/mL cutoff for DCP was used, consistent with another report that validated the GALAD model for HCC detection in Chinese patients [37]. Due to the limited number of studies on CK19, the 6.25 ng/mL cutoff proposed by El Raziky et al. was adopted [38]. Similarly, for GPC3, studies that employed the same quantification method were identified, and the 0.0414 ng/mL cutoff suggested by Liu et al. was utilized [39].
Areas under the receiver operating characteristic curves (AUROC) were calculated to identify discriminating cut-off values. The optimal cut-off values were derived from the AUROC curve analysis using Bayesian analysis and the optimal criterion (the cut-off value with the highest sum of sensitivity (Se) and specificity (Sp)), while avoiding the misclassification of true positive subjects. Positive predictive value (PPV, the ratio of true positive cases to all test-positive cases), negative predictive value (NPV, the ratio of true negative cases to all test-negative cases), and diagnostic accuracy (the ratio of the sum of true positive and true negative cases to the total number of cases) were calculated. 95% confidence intervals (CI) were determined for each predictive test, and a p-value below 0.05 was considered statistically significant. Univariate and multivariate regression analyses were used to find the main independent factors associated with the presence of HCC. The multivariate regression model was built using the Akaike criterion to assess the impact of several factors on the variance of continuous variables. The model was validated based on the accuracy of prediction and R squared. In the final regression equations, the predictors were accepted according to a repeated backward-stepwise algorithm (inclusion criterion p < 0.05, exclusion criterion p > 0.10) to obtain the most appropriate theoretical model to fit the collected data.
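The definitions above can be made concrete with a short sketch. It computes Se, Sp, PPV, NPV, and accuracy from confusion-matrix counts and picks the cut-off with the highest sum of Se and Sp. The function names and the toy data are hypothetical, the rule "positive if value > cut-off" is an assumption, and this is not the MedCalc procedure used in the study (in particular, the Bayesian step is not reproduced).

```python
from typing import NamedTuple, Sequence, Tuple


class Metrics(NamedTuple):
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    accuracy: float


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Metrics:
    """Compute the metrics defined in the text from confusion-matrix counts."""
    return Metrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),            # true positives / all test-positive cases
        npv=tn / (tn + fn),            # true negatives / all test-negative cases
        accuracy=(tp + tn) / (tp + fp + tn + fn),
    )


def best_cutoff(values: Sequence[float], has_hcc: Sequence[bool]) -> Tuple[float, Metrics]:
    """Return the cut-off (positive if value > cut-off) with the highest Se + Sp."""
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, d in zip(values, has_hcc) if v > cutoff and d)
        fp = sum(1 for v, d in zip(values, has_hcc) if v > cutoff and not d)
        fn = sum(1 for v, d in zip(values, has_hcc) if v <= cutoff and d)
        tn = sum(1 for v, d in zip(values, has_hcc) if v <= cutoff and not d)
        if tp + fp == 0 or tn + fn == 0:
            continue  # PPV or NPV would be undefined at this cut-off
        m = diagnostic_metrics(tp, fp, tn, fn)
        if best is None or m.sensitivity + m.specificity > best[1].sensitivity + best[1].specificity:
            best = (cutoff, m)  # ties keep the lowest cut-off
    if best is None:
        raise ValueError("no usable cut-off found")
    return best


# Hypothetical toy data, not taken from the study.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
has_hcc = [False, False, False, True, False, True, True, True]
cutoff, metrics = best_cutoff(values, has_hcc)
print(cutoff, metrics)
```

On ties the sketch keeps the lowest cut-off, which favors sensitivity; a different tie-breaking rule could be equally defensible.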

Demographic and Clinical Characteristics of Patients
One hundred fifty-four subjects with previously diagnosed liver cirrhosis were included, and 95/154 (61.7%) were diagnosed with HCC based on clinical, biological, and imaging features. According to the presence of HCC, subjects were divided into two distinct subgroups: a subgroup of subjects with liver cirrhosis and HCC (n = 95) and a control subgroup with subjects previously diagnosed with liver cirrhosis but without HCC (n = 59) (Figure 1). The demographic and clinical characteristics of the patients are summarized in Table 1, and the laboratory results are in Table 2.
Significant differences in biomarker levels were found between subjects with HCC and those without (p < 0.0001), except for CK-19 (p = 0.0763) (Table 3). Using the following cut-off values for the subjects with HCC (n = 95): 20 ng/mL for AFP, 7% for AFP-L3, 40 ng/mL for DCP, 0.0414 ng/mL for GPC3, and 6.25 ng/mL for CK19, we found that 80/95 (84.2%) patients had elevated AFP, 66 (69.5%) patients had elevated AFP-L3, and 94 (98.5%) patients had elevated DCP. All three tumor markers were elevated in 52 (54.7%) patients.
In a more detailed analysis, the 15 patients with HCC who had AFP values below the cut-off value (20 ng/mL) were categorized based on their DCP levels. Among these 15 HCC patients with low AFP, 12 had high DCP levels (>40 ng/mL), thereby increasing the proportion of patients accurately identified as having HCC from 84.2% to 96.8%. When CK-19 and GPC3 values were used together, the proportion of patients correctly classified as having HCC rose from 84.2% to 92.6%.
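The gain from adding DCP follows directly from the counts given above, as this small arithmetic sketch shows. The variable names are illustrative; the counts (95, 80, and 12) are those reported in the text.

```python
N_HCC = 95            # patients with HCC
AFP_POSITIVE = 80     # patients with AFP above the cut-off
AFP_NEG_DCP_POS = 12  # AFP-negative patients with DCP > 40 ng/mL


def percent(count: int, total: int) -> float:
    """Proportion expressed as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


afp_only = percent(AFP_POSITIVE, N_HCC)
afp_or_dcp = percent(AFP_POSITIVE + AFP_NEG_DCP_POS, N_HCC)
print(afp_only, afp_or_dcp)  # 84.2 96.8
```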
The performance of AFP, AFP-L3, DCP, GPC3, and CK-19 for predicting the presence of HCC in our cohort, with their corresponding sensitivities and specificities, is summarized in Table 3. DCP performed better than GPC3 (p < 0.0001) and CK-19 (p < 0.0001). No significant differences were found between AFP and AFP-L3, while both AFP and AFP-L3 performed better than GPC3 (p < 0.0001) and CK-19 (p < 0.0001). Furthermore, GPC3 performed better in predicting the presence of HCC than CK-19 (p = 0.0067) (Figure 2).
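These comparisons rest on AUROC values. As a reminder of what that quantity measures, the sketch below computes the AUROC as the probability that a randomly chosen HCC patient has a higher marker value than a randomly chosen control, with ties counted as one half. The function name and the toy values are hypothetical, and the sketch does not reproduce the pairwise significance tests reported above.

```python
from typing import Sequence


def auroc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUROC as the fraction of (case, control) pairs in which the case ranks higher."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5  # ties contribute half a win
    return wins / (len(cases) * len(controls))


# Hypothetical marker values, not taken from the study.
toy_auc = auroc(cases=[4.0, 6.0, 7.0, 8.0], controls=[1.0, 2.0, 3.0, 5.0])
print(toy_auc)  # 0.9375
```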

GALAD and GALKA Scores for Predicting the Presence of HCC
The GALAD score was calculated for all the subjects included in our study, with a median value of −1.34 [...]. Univariate and multivariate regression analyses were employed to develop a new prediction score using the previously mentioned markers.
In the univariate analysis, we observed a significant difference between patients with or without HCC regarding the following parameters: age (p < 0.001), ALT (p < 0.001), AFP (p < 0.001), AFP-L3 (p < 0.001), DCP (p < 0.001), GPC3 (p < 0.001), CK-19 (p < 0.001), serum bilirubin levels (p = 0.001), and serum albumin levels (p = 0.001). Multivariate regression analysis was used to identify factors associated with HCC, using a significance level of [...]
No significant differences were found between the predictive performance of the GALKA score proposed by our study and the GALAD score (p = 0.792) (Figure 4).
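For illustration, the GALKA formula and cutoff reported in the abstract can be written as a small function. This is a sketch only: the function names are hypothetical, the inputs are assumed to be expressed in the same units as in the study (which this section does not restate), and the example values are invented purely to show the arithmetic.

```python
GALKA_CUTOFF = 0.32  # optimal cutoff reported in the abstract (score > 0.32)


def galka_score(afp_l3: float, afp: float, gpc3: float, ck19: float, albumin: float) -> float:
    """GALKA score as printed in the abstract; input units are assumed to match the study."""
    return (
        0.005 * afp_l3
        + 0.00069 * afp
        + 0.000066 * gpc3
        + 0.01 * ck19
        + 0.235 * albumin
        - 0.277
    )


def galka_positive(score: float) -> bool:
    """True when the score exceeds the reported cutoff."""
    return score > GALKA_CUTOFF


# Hypothetical input values, chosen only to demonstrate the calculation.
example = galka_score(afp_l3=10.0, afp=100.0, gpc3=1000.0, ck19=5.0, albumin=3.5)
print(round(example, 4), galka_positive(example))  # 0.7805 True
```

The cutoff is strict (score > 0.32), so a score of exactly 0.32 is treated as negative here.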

Discussion
HCC is a significant health problem due to its high mortality rate [1]. As with other neoplastic diseases, early diagnosis is crucial in addressing this issue, which can be achieved through a sustainable screening program. The EASL [4] and the American Association for the Study of Liver Diseases (AASLD) [9] guidelines recommend an ultrasound examination every six months, with or without AFP, to screen for HCC. Conversely, the Asian [8,40] guidelines suggest the use of biomarkers. There is considerable disagreement in the scientific and medical communities regarding the use of blood biomarkers for monitoring patients at risk of developing HCC. The problems are related to various sensitivities and specificities among studies, the heterogeneity of groups, and various cutoffs. In our study, we evaluated five biomarkers from the same sample and their ability to diagnose HCC.
In our study, AFP levels in the HCC group were significantly higher than in the cirrhosis group (p < 0.0001). These results are consistent with other studies [41,42]. However, the cutoff value used affects sensitivity and specificity. In our study, using a cutoff of 20 ng/mL for HCC diagnosis, we obtained a sensitivity of 74.7% and a specificity of 100% for AFP. AFP also had the highest AUC (0.94) among the five biomarkers tested. Marrero et al. reported broadly comparable sensitivity (59%) and specificity (90%) using a cutoff of 20 ng/mL [33]. In a recently published meta-analysis, AFP's pooled sensitivity and specificity were 61% and 87%, respectively [34]. Using a higher cutoff is associated with a decrease in sensitivity, such as 22% for 200 ng/mL [43] and 18% for 400 ng/mL [44]. The excellent specificity of AFP may be due to the fact that over half of HCC patients have a lesion larger than 5 cm or multiple nodules. However, low-AFP HCC patients can still develop large HCCs, indicating that variables other than AFP play a role in determining HCC size [45]. Pang et al. suggested combining AFP with other biomarkers to improve diagnostic efficiency [34]. In our study, adding DCP to evaluate patients with normal/low levels of AFP increased the correct classification rate to 96.8%, while adding CK-19 and GPC3 levels increased the correct classification rate to 92.6%.
Consequently, ongoing efforts are being made to identify novel blood biomarkers for HCC, with several new ones having been discovered in recent decades. However, only a few have been adopted in clinical practice: those that are less invasive, easy to replicate, and produce highly consistent results.
In a meta-analysis of six articles with 2447 patients, AFP-L3 was found to have a specificity of 92% and sensitivity of 34%, with an AUC of 0.75 for diagnosing early HCC [46]. In our study, we observed that the specificity of AFP-L3 was 91.5%, the sensitivity was 75.8%, and the AUC was 0.91, suggesting that its sensitivity increases with the larger size and a more advanced stage of HCC. Although AFP-L3 is not very sensitive for early HCC, it has a high specificity that cannot be matched by total AFP and has the added benefit of distinguishing HCC from benign liver disorders in individuals with increased serum AFP [46]. AFP-L3 was also strongly associated with complications of HCC, such as portal vein invasion and intrahepatic metastasis [47].
DCP is another serum biomarker studied for its utility in diagnosing HCC. In our study, DCP had the third-highest AUC at 0.82, performing worse than AFP and AFP-L3. Marrero et al. obtained a similar AUC of 0.72, concluding that AFP was more sensitive than DCP and AFP-L3% [41]. However, DCP has been demonstrated to have a low sensitivity in detecting preclinical HCC, only 26.3% [36]. Compared to these data, in our study, the sensitivity was higher, and we believe that the advanced stage and larger size of the HCC were determinants of this high sensitivity. DCP performs better when combined with other biomarkers, as shown in a phase II study [48] and also observed in our study.
GPC3 is a member of the glypican family of glycosylphosphatidylinositol-anchored cell-surface heparan sulfate proteoglycans. In our study, GPC3 performed better only when compared with CK-19 and was less accurate in diagnosing HCC than AFP, with an AUC of 0.72 vs. 0.94. This inferiority of GPC3 contradicts previously reported data where it was superior to AFP, with sensitivity and specificity of 84-85% and 92-95% vs. 50-79% and 80-90% [22]. One possible explanation for the results may be that GPC3 is more effective than AFP in detecting early-stage liver cancer and is not correlated with tumor size [39]. While the specificity of GPC3 was found to be very good in our study, the relatively low sensitivity observed compared to other studies [22,39] could be attributed to the measurement method used [22] as well as the size of the sample batch [39]. Further research may be needed to validate these findings and to explore other factors that may contribute to the sensitivity of GPC3 as a biomarker for liver cancer. The combination of AFP and GPC3 seems warranted for achieving high accuracy [39].
GPC3 expression may be higher in females than in males since the GPC3 gene is X-linked and situated in the Xq26 region [49]. Interestingly, in subjects without HCC, GPC3 median values were significantly higher in females than in males, a phenomenon that may be explained by the fact that about 25% of X-linked genes may escape X chromosome inactivation to some extent. Further studies are needed to investigate the impact of sex on GPC3 levels and, thus, the potential need for sex-specific cutoffs.
CK19 is an HCC stem cell marker involved in carcinogenesis, metastasis, and recurrence [50]. Our study showed no significant difference in serum CK19 levels among patients with and without HCC. Moreover, CK19 performed less accurately in diagnosing HCC than all other biomarkers in our study. Similar data were reported by El Raziky et al., who found a sensitivity and specificity of 63.4% and 55%, respectively, and concluded that combining AFP with CK19 offers high sensitivity [38]. When using the combination in our study, we obtained better specificity but with a loss of sensitivity.
Published data showed that using a combination of biomarkers could lead to better detection of HCC. The GALAD (gender, age, AFP-L3, AFP, and DCP) score combines serum-based markers (AFP, AFP-L3, and DCP) with demographic information (gender and age) [51]. Therefore, we evaluated the performance of the GALAD model for diagnosing HCC in our group. Our results showed that GALAD had high accuracy for HCC detection, with an AUC of 0.98. We used GALAD's concepts to create a new HCC diagnostic model, the GALKA score. The model included AFP, AFP-L3, GPC3, CK19, and serum albumin levels, the set of variables that yielded the best model for predicting the presence of HCC (p < 0.001). Compared with GALAD, our multivariate logistic regression analysis added serum albumin, an independent factor associated with developing HCC that reflects synthetic liver function.
In our current study, the GALKA score model had a higher AUC (0.98) and a higher sensitivity and specificity in diagnosing HCC compared to other biomarkers evaluated. Compared with GALAD, GALKA had a similar AUC; however, the sensitivity and specificity were higher, at 96.8% and 93% vs. 93.7% and 91.5%, respectively.
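As a consistency check, the reported GALKA sensitivity and specificity, together with the predictive values given in the abstract, can be reproduced from a single confusion matrix. The counts below are an assumption: they are not reported in the text but are inferred as integer counts compatible with the group sizes (95 HCC, 59 controls) and the published percentages.

```python
N_HCC, N_CONTROL = 95, 59  # group sizes from the study

# Assumed counts, inferred from the reported percentages (not stated in the text).
tp, fn = 92, 3   # HCC patients above / below the GALKA cutoff
tn, fp = 55, 4   # controls below / above the GALKA cutoff

assert tp + fn == N_HCC and tn + fp == N_CONTROL

sensitivity = round(100 * tp / (tp + fn), 1)
specificity = round(100 * tn / (tn + fp), 1)
ppv = round(100 * tp / (tp + fp), 1)
npv = round(100 * tn / (tn + fn), 1)

print(f"Se={sensitivity}% Sp={specificity}% PPV={ppv}% NPV={npv}%")
# Se=96.8% Sp=93.2% PPV=95.8% NPV=94.8%
```

Under these assumed counts, the specificity of 93.2% matches the reported 93% after rounding to a whole number.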
Combining four HCC-specific biomarkers and one protein reflecting liver function has a synergistic effect, improving overall diagnostic accuracy. Furthermore, the absence of DCP from our score eliminates interference with anticoagulants when applying the score in the real world.
In a recent paper, Li et al. validated the GALAD score and compared it with other scores and biomarker combinations, demonstrating the superiority of GALAD (AUC 0.925, 0.945) [37]. Furthermore, a systematic review published by Guan et al. supports the robust power of GALAD as an HCC screening or diagnostic tool [52]. However, a phase 3 biomarker study from the United States showed that GALAD's performance was modest and not different from AFP-L3 alone or the hepatocellular carcinoma early detection screening (HES) score [53]. GALAD's performance is also influenced by the etiology of the chronic liver disease of the patient at high risk for HCC, with decreased performance in HBV etiology and higher pooled sensitivities and AUC values seen in HCV and non-viral liver disease patients [52]. However, the number of HCV cirrhotic patients is decreasing due to direct-acting antivirals (DAA), while NAFLD is increasing. Nevertheless, GALAD has a good AUC of 0.91 in detecting NAFLD-associated HCC [54].
GALAD's sensitivity and AUC are associated with the BCLC stage [52]. Considering that approximately half of our cohort is outside the Milan criteria, this may explain the high AUC and sensitivity of GALAD in our cohort. However, the GALAD score performed better in predicting the size of HCC in our study, leading us to hypothesize that the GALKA score could show different performances in a cohort of patients with early-stage HCC. This hypothesis requires validation in a larger patient cohort with early-stage HCC.
Some limitations need to be acknowledged. Firstly, regarding the sample size, a larger cohort of patients with HCC and a larger control group are needed to evaluate our score better. Secondly, no cost-effectiveness analysis was performed. Although many new technologies for diagnosing HCC are emerging, they come at a higher cost. Future cost-effectiveness analyses can determine whether our score is cost-efficient for diagnosing HCC.
However, given that other scores perform differently in each etiology of patients at risk of HCC, future studies are necessary to divide patients according to their etiology. In conclusion, for the validation and understanding of the place of our score in the diagnosis of HCC, we need more extensive studies with early-stage HCC divided according to their etiologies.

Conclusions
AFP has been demonstrated to have the best performance in predicting the likelihood of HCC (AUC 0.94), followed by AFP-L3 and DCP. The integration of GPC3 and CK19 into the GALKA score has also shown an overall strong performance (AUC 0.98), similar to the GALAD score, with no significant differences (p = 0.792). However, future studies are needed to validate the GALKA score in cohorts of patients with early HCC and to categorize them according to the etiologies of cirrhosis.