Performance of Comprehensive Complication Index and Clavien-Dindo Complication Scoring System in Liver Surgery for Hepatocellular Carcinoma

Simple Summary
The comprehensive complication index (CCI) and the Clavien-Dindo complication (CDC) scoring system are two metrics designed to quantify the burden of postoperative morbidity. We performed a retrospective study using data from a multi-institutional Italian register. The aim was to compare the performance of the two metrics in predicting excessive length of hospital stay (e-LOS) in patients who underwent liver resection for hepatocellular carcinoma. A total of 2669 patients were analyzed. A derivation set (n = 1345) and a validation set (n = 1324) were created to test the robustness of the results. In both cohorts, the analysis showed that CCI was slightly superior in predicting e-LOS in complicated patients. The accuracy of CCI was even better in the subgroup of patients who experienced at least two complications. The results of this population-specific analysis suggest that CCI is preferable for weighting the postoperative morbidity burden.

Abstract
Background: We aimed to assess the ability of the comprehensive complication index (CCI) and the Clavien-Dindo complication (CDC) scale to predict excessive length of hospital stay (e-LOS) in patients undergoing liver resection for hepatocellular carcinoma. Methods: Patients were identified from an Italian multi-institutional database and randomly selected to be included in either a derivation or a validation set. Multivariate logistic regression models and ROC curve analyses including either CCI or CDC as predictors of e-LOS were fitted to compare predictive performance. E-LOS was defined as a LOS longer than the 75th percentile among patients with at least one complication. Results: A total of 2669 patients were analyzed (1345 for derivation and 1324 for validation). The odds ratio (OR) was 5.590 (95%CI 4.201; 7.438) for CCI and 5.507 (4.152; 7.304) for CDC. The AUC was 0.964 for CCI and 0.893 for CDC in the derivation set and 0.962 vs. 0.890 in the validation set, respectively. In patients with at least two complications, the OR was 2.793 (1.896; 4.115) for CCI and 2.439 (1.666; 3.570) for CDC, with an AUC of 0.850 and 0.673, respectively, in the derivation cohort. The AUC was 0.806 for CCI and 0.658 for CDC in the validation set. Conclusions: When reporting postoperative morbidity in liver surgery, CCI is a preferable scale.


Introduction
Hepatic resection offers the best chance of long-term survival for patients with resectable hepatocellular carcinoma (HCC) [1,2]. Although perioperative mortality following liver surgery has decreased over the past decades to less than 5%, morbidity remains frequent, ranging from 20% to 40% depending mainly on the extent of resection, the underlying liver function of the patient, and the reporting scale used [3][4][5]. Therefore, it might be of value to identify objective and reproducible metrics for scaling the magnitude of complications, to achieve quality control, and to compare outcomes among institutions.
The Clavien-Dindo classification (CDC), originally described in 2004 [6], is the most widely used grading system for weighting postoperative morbidity (more than 14,000 citations as of November 2020, according to Scopus). Although the CDC is an objective, simple, and reproducible classification, it has the limitation of scaling the entire postoperative course by the single most serious complication that occurred. To overcome this disadvantage, in 2013 the same institution proposed a new scale, the comprehensive complication index (CCI) [7], which incorporates all complications and their severity as defined by the CDC and summarizes postoperative morbidity on a numerical scale ranging from 0 to 100.
Although the CCI and CDC scoring systems are closely related metrics, CCI allows a longitudinal assessment of morbidity because a complication appearing at a later time-point is added to the score. Through this computation, CCI appears to capture the overall morbidity burden more precisely [8,9]. However, comparisons of the two scoring systems have mostly been made in studies with substantial case-mix and heterogeneous populations [10][11][12]. As it may be more desirable to analyze populations with defined intervention-specific complications, we aimed to assess the performance of CCI and CDC in predicting length of hospital stay (LOS) and excessive LOS (e-LOS) in patients undergoing liver resection for HCC. LOS can be considered a reliable proxy of surgical morbidity, since complicated clinical courses generally result in a longer duration of hospitalization [13].

Study Overview and Population
Patients who underwent liver resection for HCC with curative intent between 2007 and 2018 were identified from an Italian multi-institutional database. Patient data were retrieved retrospectively from this dataset, promoted by the Hepatocarcinoma Recurrence in the Liver Study (He.Rc.O.Le.S.) group. The study protocol followed the ethical guidelines of the 1975 Declaration of Helsinki (as revised in Brazil 2013). The Ethical Committee of the coordinating center (University of Milano-Bicocca; San Gerardo Hospital, Monza) reviewed and approved the protocol (211218HSG-R) on 21 December 2018. Inclusion criteria were: (1) first diagnosis of HCC without any previous treatment; (2) age ≥ 18 years; (3) HCC diagnosis confirmed at histology. Exclusion criteria were: (1) surgery as a down-staging therapy for transplant; (2) patients who eventually underwent liver transplantation; (3) mixed primary liver cancers (e.g., hepatocholangiocarcinoma). All data were anonymized prior to submission to the coordinating center. Data collection was performed using the same electronic database at all centers.
Centers were randomly selected to be included in either the derivation or validation set to obtain a similar number of patients in the two cohorts.
Results are reported according to principles of Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) [14].

Variables and Definitions
Age, sex, and liver function were recorded at the first office visit. The presence of cirrhosis and its severity (Child-Pugh score) were evaluated and graded by expert hepatologists. Presence of HCV or HBV infection and serum biochemical values of bilirubin, albumin, platelets, and international normalized ratio (INR) were also recorded at baseline. The American Society of Anesthesiologists (ASA) class was assessed during the preoperative patient evaluation. The number and diameter of nodules were assessed through preoperative radiologic imaging and confirmed by intraoperative ultrasound. The extent of resection was defined as minor when ≤3 liver segments were resected and major when >3 segments were resected, according to the Brisbane nomenclature [15]. The definitions of anatomic resection (AR) and parenchyma-sparing resection (PSR) were previously reported [16]. Other surgery-related variables were minimally invasive approach, intraoperative blood transfusion, and duration of operation. LOS was calculated from the day of operation to hospital discharge. e-LOS was defined as a LOS longer than the 75th percentile among patients who experienced at least one complication. Postoperative complications were graded according to both CDC [6] and CCI [7]. Postoperative mortality was calculated as the number of deaths occurring within 90 days from surgery; these patients were excluded from the calculation of LOS and e-LOS. Center volume was stratified according to the number of liver resections performed per year: ≤50 procedures identified a low-volume center, 51-100 resections a medium-volume center, and >100 procedures a high-volume center [17].
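As a minimal sketch of the e-LOS definition above, the snippet below derives the 75th-percentile threshold from complicated patients only and then flags each stay against it. The function names, the toy data, and the choice of linear interpolation between observations (`method="inclusive"`) are assumptions for illustration; the text does not state which quantile rule was applied.

```python
from statistics import quantiles


def e_los_threshold(los_days, n_complications):
    """75th percentile of LOS among patients with at least one complication."""
    complicated = [los for los, k in zip(los_days, n_complications) if k >= 1]
    # n=4 returns the three quartile cut points; index 2 is the 75th percentile.
    # method="inclusive" is an assumed interpolation rule, not taken from the study.
    return quantiles(complicated, n=4, method="inclusive")[2]


def flag_e_los(los_days, n_complications):
    """True for every stay strictly longer than the threshold."""
    cutoff = e_los_threshold(los_days, n_complications)
    return [los > cutoff for los in los_days]


# Hypothetical toy cohort: LOS in days and number of complications per patient.
los = [5, 6, 7, 8, 10, 12, 15, 20, 30]
complications = [0, 0, 0, 1, 1, 1, 2, 1, 3]
threshold = e_los_threshold(los, complications)
flags = flag_e_los(los, complications)
```

Note that the threshold is estimated on complicated patients only, while the resulting flag can be applied to any stay.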

Endpoints
The primary endpoint was to assess the performance of CCI and CDC to predict LOS and e-LOS in patients undergoing liver resection for HCC.
The secondary endpoint was to find an optimal CCI cut-off value capable of predicting e-LOS.

Statistical Analysis
Sample description was performed using median and interquartile range (IQR) for numeric variables and number and proportion for categorical variables, for both derivation and validation sets.
The distribution of CCI in each CDC category was explored graphically using boxplots, while the association between each score and LOS was represented with a scatter plot, again in both datasets.
After excluding patients who died within 90 days (14 in the derivation set and 20 in the validation set), the association of each score with LOS (log-transformed) was analyzed using linear regression. On the derivation set, we fit univariate models, including CCI or CDC as the only covariate, and multivariate models, adjusted for type of surgery (major vs. minor), open vs. laparoscopic approach, age (per year), ASA score (1-2 vs. 3-4), Child grade (A vs. B), duration of surgery (>4 h vs. ≤4 h), and center volume (high vs. medium/low). We evaluated and compared the goodness of fit of the models including CCI or CDC using the R-squared and the root mean squared error (RMSE) indexes. This was done on the derivation set and on the validation set, without refitting the models. The whole linear regression analysis was repeated considering, in both sets, only the subgroup of patients with at least two postoperative complications.
The association of both scores with e-LOS was analyzed using logistic regression. Again, we fit univariate and multivariate models (adjusting for the same covariates used for the analysis of LOS) on the derivation set excluding patients who died during hospital stay. The discriminatory ability of the models to identify patients with e-LOS was evaluated using the area under the ROC curve (AUC) index both on the derivation and validation (without refitting the models) sets. Again, the whole logistic regression analysis was repeated considering, in both sets, only the subgroup of patients with at least two postoperative complications.
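For readers less familiar with the AUC index, a minimal sketch of its rank-based interpretation is given below: the probability that a randomly chosen patient with e-LOS has a higher predicted value than a randomly chosen patient without it, with ties counted as one half. The function name and toy values are hypothetical; in the analysis the predicted values would come from the fitted logistic models rather than from raw scores.

```python
def auc(predicted, e_los):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [p for p, y in zip(predicted, e_los) if y]
    controls = [p for p, y in zip(predicted, e_los) if not y]
    # Each case/control pair scores 1 if the case ranks higher, 0.5 if tied.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: predicted values and observed e-LOS status (1 = e-LOS).
toy_auc = auc([8.7, 20.9, 22.6, 33.7, 20.9], [0, 0, 1, 1, 1])
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; production code would typically use a rank-based implementation from a statistics package.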
Finally, considering only the subset of patients with at least one postoperative complication in both sets, we built an ROC curve to find the optimal cut-off (corresponding to the maximum Youden index) of CCI to be used in order to identify patients with e-LOS. We repeated the analysis within strata defined by minor/major surgery, open/laparoscopic surgery, presence/absence of cirrhosis and Child grade A/B. The R software version 4.0.1 was used for all the analyses.
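The cut-off search described above can be sketched as a scan over candidate thresholds that keeps the one maximizing the Youden index, J = sensitivity + specificity - 1. The function name, the rule "positive when CCI is at or above the cut-off", and the toy data are assumptions for illustration only.

```python
def youden_cutoff(cci, e_los):
    """Return (cutoff, J, sensitivity, specificity) maximizing the Youden index."""
    n_pos = sum(1 for y in e_los if y)
    n_neg = len(e_los) - n_pos
    best = None
    for cut in sorted(set(cci)):
        # A patient is predicted to have e-LOS when CCI >= cut (assumed rule).
        true_pos = sum(1 for x, y in zip(cci, e_los) if y and x >= cut)
        true_neg = sum(1 for x, y in zip(cci, e_los) if not y and x < cut)
        sens = true_pos / n_pos
        spec = true_neg / n_neg
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cut, j, sens, spec)
    return best


# Hypothetical toy data: CCI values of complicated patients and e-LOS status.
cci_values = [8.7, 8.7, 20.9, 20.9, 22.6, 26.2, 33.7, 42.4]
status = [0, 0, 0, 0, 1, 1, 0, 1]
cutoff, j_index, sensitivity, specificity = youden_cutoff(cci_values, status)
```

When several thresholds share the same maximum J, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.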

Descriptive Findings
The analysis was carried out in May 2020, at which time the records of 2917 patients, collected in the 25 centers, had been entered into the register. A total of 248 records were excluded because of missing information on LOS, CDC, or CCI, resulting in 2669 patients available for the final analysis. The final sample sizes of the derivation and validation cohorts were 1345 and 1324 records, respectively.

Association of CCI and CDC with Postoperative LOS and e-LOS
The ability of CDC and CCI to measure postoperative morbidity was evaluated in terms of association with postoperative LOS and e-LOS. Figure 2A-D show the relationship between LOS and the two scoring systems in derivation and validation sets.
The multivariate linear regression analysis on the derivation set showed that the expected mean change in log(LOS) per 10-unit increment of CCI was 0.27 (95%CI: 0.25-0.28), corresponding to an average 31% increment of LOS, while per category increment of CDC it was 0.29 (95%CI: 0.27-0.31), corresponding to an average 34% increment of LOS. The goodness of fit of the CCI model was slightly superior to that of the CDC model, as indicated by the higher R² and lower RMSE, also in the validation set. Larger differences between the performance of the CCI and CDC models were observed in the subset of patients with at least two complications (Table 2).
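The percentages quoted above follow from back-transforming the coefficients of a model fitted on log(LOS): a coefficient b corresponds to a multiplicative change of exp(b) in LOS. A minimal sketch of that arithmetic, with a hypothetical helper name and assuming the natural logarithm was used:

```python
from math import exp


def percent_change(beta):
    """Percent change in LOS implied by a coefficient on the natural-log scale."""
    return (exp(beta) - 1) * 100


cci_change = percent_change(0.27)  # per 10-unit increment of CCI
cdc_change = percent_change(0.29)  # per one-category increment of CDC
```

Rounded to whole numbers, the two values reproduce the 31% and 34% increments reported in the text.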
The coefficients of all covariates included in the linear models are shown in Supplementary Materials Table S1.
In the multivariate logistic regression for e-LOS in the derivation set, the odds ratio per 10-unit increment of CCI was 5.59 (95%CI: 4.20-7.44) vs. 5.51 (95%CI: 4.15-7.30) per CDC category increment, with a moderately higher discriminatory ability for the CCI model that persisted in the validation set (AUC = 0.962 for CCI vs. 0.890 for CDC). CCI showed an even higher ability than CDC to discriminate patients with e-LOS in the subset of patients with at least two complications (Table 3). The odds ratios of all covariates included in the logistic models are shown in Table S2.
The performance of CCI, evaluated by the ROC curve methodology, in identifying those with e-LOS among patients with at least one complication is reported in Figure 3 (derivation and validation sets). Considering a CCI score of 22 (the optimal cut-point according to the Youden index) as a predictor of e-LOS, the sensitivity was 77.6% and the specificity 82.6% in the derivation set. The AUC was 0.852 in the derivation set and decreased to 0.735 in the validation set, which nevertheless indicates a good discriminatory performance of CCI.
This was confirmed even when subgroups were analyzed, although slightly different cut-points were found (Figure S1A-H).

Discussion
In grading a complicated postoperative course, the CDC scoring system [6] accounts only for the most severe adverse event, and this may underestimate the overall surgery-related morbidity. Failure to capture the number and severity of every single complication may result in a partial report of the characteristics of a postoperative course. The CCI scale was created to outline the overall morbidity burden more accurately, since it integrates in one formula all documented complications weighted by severity [7]. The values of the CCI range on a numeric scale from 0 to 100, and thus this metric theoretically grants a wider and more differentiated grading of complications than CDC.
A comparison of the two scoring systems has already been made in several studies investigating broad case-mix and heterogeneous populations [8,10,11,[18][19][20]. However, different types of surgical procedures expose patients to specific complications and different risks according to the technical details, magnitude of injury, and baseline characteristics of the population. As it may be more accurate to analyze intervention-specific complications, we aimed to assess and compare the ability of CCI and CDC to predict LOS in a more homogeneous cohort, i.e., patients undergoing liver resection for HCC. This subset of patients is worth analyzing because of the probability of encountering multiple postoperative complications, owing to both the complexity of surgery and the patient-inherent risks, mostly related to the underlying liver function [21][22][23].
The present study indicates that both the CCI and CDC scoring systems perform well in predicting the duration of hospitalization after liver resection for HCC. However, CCI was slightly superior to CDC in predicting both overall LOS and e-LOS in complicated patients, in both the derivation and the validation cohorts. In the validation set there were fewer cirrhotic patients, fewer patients with CDC 0, longer LOS, fewer laparoscopic cases, and fewer anatomical resections, but more high-volume centers. Center volume has been repeatedly described as a variable affecting morbidity. Since we conducted a multi-institutional study, we randomly assigned centers to the validation or the derivation cohort to limit this bias. Nevertheless, the two cohorts were not fully homogeneous. Hence, all multivariate models were adjusted for several confounders, including center volume.
Although the difference in performance between the two metrics was marginal in the overall sample, the association with LOS was stronger for CCI than for CDC, especially in the subset of patients with at least two complications. This suggests that CCI better captured the events that prolonged hospitalization. These statements are supported by the results of the linear regression analysis, which showed that the CCI models were always slightly better than the CDC models. Similarly, in the multivariate logistic regression analysis, the areas under the ROC curves of the CCI models were always greater than those of the CDC models, suggesting that CCI had a better ability to discriminate patients with longer LOS. Therefore, the overall findings imply that CCI is a more accurate system for grading the morbidity burden that eventually affects the duration of hospitalization.
It can be argued that our results indicate a marginal advantage of CCI over CDC in predicting LOS and e-LOS in complicated patients. This can be partially explained by the low proportion of patients who experienced at least two complications (14.2%). Accordingly, when more than 85% of the population has no or only one complication, the two scores overlap by definition and so the comparison is futile [7]. For this reason, the accuracy of CCI is only marginally better when the overall cohort is considered, but this grading system, as expected, becomes more accurate in describing patients with multiple complications.
Despite the slightly better performance of CCI, this metric has some limitations in predicting e-LOS. In fact, CCI was not evenly distributed along the scale, and for patients facing only one adverse event its values tended to cluster around those corresponding to each CDC grade. Moreover, in patients with a postoperative course characterized by more severe complications (CDC > II), a wider spectrum of CCI values was present. This might be because severe complications are often coupled with additional minor ones. Conversely, low-grade complications may not necessarily prolong hospitalization. A similar uneven CCI distribution has been described by Kim et al. [24] in a series of patients who underwent gastric resection for cancer.
The ROC curve analysis of CCI for complicated patients with e-LOS found a score of 22 as the optimal cut-point for defining patients who experienced a delayed hospital discharge. Thus, this CCI cut-off value can be used to dichotomize the study population into low- and high-risk groups for e-LOS. A CCI score greater than 22 would mean that a patient had at least two minor complications according to CDC (notably, at least one complication graded I and another graded II), or a major one (CDC ≥ III). The CDC grade IIIA, which often defines "major" or "severe" morbidity [6], corresponds to a calculated CCI value of 26.2 [7]. The cut-off value of CCI obtained in our series was slightly lower. It is possible that in the context of liver resection for HCC, e-LOS might be more affected by the occurrence of multiple minor adverse events, so that a lower CCI score better defines a complicated postoperative course. The data of the derivation set were tested against a validation cohort. The results suggested a good performance of the CCI, adding strength to the reproducibility and accuracy of the findings. However, future prospective studies are warranted to confirm this CCI value as a reliable threshold for defining excessive LOS after liver resection. External validation is appropriate because clinical management, discharge criteria, patient-related variables, and health or social care organization may substantially differ in other settings.
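The relationship between the cut-off of 22 and the CDC grades discussed above can be checked with a short calculation. The sketch below assumes the published form of the CCI [7], i.e., half the square root of the summed severity weights, capped at 100 and set to 100 for grade V. The weights in the table are quoted from memory of that publication and should be verified against it; the only value confirmed by the present text is that a single grade IIIA complication yields 26.2. Function and variable names are hypothetical.

```python
from math import sqrt

# Assumed severity weights per Clavien-Dindo grade (verify against reference [7]).
WEIGHTS = {"I": 300, "II": 1750, "IIIa": 2750, "IIIb": 4550, "IVa": 7200, "IVb": 8550}
GRADE_ORDER = ["I", "II", "IIIa", "IIIb", "IVa", "IVb", "V"]


def cci(grades):
    """CCI from the list of all complication grades of one patient."""
    if "V" in grades:
        return 100.0  # death always scores 100
    return min(100.0, sqrt(sum(WEIGHTS[g] for g in grades)) / 2)


def cdc(grades):
    """CDC summary: the single most severe grade, or "0" without complications."""
    return max(grades, key=GRADE_ORDER.index) if grades else "0"


single_iiia = round(cci(["IIIa"]), 1)      # one grade IIIA complication
one_plus_two = round(cci(["I", "II"]), 1)  # one grade I plus one grade II
single_ii = round(cci(["II"]), 1)          # one grade II complication alone
```

Under these assumed weights, a grade I plus a grade II complication scores just above 22, whereas a single grade II complication stays below it, which is consistent with the interpretation of the cut-off given in the text. The CDC summary for the same patient would be grade II in both cases, which illustrates why the two scales diverge only when multiple complications occur.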
Few authors have compared the performance of CCI and CDC in other homogeneous surgical procedures, namely gastrectomies, intestinal resections, cystectomies, liver transplantations, esophagectomies, and pancreatectomies [20,[24][25][26][27][28][29][30]. Their results are quite consistent with ours in defining CCI as a more precise scale for reporting postoperative morbidity. Reporting CCI in the surgical literature may have additional value, considering its capability of monitoring outcomes for individual surgeons [25] or of investigating historical trends of departments in terms of perioperative results after hepatectomies [31].
The retrospective study design implies several drawbacks. First, in the absence of an a priori definition of discharge criteria, the duration of hospital stay may be affected by social and logistic factors and by the confidence of the individual patient to go home safely. However, since there is no gold standard for measuring clinical outcomes and comparing the clinical implications of different complication grades, we considered LOS as a surrogate marker for surgical outcome, as in other studies [8,20,26,27]. Second, medical costs may also be a good endpoint to reflect the burden of the postoperative course [9,18], but our dataset was not designed to collect economic parameters. Third, readmissions and new admissions to other hospitals or nursing homes were not considered. These represent additional variables that could differentiate the precision of CCI from that of CDC in weighting the overall morbidity burden. Fourth, the relatively low rate of patients experiencing more than one complication may have attenuated the difference in accuracy between the two scales. Fifth, a substantial amount of the data used to calculate CCI was retrieved from patients operated on before the publication of this scale. For this reason, CCI values were partially derived from the complications registered in each center. Lastly, local practice could have affected the approach and the decision-making strategy in dealing with a complication. Interventional and conservative treatments are graded differently in CCI and CDC and may per se affect LOS.

Conclusions
Although both scales performed well in predicting LOS and e-LOS in patients undergoing liver resection for HCC, CCI was moderately superior to CDC. The results of this population-specific analysis suggest that CCI is preferable for reporting postoperative morbidity, even though the CDC maintains acceptable accuracy.