Predictive Model of the Risk of In-Hospital Mortality in Colorectal Cancer Surgery, Based on the Minimum Basic Data Set

Background: Various models have been proposed to predict mortality rates for hospital patients undergoing colorectal cancer surgery. However, none have been developed in Spain using clinical administrative databases and none are based exclusively on the variables available upon admission. Our study aim is to detect factors associated with in-hospital mortality in patients undergoing surgery for colorectal cancer and, on this basis, to generate a predictive mortality score. Methods: A population cohort for analysis was obtained as all hospital admissions for colorectal cancer during the period 2008–2014, according to the Spanish Minimum Basic Data Set. The main measure was actual and expected mortality after the application of the considered mathematical model. A logistic regression model and a mortality score were created, and internal validation was performed. Results: 115,841 hospitalization episodes were studied. Of these, 80% were included in the training set. The variables associated with in-hospital mortality were age (OR: 1.06, 95%CI: 1.05–1.06), urgent admission (OR: 4.68, 95% CI: 4.36–5.02), pulmonary disease (OR: 1.43, 95%CI: 1.28–1.60), stroke (OR: 1.87, 95%CI: 1.53–2.29) and renal insufficiency (OR: 7.26, 95%CI: 6.65–7.94). The level of discrimination (area under the curve) was 0.83. Conclusions: This mortality model is the first to be based on administrative clinical databases and hospitalization episodes. The model achieves a moderate–high level of discrimination.


Introduction
Although the incidence of colorectal cancer (CRC) is very high, in recent years its prognosis has improved substantially and mortality has decreased, thanks to advances in surgical techniques and cancer therapies, as well as earlier diagnoses and the use of high quality treatment approaches [1,2].

Variables
The main study aim, to predict the risk of in-hospital mortality, was addressed by taking exitus (i.e., death during the hospital stay) as the main dependent variable in all the prediction models developed.
The remaining study variables were classed as independent or predictive. The following variables were analyzed: sociodemographic (age, sex, residence), clinical comorbidities (stroke, hypertension, ischaemic heart disease, obstructive pulmonary disease, renal insufficiency, bacterial or aspiration pneumonia), and surgical comorbidities (digestive bleeding, stenosis, anastomotic dehiscence, postoperative ileus, surgical wound infections).
The following variables were also included in the analysis: preoperative hospital stay, readmissions, total number of diagnoses and procedures at discharge (NDD and NPD, respectively), length of stay, and type of admission (urgent vs. scheduled). The NDD was taken as a proxy for the degree and complexity of the comorbidities presented, and the NPD was assumed to be representative of the diagnostic-therapeutic effort. The database was purged of outlying lengths of hospital stay using the classical moderate outlier detection procedure, in which the upper limit is defined by the formula T2 = Q3 + 1.5 × (Q3 − Q1), where Q3 and Q1 are the third and first quartiles of the stay and T2 is the maximum value of the stay retained.
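The outlier rule for length of stay can be illustrated with a minimal sketch. The function names and the example data are hypothetical, and the quartile interpolation method is an assumption, since the text does not state which one was used.

```python
import statistics


def upper_fence(stays):
    """Return T2 = Q3 + 1.5 * (Q3 - Q1) for a list of lengths of stay (days)."""
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # The "inclusive" interpolation method is an assumption of this sketch.
    q1, _median, q3 = statistics.quantiles(stays, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def drop_long_stays(stays):
    """Keep only the stays that do not exceed the upper fence T2."""
    t2 = upper_fence(stays)
    return [s for s in stays if s <= t2]


# Hypothetical lengths of stay in days (not data from the study).
example_stays = [5, 7, 8, 9, 10, 11, 12, 14, 15, 60]
kept = drop_long_stays(example_stays)
```

With these made-up values, only the 60-day stay lies above the fence and is dropped.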

Statistical Analysis
The analysis was performed on the training set (TRS), except for the internal validation, in which the performance of the final model was evaluated on the test set (TES). All analyses were performed using the SPSS v. 17 and Stata/SE v.12 statistical packages.
A descriptive, cross-sectional study was conducted of the main variables: age, sex, stay, type of admission, readmission, comorbidities, NDD, and NPD. The continuous variables are expressed as mean ± standard deviation and the qualitative variables as percentages and frequency distributions.
The association between in-hospital mortality and each of the independent variables was identified by calculating the unadjusted (crude) odds ratios (ORu). In addition, three multivariate logistic regression models were constructed, yielding adjusted odds ratios (ORa), using only the clinical independent variables that were identified on admission and which were statistically significant. The first multivariate logistic regression model included all of the variables that were considered clinically relevant, prioritizing those observed at the time of admission ("Initial Model", Model 1) [11]. The "intermediate model" (Model 2) was obtained from the initial model by eliminating the variables known to cause under-reporting bias, as shown by their low coding rate. The third and "final model" constructed, termed the CSMS (Colorectal Spanish Mortality Score), was designed to be as parsimonious as possible (Model 3). The variables were removed in sequence, according to their degree of contribution to the discrimination of the model. All three models were evaluated according to their discriminative capacity, by the area under the curve (AUC), and according to their calibration, using Pearson's χ² goodness-of-fit test, a standardized method based on ungrouped data.
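As an illustration of the crude odds ratio step, the following sketch computes an unadjusted OR from a 2×2 table. The log-based (Woolf) confidence interval and the counts are assumptions for illustration only. The text does not specify how the intervals were obtained.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude OR with a Woolf (log-based) confidence interval.

    a: exposed and died        b: exposed and survived
    c: unexposed and died      d: unexposed and survived
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not taken from the study).
or_u, ci_low, ci_high = crude_odds_ratio(40, 160, 20, 380)
```

An interval that excludes 1 corresponds to a statistically significant crude association at the chosen level.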
The practical value of the Hosmer-Lemeshow (HL) test, used to evaluate the calibration, is limited by the large sample size [12,13]. Multiple simulation studies suggest that Pearson's χ² goodness-of-fit test could be more powerful than the HL test [14]. The discriminative capacities of the intermediate and final models were compared, using the criterion of the maximum extraction of variables with minimum impact on the AUC in the final model.
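The two evaluation measures can be sketched as follows, assuming that predicted probabilities are already available from a fitted model. This is a simplified per-observation form. Statistical packages may aggregate the Pearson statistic over covariate patterns and also report a p-value, which this sketch omits. The function names and the toy data are hypothetical.

```python
def auc(y_true, y_prob):
    """AUC as the proportion of (death, survivor) pairs ranked correctly.

    Ties in predicted risk count as half a correctly ranked pair.
    The pairwise loop is quadratic, which is acceptable for a small sketch.
    """
    deaths = [p for y, p in zip(y_true, y_prob) if y == 1]
    survivors = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in deaths:
        for q in survivors:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))


def pearson_chi2(y_true, y_prob):
    """Pearson chi-square statistic: sum of squared Pearson residuals."""
    return sum((y - p) ** 2 / (p * (1 - p)) for y, p in zip(y_true, y_prob))


# Toy outcomes (1 = death) and predicted risks, for illustration only.
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(outcomes, risks)
toy_chi2 = pearson_chi2(outcomes, risks)
```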
After applying the final model to the TRS, internal validation was performed by applying the same model to the remaining 20% of the sample (TES) and examining its performance using the holdout method [15]. The AUC values for the two sets were compared using the algorithm proposed by DeLong et al. [16].
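A holdout split of this kind can be sketched as below. Random assignment with a fixed seed is an assumption, as the text does not describe how episodes were allocated to the two sets. The function name is hypothetical.

```python
import random


def holdout_split(episodes, train_fraction=0.8, seed=0):
    """Randomly split episodes into a training set and a test set."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(episodes)      # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical episode identifiers.
trs, tes = holdout_split(range(100))
```

The model is fitted on the first part only, and its AUC is then recomputed on the second part without refitting.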
Prior to establishing this final model, several alternatives were considered. The variations consisted fundamentally in separating elective and urgent surgery into two different models and in categorizing the age variable. These models provided a poorer goodness-of-fit and, therefore, were discarded.
A predictive mortality score was calculated from the final model, following the method described by Sullivan et al. [17]. The relationship between each individual's score and the individualized risk was then plotted graphically.
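A points-based score of this kind divides each regression coefficient by a constant B and rounds the result to the nearest integer. The sketch below applies that idea to the adjusted odds ratios reported for the final model. The choice of B (the log-odds of a 10-year increase in age) is an assumption of this sketch and not necessarily the constant used by the authors, so the resulting points need not match Table 4.

```python
import math

# Adjusted odds ratios reported for the binary predictors of the final model.
ODDS_RATIOS = {
    "urgent_admission": 4.68,
    "copd": 1.43,
    "stroke": 1.87,
    "renal_insufficiency": 7.26,
}
AGE_OR_PER_YEAR = 1.06  # adjusted OR per additional year of age


def points_table(years_per_point=10):
    """Assign integer points: round(beta / B) for each binary predictor.

    B is the log-odds increase for `years_per_point` years of age.
    This value of B is a hypothetical choice made for illustration.
    """
    b_const = math.log(AGE_OR_PER_YEAR) * years_per_point
    return {
        name: round(math.log(odds_ratio) / b_const)
        for name, odds_ratio in ODDS_RATIOS.items()
    }


illustrative_points = points_table()
```

Under this assumed constant, one point corresponds to roughly the same increase in log-odds as ten additional years of age.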

Results
Of the 92,768 hospitalization episodes evaluated in the TRS, 60.1% (n = 55,735) were male patients and 77.4% were scheduled admissions. All of the analyses and results obtained refer to the period 2008-2014. The patients' mean age was 69.56 ± 11.67 years and the average stay was 12.50 ± 6.74 days, with an average preoperative stay of 2.51 ± 3.96 days. In each case, 3.77 ± 2.94 medical and/or surgical procedures were performed and 5.96 ± 3.37 diagnoses were made at discharge. In total, 16.3% of the episodes were re-admissions and the in-hospital mortality rate was 4.2%, as shown in Table 1. By location, 24.1% of the tumors were in the rectum, 23.5% in the sigmoid colon, 10.8% in the rectosigmoid junction, and the remainder in the colon. Laparotomy was performed in 97.22% of emergency surgeries vs. 88.76% of scheduled surgeries, with statistically significant differences (p < 0.001). The rate of laparotomies, thus, was higher in emergency surgery.
Bivariate analysis revealed significant differences according to age, which was positively associated with in-hospital mortality (ORu 1.08, 95% CI 1.08-1.09). A higher ORu of mortality in the bivariate analysis was very clearly associated with male sex (Table 2). The logistic regressions, developed sequentially from the initial model to the final model, as shown in Table 3 and Figure 1, presented a similar discriminative capacity. A comparison of the ROC curves of the intermediate and final models revealed a decrease of 0.007 points in the AUC (p < 0.001). In the final model, the variables found to be associated with mortality were age (per 10-year increase: OR 1.79, CI 1.78-1.79), urgent admission (OR 4.68, CI 4.36-5.02), COPD (OR 1.43, CI 1.28-1.60), stroke (OR 1.87, CI 1.53-2.29), and renal insufficiency (OR 7.26, CI 6.65-7.94). The AUC was 0.83 (CI 0.82-0.83).
The calibration of the three models was evaluated using Pearson's χ² goodness-of-fit test. The test was statistically significant for all of the models (p < 0.001).
For the internal validation, the final model was applied to the TES cohort (AUC = 0.82). There were no statistically significant differences between this result and that obtained with the TRS cohort (p = 0.891), as shown in Figure 2.
The possible scores ranged from 1 to 21 points. The episode with the highest score in the sample produced a score of 17 points. The score-risk correlation, as shown in Table 4, is illustrated in Figure 3.

Findings
Our third, or "final model", is the first predictive model to be developed in Spain exclusively from clinical administrative databases as a means of estimating the risk of postoperative mortality in patients admitted with CRC. To our knowledge, this is the first model of its type using this source of information. The model is constructed using only the variables present at the time of admission. The results obtained show that this risk estimation instrument provides high discriminative capacity from parameters that are easily and quickly determined.

Comparison with Previous Studies
Many models have been proposed for the context of general abdominal surgery, and others (although fewer) specifically for CRC surgery. One such is the POSSUM score [4], which has been widely used to predict morbidity and mortality in a variety of surgical processes, in addition to being a useful tool for comparison purposes, by adjusted risk [18]. Variations of this model have appeared, such as Portsmouth-POSSUM (P-POSSUM), aimed at correcting the overestimation of mortality risk, which is very common in low-risk patients. P-POSSUM uses the same twelve physiological and six surgical factors to predict postoperative mortality, but applies a different methodology [19]. Both models are applicable to any surgical patient, unlike the one we propose, which was developed specifically for application in patients with CRC.
To adapt these scales for separate application to different medical specialities, further modifications have since appeared, such as P-POSSUM, CR-POSSUM, O-POSSUM, and E-PASS [20][21][22][23]. In this context, the CR-POSSUM model is of particular interest [24].
The Association of Coloproctology of Great Britain and Ireland (ACPGBI) created a system to assess the risk to patients scheduled for CRC surgery. The prognostic mortality rate thus created was validated by Tekkis in a prospective, multicentre study of patients who underwent surgery for CRC [22]. Age, ASA anaesthetic risk, tumor stage, and type of intervention (urgent/scheduled and complete/incomplete) were shown to be independent prognostic factors. Unlike the CSMS, the calculation of the Tekkis risk scale requires the inclusion of pathological and tumor staging variables.
The CR-POSSUM model also considers physiological and surgical variables, but reduces the number of variables required [24], while maintaining the duality of scores (physiological and operative).
In the United States of America, the American College of Surgeons (ACS-NSQIP) model provides accurate estimates and is very useful both for the patient and for the surgeon. It is a relatively extensive model, with some analytical and tumor-related variables and others that are subjective. Overall, the model performs well. It includes variables, such as anaesthetic risk and other analytical variables (albumin, creatinine), together with some relating to the extent of the tumor and indications for surgery. This means that it cannot predict mortality until more complete information is available. In addition, the estimated mortality is established at 30 days after discharge and not during hospitalization. Despite these limitations, the model is acceptable, providing a good level of performance, and could usefully be applied to complement the method we propose.
More recently, European models, such as that of the French Surgery Association (AFC), have appeared. Using the AFC model [25], a study was conducted of 1049 patients undergoing CRC surgery. The following were included as independent factors, assumed to be directly associated with mortality: urgent surgery, loss of > 10% body weight in the last six months, neurological history, and age over 70 years. The authors proposed a simple prognostic mortality model in CRC patients; this model produced a mortality score in the form of points, but the risk was not individualized. This model bears the greatest similarity to the CSMS, since with only four variables it obtains an acceptable estimate of mortality. Nevertheless, it also presents significant drawbacks, including the small size of the sample, the fact that surgery was elective in over 83% of cases and the use of a 70-year cut-off point for the age criterion (thus, the increased risk per additional year of age was not determined).
A study conducted in the Netherlands proposed a mortality prediction model termed the Identification of Risks in Colorectal Surgery (IRCS) [26], which was externally validated in a cohort of Spanish patients. After validation, the model was compared with CR-POSSUM, and was found to achieve a higher predictive power, according to the ROC curve analysis (0.83 vs. 0.76). The POSSUM model, and others derived from it, has been recalibrated in several studies to obtain new logistic coefficients providing more accurate estimates of risk, as has previously been done concerning other non-surgical pathologies [10,27].
In 2015, Kong et al. [28] published a predictive model of in-hospital mortality in patients undergoing colorectal surgery, termed the Colorectal Preoperative Surgical Score (CrOSS). This model was created and validated externally in Australia, and although it also needs to be validated in other contexts, the initial analysis revealed a ROC value of 0.87. It has the advantage of considering only four variables, namely, age, urgent intervention, albumin, and heart failure. In the same year, Walker et al. [29] presented another model (C-statistic 0.80) but the estimates referred to the 90 days after discharge and so the risk estimation was not immediate.
In Spain, the CCR-CARESS study [10] validated and recalibrated the logistic coefficients of several pre-existing models (CR-POSSUM, POSSUM, AFC and IRCS) by reference to a multicentre cohort in 22 Spanish hospitals. This recalibration slightly improved the discriminative capacity of the CR-POSSUM model (from 0.73 to 0.75) and that of the POSSUM model, which rose to 0.77.
Another study [3] used the same cohort to develop a mortality score, and concluded that advanced age (over 80 years), palliative surgery, and chronic obstructive pulmonary disease (COPD) were the factors most strongly associated with mortality. The CCR-CARESS score was first applied to 60% of the sample, and was then validated for the remaining 40%. The three variables cited above were used to create a range of 0 to 5 points, which, in turn, was associated with a given risk of mortality at 30 days. This score is straightforward to apply and has good discriminative capacity, although, unlike the CSMS, it does not evaluate hospital mortality or obtain individualized estimates of risk, referring instead to severity groups.
Thus, almost all the predictive models presented to date take into account the surgery performed and the operative or postoperative variables. In consequence, the risk estimates can only be obtained after the surgery has been performed. Other models provide estimates at 30-90 days or even one to two years after surgery, but not in the immediate postoperative period [30]. The CSMS model (the third or final model) is an auxiliary instrument designed to estimate the risk of in-hospital mortality before colorectal surgery, to be used together with clinical criteria and other traditional scores. It is intended to assist in decision making and scheduling for patients who need this type of surgery.

Strengths of the Study
In comparison with previously published models, the major contribution of the paper we present is the description of what may be the most radically simplified model yet proposed. It is especially significant that this model is based on the analysis of a very large registry population. Patient age, type of admission, history of COPD, renal insufficiency, or stroke are the independent variables used to predict the risk of mortality before surgery is undertaken. The age variable is present in all of the models previously reported. However, as our model does not take into account the type of surgery performed, but rather the type of admission, the consideration of this variable distinguishes our model from the alternatives. The history of COPD resembles the variable described in other models, such as that of Sluis et al. [26], which includes respiratory failure. Our index includes patient history of stroke as the fourth variable to be considered, and this comorbidity has previously been suggested as a factor that may be associated with mortality [31,32].
In view of these considerations, we believe that the proposed model has adequate discriminative capacity and provides good visual calibration of the deciles of risk. To our knowledge, with one exception, the Spanish population has only been used in the external validation of models for other nationalities, not in the creation of scores. The exception is Quintana et al. [30], but their estimates only considered mortality at one and two years. Our model enables the pre-surgery risk to be estimated and does not require the consideration of complex variables. This model facilitates the provision of more personalised medicine and surgery by estimating the risk faced by each patient individually upon admission.
Finally, although the risk score is derived from the logistic model, certain explanations should be provided. Firstly, the conversion of this score, as shown in Table 4, into risk ranges by quartiles provides estimates with sharp variations. For this reason, the results were softened and the abrupt changes produced by categorization were attenuated, by means of a curve showing the points generated in the score, with the individualized risk for each value, as shown in Figure 3. This alternative can be obtained rapidly and avoids the need for complex calculations. On the other hand, the consideration of clinical variables and those related to patient fragility seems to be more strongly related to mortality in the case of in-hospital mortality. Variables referring to a later stage, such as those related to surgery or the condition of the tumor, have less weight in the model, thus facilitating the very early provision of risk estimates.

Potential Limitations
One of the main difficulties of the proposed model is the question of the relevance of the HL test [13], which is commonly used to evaluate model calibration. This test is based on a chi-square test and is therefore affected by large sample sizes [33]. In this regard, we preferred Pearson's χ² goodness-of-fit test, applied to ungrouped data, in view of its possibly greater power. Nevertheless, statistically significant results were observed [14].
Another limitation that should be taken into account is the use of hospitalization episodes, rather than individual patients. As a result of this, overfitting might be present. This factor might also account for the low contribution of the gender variable, which was ultimately removed from the model.
A number of important surgical variables (ileus, dehiscences, etc.) were not included due to a lack of significance in the elaboration of the third or final model, and this might seem a striking limitation. On the one hand, this absence of variables in the multivariate model is, in fact, a strength, since it allows us to estimate patient risk at the time of admission. However, the exclusion of these factors must also be viewed as a potential limitation, in the sense that they might be relevant to mortality during the first surgical hospitalization. Nevertheless, on balance, we believe that, if the patients had been followed up for three or more months after admission, any such impact from this source would have been observed.
Finally, we must acknowledge the existence of under-registration, a bias that could provoke the appearance of paradoxical and sometimes falsely protective results. This type of bias has been extensively studied by Jencks et al. [34] and research has confirmed that it frequently affects the Spanish MBDS [35]. In the current study, conditions, such as hyperlipidaemia, hypertension, and obesity, present this type of effect. Another aspect related to the characteristics of this type of medical record is the fact that it is very difficult to differentiate an intrahospital complication from a previous comorbidity.
However, our final model does, in fact, isolate the variables contributed by the patient at the time of admission.

Implications
This paper describes a new model for estimating the risk of in-hospital mortality. The results obtained raise important considerations regarding disease prognosis and management. A better understanding of individualized risk will allow treatment programs to be adapted accordingly and diagnostic-therapeutic tests streamlined to determine this risk. The model has important implications for improving the quality of health care and may have a significant impact on what has been termed "personalised medicine". The variables addressed can be obtained in the first few minutes of patient care. Forthcoming studies of recalibration and external validation will ensure the absence of overfitting and will underpin the reliability of this approach to the reality of hospital mortality from CRC.
Finally, it should be stressed that the predictive model proposed is an auxiliary tool that can (and we believe should) be used in conjunction with other clinical-surgical parameters and even with other previous scales. The model is not intended to replace, but rather to complement the consideration of clinical-surgical criteria in the risk assessment process.

Conclusions
Our study shows that it is possible to create a logistic model and a scoring system to estimate the risk of death of patients undergoing surgery for colorectal cancer. The model obtained is built on variables that place more emphasis on the frailty of the patient than on the intraoperative variables; it also has the advantage of using variables obtained at an early stage in the clinical assessment.
Finally, we reiterate the importance of the role played by clinical variables and comorbidities in predicting the mortality that occurs during the hospitalization of these patients.