Benchmark of Intraoperative Activity in Cardiac Surgery: A Comparison between Pre- and Post-Operative Prognostic Models

Objectives: Despite its wide diffusion and improvements in safety, the risk of complications after cardiac surgery remains high. Published predictive perioperative scores (EUROSCORE, STS, ACEF) assess risk on preoperative data only, not accounting for the intraoperative period. We propose a double-fold model, including data collected before surgery and data collected at the end of surgery, to evaluate patient risk evolution over time and assess the direct contribution of surgery. Methods: A total of 15,882 cardiac surgery patients from a Margherita-Prosafe cohort study were included in the analysis. Probability of death was estimated using two logistic regression models (preoperative data only vs. post-operative data, also including information at discharge from the operating theatre), testing calibration and discrimination of each model. Results: Pre-operative and post-operative models were built and demonstrated good discrimination and calibration, with AUC = 0.81 and 0.87, respectively. Relative difference in pre- and post-operative mortality in separate centers ranged from −0.36 (95% CI: −0.44–−0.28) to 0.58 (95% CI: 0.46–0.71). The usefulness of this two-fold model to benchmark medical care in a single hospital is exemplified in four cases. Conclusions: Predicted post-operative mortality differs from predicted pre-operative mortality, and the distance between the two models represents the impact of surgery on patient outcomes. A double-fold model can assess the impact of the intra-operative team and the evolution of patient risk over time, and benchmark different hospitals on patient subgroups to promote an improvement in medical care in each center.


Introduction
Due to the ageing of the population, cardiovascular diseases represent a larger proportion of the total health care expenditure. Quality of life and working ability of an individual with chronic cardiac conditions can significantly improve after cardiac surgery [1,2]. This leads to improved patient health, while also reducing social and medical costs in long-term cardiac disease.
Despite its advantages, cardiac surgery is a high-complexity and high-cost surgery [3]. Accordingly, the availability of a predictive tool to reduce potential complications would be useful not only to weigh pre-operative risk, but also to titrate the evolution of the perioperative risk during the early phases of postoperative recovery.
We propose to follow the evolution of patients' clinical status over time, developing two prediction models for the probability of in-hospital death at two pivotal moments: before surgery, at operating theatre (OT) admission (pre-operative model), and at discharge from the operating theatre and admission to the Intensive Care Unit (ICU) (post-operative model). Accordingly, the difference in predicted mortality between the two models could be regarded as a proxy of the effect of the surgical act on patient outcomes.
The objective of this study is to propose the use of these two models as a tool to benchmark the performance of the surgical team among hospitals, and identify possible pitfalls in one of the critical aspects of a complex and expensive perioperative care pathway.

Ethical Statement
The Margherita-Prosafe Project was approved by the Ethical Committee of the coordinating centre, Comitato Etico Regione Liguria approval n. 381REG2015 on 17 September 2015.

Inclusion and Exclusion Criteria
All patients aged over 16 years admitted to cardiosurgical ICUs joining the Italian Group for the Evaluation of Interventions in Intensive Care medicine (GiViTI) in 2016 or 2017 after cardiac surgery were considered eligible for the analysis. In the case of readmissions, only the first admission to the ICU was considered. We excluded patients undergoing surgery or endovascular aortic repair for Type-B aortic dissection and patients admitted to the ICU before cardiac surgery.

Data Collection
Clinical information was collected by means of dedicated software (PROSAFE) developed by the GiViTI Coordination Centre. We considered demographics, comorbidities, clinical conditions and organ failures at ICU admission, relevant details concerning cardiac surgery procedures, and hospital mortality. Additional information, including type of procedures, complications during ICU stay, and ICU mortality, was also collected through PROSAFE, but not included in the prognostic models because we aimed to adjust for patients' features before surgical procedures and at ICU admission.

Data Validity
Data validity was assessed at different stages to avoid selection biases and input error and to guarantee the internal consistency of the records. We excluded all patients admitted within months where more than 10% of admitted patients had incomplete records.
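The month-level exclusion rule can be illustrated with a minimal sketch. It assumes that completeness is evaluated per center and per calendar month, which the text does not state explicitly; the field names `center`, `month`, and `complete` are hypothetical and are not PROSAFE fields.

```python
from collections import defaultdict

def months_to_keep(records, max_incomplete=0.10):
    """Return the (center, month) pairs whose share of incomplete
    records does not exceed max_incomplete."""
    totals = defaultdict(int)
    incomplete = defaultdict(int)
    for rec in records:
        key = (rec["center"], rec["month"])
        totals[key] += 1
        if not rec["complete"]:
            incomplete[key] += 1
    return {k for k, n in totals.items() if incomplete[k] / n <= max_incomplete}

def filter_records(records, max_incomplete=0.10):
    """Drop every patient admitted in an excluded (center, month) pair."""
    keep = months_to_keep(records, max_incomplete)
    return [r for r in records if (r["center"], r["month"]) in keep]

# Toy data (hypothetical): January has 1/20 incomplete records (5%),
# February has 2/10 incomplete records (20%).
records = (
    [{"center": 1, "month": "2016-01", "complete": True}] * 19
    + [{"center": 1, "month": "2016-01", "complete": False}] * 1
    + [{"center": 1, "month": "2016-02", "complete": True}] * 8
    + [{"center": 1, "month": "2016-02", "complete": False}] * 2
)
kept = filter_records(records)
print(len(kept), sorted({r["month"] for r in kept}))
```

In this sketch, the whole month is dropped when the threshold is exceeded, including its complete records, which mirrors the wording "all patients admitted within months".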

Outcomes
The main outcome was the difference between expected in-hospital deaths computed after and before surgical procedure, for each ICU participating in the project. Differences in mortality were also investigated in subgroups of patients: elective and non-elective surgery and type of surgical procedure: plastic/replacement of aortic valve (AVR), plastic/replacement of mitral valve (MVP and MVR), coronary artery bypass graft (CABG), and ascending aortic surgery.

Statistical Analysis
Categorical variables are reported as frequency and percentage, and continuous variables as median and interquartile range (IQR). Variable distributions between alive and dead patients were compared using the Wilcoxon-Mann-Whitney test for continuous variables and the chi-squared test for categorical variables.
We estimated the probability of hospital death before and after the cardiac surgical procedure using two logistic regression models. The dataset was randomly split into training and validation sets, containing 85% and 15% of the records, respectively. To develop the former (pre-operative) model, we tested only variables available before surgery (e.g., demographics, comorbidities, and type of cardiac surgery). Variables representing rare events (fewer than 100 patients in the corresponding subgroup or fewer than 30 events per subgroup) were excluded. The association of each variable with outcome was assessed through bivariate logistic regression, and variables with p-values greater than 0.25 were discarded. We tested the linearity of the logit as a function of continuous variables, replacing non-linear relationships with piece-wise linear functions. Forward and backward stepwise selection was adopted to identify variables significantly associated with outcome (p < 0.01). The levels of categorical variables were merged on the basis of clinical reasoning if their odds ratios were not statistically different. To ease the clinical interpretation of the models, in the post-operative model we included all the variables selected in the pre-operative model, regardless of their statistical significance. The lists of variables tested in the forward and backward selection for both models are reported in Supplementary Material File S1.
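The replacement of a non-linear relationship with a piece-wise linear function can be sketched as follows. This is an illustration only: the knots, intercept, and coefficients below are hypothetical and are not taken from the fitted models, and the hinge-term parameterization is one common way to obtain a continuous piece-wise linear logit, not necessarily the one used in this study.

```python
import math

def hinge_terms(x, knots):
    """Expand x into [x, max(0, x - k1), max(0, x - k2), ...].
    A logit that is linear in these terms is continuous and
    piece-wise linear in x, with slope changes at the knots."""
    return [x] + [max(0.0, x - k) for k in knots]

def piecewise_logit(x, intercept, coefs, knots):
    """Logit contribution of one continuous variable."""
    terms = hinge_terms(x, knots)
    return intercept + sum(c * t for c, t in zip(coefs, terms))

def to_probability(logit):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical values for age: no effect up to 65 years, slope 0.05 per
# year between 65 and 80, slope 0.05 + 0.04 = 0.09 per year above 80.
AGE_KNOTS = [65.0, 80.0]
INTERCEPT = -4.0
COEFS = [0.0, 0.05, 0.04]

for age in (60, 70, 85):
    z = piecewise_logit(age, INTERCEPT, COEFS, AGE_KNOTS)
    print(age, round(z, 2), round(to_probability(z), 3))
```

Each coefficient after the first is the change in slope at the corresponding knot, so the expanded terms can be entered into an ordinary logistic regression like any other covariate.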
The calibration of the models was tested overall and in subgroups of patients, as defined by included variables and by clinical relevance, using the GiViTI calibration belt and test [13,14]. Discrimination was investigated by measuring the Area Under the Curve (AUC) in the Receiver Operating Characteristic (ROC) analysis. Calibration and discrimination were assessed on both the training and the validation sets.
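For readers less familiar with the discrimination measure, the AUC can be read as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that definition on made-up data; it does not cover the GiViTI calibration belt, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auc(y_true, y_score):
    """AUC as the fraction of (death, survivor) pairs in which the
    death has the higher predicted risk; ties count as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up outcomes (1 = died in hospital) and predicted probabilities.
died = [0, 0, 1, 0, 1, 1]
risk = [0.02, 0.10, 0.08, 0.05, 0.30, 0.60]
print(round(auc(died, risk), 3))
```

With these toy values, eight of the nine death-survivor pairs are ordered correctly, so the AUC is 8/9.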
Using the expected probabilities computed with the two models, we evaluated, for each ICU and for each subgroup of patients, the expected number of deaths pre-surgery and post-surgery ($e_{\mathrm{pre}}$ and $e_{\mathrm{post}}$, respectively). The relative difference in expected mortality was computed as $d = (e_{\mathrm{post}} - e_{\mathrm{pre}})/e_{\mathrm{pre}}$. The 95% confidence interval of $d$ was estimated by bootstrap analysis, constructing 1000 pre-surgery and 1000 post-surgery models on 1000 simulated datasets, each one with N records randomly resampled from the original dataset with replacement.
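The computation of the relative difference and of a bootstrap interval can be sketched as follows. Two assumptions are made loudly here: the expected number of deaths is taken as the sum of the predicted probabilities, and, unlike the procedure described above, the two models are not refitted on each resampled dataset (only the patients are resampled). The sketch therefore shows the mechanics of the resampling and understates the uncertainty due to model estimation. All data are synthetic.

```python
import random

def relative_difference(p_pre, p_post):
    """d = (e_post - e_pre) / e_pre, where each expected number of
    deaths e is the sum of the predicted probabilities."""
    e_pre = sum(p_pre)
    e_post = sum(p_post)
    return (e_post - e_pre) / e_pre

def bootstrap_ci(p_pre, p_post, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for d, resampling patients with
    replacement. Simplification: the models are NOT refitted."""
    rng = random.Random(seed)
    n = len(p_pre)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(relative_difference([p_pre[i] for i in idx],
                                         [p_post[i] for i in idx]))
    stats.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = n_boot - 1 - lo_idx
    return stats[lo_idx], stats[hi_idx]

# Synthetic center: 200 patients whose post-operative risk is the
# pre-operative risk multiplied by a random factor (hypothetical values).
rng = random.Random(42)
pre = [rng.uniform(0.01, 0.20) for _ in range(200)]
post = [min(1.0, p * rng.uniform(0.6, 1.2)) for p in pre]

d = relative_difference(pre, post)
low, high = bootstrap_ci(pre, post)
print(round(d, 3), round(low, 3), round(high, 3))
```

A negative d indicates that expected mortality decreased after surgery, and a positive d that it increased, which is the reading used for the center-level comparison.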

Study Population
Between 2016 and 2017, we collected data from 23,086 adult patients from 20 centers. Among them, 16,787 patients underwent a cardio-surgical intervention other than endoprosthesis of the descending aorta. According to the exclusion criteria listed in the Methods section, we excluded patients admitted to the ICU before the intervention and readmissions; thus, we analyzed 15,882 patients (Figure 1). The models were developed on 15,533 patients with non-missing outcomes, using 13,211 records for training and 2322 for validation.

Patients' Characteristics and Prognostic Models
Patients' demographics, preoperative characteristics, and outcomes are described in Table 1. Figure 2A reports pre-operative (left) and post-operative variables (right) significantly associated with outcome. The odds ratios (OR) of the continuous variables, creatinine clearance and age, are plotted in Figure 2B. Patients' features included in the model and not present in Table 1 are described in Supplementary Material S2. The AUC is 0.81 and 0.87 on the training set for the pre- and post-operative models, respectively. The p-value of the GiViTI calibration test is 0.17 and 0.74 on the training set for the pre- and post-operative models, respectively. On the validation set, the AUC is 0.79 and 0 and the p-value of the calibration test is 0.07 and 0.10, respectively. ROC curves and GiViTI calibration belts are reported in Supplementary Material S3.

Difference between Pre- and Post-Operative Mortality
The difference in pre-operative and post-operative expected mortality for each center, normalized by pre-operative expected mortality, is reported in Figure 3. The difference between these two predictive models is a proxy of the intraoperative performance for each center. Centers with lower post-operative mortality are in the left region of the plot, while centers with increased mortality after surgery are toward the right of the plot. Centers 9, 10, and 11 demonstrate similar pre- and post-operative mortality.

Subgroup and Centre-Specific Analysis
Further analyses were performed to identify possible sub-groups of patients influencing perioperative mortality within each center.
For practical reasons, we present data from four centers (center identifiers 1, 6, 10, and 19) to illustrate the findings of our double-fold pre- and post-operative model. Figure 4A shows the subgroup analysis for center 19, where the patient probability of death is consistently reduced after surgery in the overall population and for different subgroups of patients. Figure 4B reports the performance of center 1, where the probability of death increases after the surgical act in all subgroups of patients. Center 10 has similar pre- and post-operative mortality when considering the overall model (Figure 3). However, when considering sub-analyses according to types of intervention, mitral valve surgery (MVS) increases post-operative mortality compared to pre-operative prediction (Figure 4C). When analyzing the two types of mitral intervention (MVP vs. MVR) separately, the effect on mortality was related to MVR only, as MVP patients reached good postoperative outcomes. Indeed, MVR had a 0.34 relative mortality increase between pre- and post-operative models (95% CI 0.22-0.41, 133 patients; 12.6 and 16.7 expected deaths with the pre- and post-operative models, respectively). MVP had a −0.13 relative difference in mortality between pre- and post-operative models (95% CI −0.28 to −0.015, 131 patients; 4.2 and 3.7 expected deaths in the pre- and post-operative models, respectively). Center 10 is characterized by a prominent surgeon expert in mitral valve procedures.
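As a check on the direction of these figures, the relative difference, defined as post-operative minus pre-operative expected deaths divided by pre-operative expected deaths, can be recomputed from the rounded expected deaths quoted for center 10:

```latex
\begin{align*}
d_{\mathrm{MVR}} &= \frac{16.7 - 12.6}{12.6} \approx 0.33, \\
d_{\mathrm{MVP}} &= \frac{3.7 - 4.2}{4.2} \approx -0.12.
\end{align*}
```

A positive value corresponds to higher expected mortality after surgery (MVR) and a negative value to lower expected mortality (MVP). The recomputed values are close to, but not identical with, the reported 0.34 and −0.13; the small gap is presumably due to rounding of the expected deaths, which is an assumption, since the text does not state how the point estimates were obtained.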
When calculating mortality according to procedure and to lead surgeon, the most experienced surgeon reduced mortality in MVP, while slightly increasing mortality in MVR cases. Indeed, the pre- and post-operative differences for MVR mortality were 0.

Discussion
Our results show that the predicted post-operative mortality differs from the predicted pre-operative mortality. The distance between the two probabilities expresses the impact of surgery on patient outcomes, and may be used to benchmark intra-operative performances.
Surgery is the sum of both organizational factors (hospital services, including the blood bank and ancillary services for the operating block) and competences from several professionals, such as cardiac surgeons, cardiac anesthesiologists, nurses, perfusionists, and healthcare assistants. These aspects are closely linked to the level of resources, and while it is difficult to unbundle the role of each team component, it is clear that the first surgical operator has a major role in overall surgical performance [15,16].
The new tool proposed in this work allows the identification of significant differences between centers from our cohort, either in the overall population or in subgroup analyses, as exemplified in Figure 3.
When the (either good or bad) performance of a centre is consistent in all patient groups, one may argue that protective or detrimental factors are spread throughout the whole care pathway.
Subgroup analyses may detect specific areas of excellence or critical issues in centers where the overall mortality may appear within a range of normality. This is the case of center 10 (Figure 4C), where our models were able to identify differences in performance when stratifying by either the type of intervention or the surgical team. Specifically, patients undergoing a certain surgical intervention had a lower mortality when treated by a team including a surgeon experienced in that cardiac surgical technique. This example suggests, as expected, that it is reasonable to attribute to the lead surgeon and to his/her team a significant portion of the responsibility for overall intraoperative performance, and that the development of highly specialized techniques may represent both an intellectual investment and a productive resource, first and foremost in health terms, for the medical service.

Limitations
Intraoperative deaths were excluded from the analysis, as they lack post-operative data. However, intraoperative mortality in our population was low, in accordance with the published literature for elective, urgent and emergent cardiac surgery cases.
In our model, we considered only data available at admission to and discharge from the operating theatre. Other studies considered pre-hospital variables, including schooling, economic status, and social and family status (being single, socially isolated, or without a caregiver), which may have a weight in prognostic scores [17,18].
Moreover, we acknowledge that our system for data collection represents a significant workload for each center, given the large amount of collected information and the number of internal quality checks. On the one hand, this may hinder participation, while on the other, it ensures data quality.

Conclusions
We built a double-fold predictive tool that is useful to evaluate hospital performance at different moments of the cardiac surgery perioperative pathway, thus allowing the contribution of intraoperative performance to be analyzed. Our model allows different subgroups of patients and distinct types of surgery to be analyzed, promoting improvement through the recognition of possible quality barriers and through the identification and reinforcement of good and exportable clinical practices.