Case-mix affects calibration of cardiosurgical severity scores

Anna Zamperoni; Carlotta Rossi; Stefano Finazzi; Paolo Del Sarto; Matteo Mondini; Giovanni Nattino; Daniele Poole; Guido Bertolini; Cardiac surgical intensive care writing committee (GiViTI)

doi:10.23736/S0375-9393.20.14280-9

Minerva Anestesiol. 2020 Jul;86(7):719-726. doi: 10.23736/S0375-9393.20.14280-9. Epub 2020 Mar 6.

Authors

Anna Zamperoni¹, Carlotta Rossi², Stefano Finazzi³, Paolo Del Sarto⁴, Matteo Mondini², Giovanni Nattino⁵, Daniele Poole⁶, Guido Bertolini²; Cardiac surgical intensive care writing committee (GiViTI)

Collaborators

Cardiac surgical intensive care writing committee (GiViTI):
Roberto Agostinelli, Andrea Balata, Giuseppe Buscaglia, Mauro A Calo', Graziano Cortis, Massimiliano Greco, Matteo Lucchelli, Ricardo M Escobar, Marco Maurelli, Carolina Monaco, Sandra Nonini, Alessandro Rech, Gianluigi Redaelli, Alberto Seno, Andrea Ardenega

Affiliations

¹ Cà Foncello Hospital, Aulss2, Treviso, Italy.
² IRCCS Mario Negri Institute for Pharmacological Research, Villa Camozzi, Ranica, Bergamo, Italy.
³ IRCCS Mario Negri Institute for Pharmacological Research, Villa Camozzi, Ranica, Bergamo, Italy - stefano.finazzi@marionegri.it.
⁴ Department of Critical Care, Fondazione Toscana G. Monasterio, G. Pasquinucci Heart Hospital, Massa, Italy.
⁵ Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA.
⁶ Anesthesia and Intensive Care Operative Unit, San Martino Hospital, Belluno, Italy.

PMID: 32154682
DOI: 10.23736/S0375-9393.20.14280-9

Abstract

Background: Prognostic models are often used to assess the quality of healthcare. Several scores have been developed to predict mortality after cardiac surgery, but none has reached optimal performance in subsequent validations. We validated the most widely used scores (EUROSCORE I and II, STS, and ACEF) on a cohort of cardiac-surgery patients, assessing their robustness against case-mix changes.

Methods: The scores were validated on 14,559 patients admitted to 16 Italian cardiosurgical ICUs participating in the Margherita-Prosafe project in 2014 and 2015. Calibration was assessed with the Hosmer-Lemeshow test, the standardized mortality ratio, and the GiViTI calibration test and belt. Discrimination was measured by the area under the ROC curve.
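As an illustration of two of the measures named above, the following minimal Python sketch computes a standardized mortality ratio (observed deaths divided by the deaths expected from the predicted probabilities) and the area under the ROC curve (the probability that a patient who died received a higher predicted risk than one who survived, with ties counted as one half). The function names and the eight-patient sample are hypothetical and are not taken from the study; the Hosmer-Lemeshow test and the GiViTI calibration belt are not reproduced here.

```python
from typing import Sequence


def standardized_mortality_ratio(died: Sequence[int],
                                 predicted: Sequence[float]) -> float:
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    expected = sum(predicted)
    if expected == 0:
        raise ValueError("expected number of deaths is zero")
    return sum(died) / expected


def area_under_roc(died: Sequence[int], predicted: Sequence[float]) -> float:
    """AUC as the fraction of (died, survived) pairs ranked correctly.

    Ties in predicted risk count as half a correctly ranked pair.
    """
    risks_died = [p for d, p in zip(died, predicted) if d == 1]
    risks_survived = [p for d, p in zip(died, predicted) if d == 0]
    if not risks_died or not risks_survived:
        raise ValueError("both outcomes must be present")
    correct = 0.0
    for pd in risks_died:
        for ps in risks_survived:
            if pd > ps:
                correct += 1.0
            elif pd == ps:
                correct += 0.5
    return correct / (len(risks_died) * len(risks_survived))


# Hypothetical sample: 1 = died in hospital, 0 = survived.
died = [0, 0, 1, 0, 1, 0, 0, 1]
predicted = [0.05, 0.10, 0.40, 0.20, 0.70, 0.05, 0.30, 0.20]

smr = standardized_mortality_ratio(died, predicted)
auc = area_under_roc(died, predicted)
print(f"SMR = {smr:.2f}, AUC = {auc:.2f}")
```

An SMR above 1 indicates more deaths than the score predicted (underestimation of mortality), and an SMR below 1 indicates overestimation. The pairwise AUC loop is quadratic in the number of patients and is meant only to make the definition explicit.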

Results: The STS Score could be calculated for 10,317 eligible patients (4,156 isolated valve, 4,681 isolated CABG, and 1,480 single valve plus CABG) and calibrated well in these subgroups. The ACEF Score and EUROSCORE I and II were available for 14,139 and 14,071 patients, respectively. EUROSCORE I significantly overestimated mortality; EUROSCORE II calibrated well overall but underestimated mortality in patients undergoing complex surgery and in those undergoing non-elective surgery. The ACEF Score calibrated poorly in both elective and non-elective patients. Discrimination was acceptable (AUC>0.70) for all models except the ACEF Score.

Conclusions: Cardiac surgery scores calibrate poorly when the case-mix of the validation sample differs from that of the development sample. To ensure reliability for benchmarking, they should be validated in the clinical settings in which they are applied and updated periodically. Advanced statistical tools are essential for the correct interpretation and application of severity scores.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Calibration*
Hospital Mortality
Humans
Reproducibility of Results
Retrospective Studies
Risk Assessment