Objective: Survival probability predictions in critically ill patients are mainly used to measure the efficacy of intensive care unit (ICU) treatment. The available models are functions induced from data on thousands of patients. In some cases, the variables these models require are not part of clinical routine and may not be recorded for some patients. In this paper, we propose a new method to build scoring functions that make reliable predictions yet can be induced from the records of a small set of patients described by only a few variables.
Methods: We present a learning method based on support vector machines (SVM), together with a detailed study of its prediction performance, in different contexts, when using groups of variables defined according to their source of information: monitoring devices, laboratory findings, and demographic and diagnostic features.
Results: We employed a data set collected in 10 general ICUs of hospitals in Spain, 6 of which treat coronary patients while the other 4 do not. The total number of patients considered in our study was 2501, 19.83% of whom did not survive. Using these data, we report a comparison of the SVM method proposed here with other approaches based on logistic regression (LR), including a second-level recalibration of release III of the Acute Physiology and Chronic Health Evaluation (APACHE, a scoring system commonly used in ICUs) induced from the available data. The SVM method outperforms all of them, and the differences are statistically significant. Comparison with the commercial version of APACHE III shows that the SVM scores are slightly better when working with data sets of more than 500 patients.
Conclusions: From a practical point of view, the research reported here may help in building cheap and reliable prediction systems tailored to the peculiarities of each ICU and its kinds of patients.