Background: Ischemic stroke is a critical neurological condition, and infection is a major concern in its clinical management. Sepsis, a life-threatening organ dysfunction resulting from infection, is among the most dangerous complications in the intensive care unit (ICU). Currently, no model exists to predict the onset of sepsis in ischemic stroke patients. This study aimed to develop the first such predictive model by applying machine learning techniques to data from the MIMIC-IV database.
Methods: A total of 2238 adult patients with a diagnosis of ischemic stroke who were admitted to the ICU for the first time were included from the MIMIC-IV database. The outcome of interest was the development of sepsis. Model development adhered to the TRIPOD guidelines. Feature selection was performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression, which identified 28 key variables. Multiple machine learning algorithms, including logistic regression, k-nearest neighbors, support vector machines, decision trees, and XGBoost, were trained and internally validated. After the models' performance metrics were compared, XGBoost was selected as the optimal model. The SHapley Additive exPlanations (SHAP) method was used to interpret the XGBoost model, revealing the impact of individual features on predictions. The model was also deployed on a user-friendly platform for practical use in clinical settings.
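The workflow described in the Methods (L1-penalized feature selection followed by a comparison of several classifiers on a held-out validation set) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic rather than MIMIC-IV, the hyperparameters are arbitrary, only a subset of the listed algorithms is shown, and scikit-learn's GradientBoostingClassifier stands in for XGBoost so that the sketch depends on a single library. It is not the authors' code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort (assumption: NOT MIMIC-IV data).
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=8, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize using training-set statistics only, to avoid leakage.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# LASSO-style selection for a binary outcome: L1-penalized logistic
# regression drives uninformative coefficients to exactly zero.
# C=0.1 is an arbitrary illustrative penalty strength.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
selector = SelectFromModel(lasso, threshold=1e-5).fit(X_train_s, y_train)
selected = np.flatnonzero(selector.get_support())

# Candidate models (hypothetical settings; gradient boosting is a
# stand-in for XGBoost).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=15),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the selected features and score it by validation AUC.
aucs = {}
for name, model in models.items():
    model.fit(X_train_s[:, selected], y_train)
    prob = model.predict_proba(X_val_s[:, selected])[:, 1]
    aucs[name] = roc_auc_score(y_val, prob)

best_model = max(aucs, key=aucs.get)
print(f"{len(selected)} features kept; best model by AUC: {best_model}")
```

In practice the penalty strength would be tuned by cross-validation rather than fixed, and model choice would weigh calibration and clinical utility alongside AUC.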
Results: The XGBoost model demonstrated superior performance in the validation set, achieving an area under the curve (AUC) of 0.863 and offering greater net benefit than the other models. SHAP analysis identified key factors influencing sepsis risk, including the use of invasive mechanical ventilation on the first day, excessive body weight, a Glasgow Coma Scale verbal score below 3, age, and elevated body temperature (>37.5 °C). A user interface was developed to enable clinicians to easily access and use the model.
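Net benefit, the quantity compared across models above, comes from decision curve analysis: at a chosen risk threshold p_t, it is the true-positive rate per patient minus the false-positive rate per patient weighted by the odds p_t / (1 - p_t). The sketch below computes it in plain Python. The function names and the ten-patient toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): 3 septic patients out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1]

model_nb = net_benefit(y_true, y_prob, threshold=0.5)
all_nb = treat_all_net_benefit(y_true, threshold=0.5)
print(f"model: {model_nb:.2f}, treat-all: {all_nb:.2f}")
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.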
Conclusions: This study developed the first machine learning-based model to predict sepsis in ischemic stroke patients. The model showed good discrimination in internal validation and holds potential as a clinical decision support tool, enabling earlier identification of high-risk patients and facilitating preventive measures to reduce sepsis incidence and mortality in this population.
Keywords: XGBoost; ischemic stroke; machine learning; prediction model; sepsis.