Deep survival analysis for interpretable time-varying prediction of preeclampsia risk

J Biomed Inform. 2024 Aug:156:104688. doi: 10.1016/j.jbi.2024.104688. Epub 2024 Jul 11.

Abstract

Objective: Survival analysis is widely used in healthcare to predict the timing of disease onset. Traditional survival analysis methods are usually based on the Cox Proportional Hazards model and assume proportional risk for all subjects. However, this assumption rarely holds for most diseases, as the underlying factors have complex, non-linear, and time-varying relationships. This concern is especially relevant for pregnancy, where the risk of pregnancy-related complications, such as preeclampsia, varies across gestation. Recently, deep learning survival models have shown promise in addressing the limitations of classical models, as they can handle non-proportional risk, capture nonlinear relationships, and model complex temporal dynamics.
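For readers less familiar with the proportional-risk assumption, it can be written out in the standard textbook form of the Cox model. The notation below (baseline hazard h_0, coefficient vector \beta, covariate vectors x_i and x_j) is conventional and is not taken from the paper itself.

```latex
% Cox model: hazard for a subject with covariates x
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Hazard ratio between two subjects i and j: the baseline h_0(t) cancels,
% so the ratio does not depend on time t
\frac{h(t \mid x_i)}{h(t \mid x_j)}
  = \frac{h_0(t)\,\exp\!\left(\beta^{\top} x_i\right)}{h_0(t)\,\exp\!\left(\beta^{\top} x_j\right)}
  = \exp\!\left(\beta^{\top} (x_i - x_j)\right)
```

Because the ratio is constant in t, a standard Cox model cannot represent one patient being at relatively higher risk early in gestation and another later on, which is the kind of time-varying pattern at issue here.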

Methods: We present a methodology to model the temporal risk of preeclampsia during pregnancy and investigate the associated clinical risk factors. We used a retrospective dataset of 66,425 pregnant individuals who delivered at two tertiary care centers from 2015 to 2023. We modeled preeclampsia risk by modifying DeepHit, a deep survival model that uses a neural network architecture to capture time-varying relationships between covariates in pregnancy. We applied time series k-means clustering to DeepHit's normalized output and investigated interpretability using Shapley values.
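The clustering step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the risk curves are made-up toy values, the normalization (scaling each curve to sum to 1) is an assumption, and plain Euclidean k-means stands in for whatever time series variant the study used. All function names are hypothetical.

```python
def normalize(curve):
    """Scale a non-negative risk curve to sum to 1 (assumed normalization)."""
    total = sum(curve)
    return [v / total for v in curve] if total > 0 else [0.0] * len(curve)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_curve(curves):
    """Pointwise mean of a non-empty list of curves."""
    return [sum(vals) / len(curves) for vals in zip(*curves)]

def init_centroids(curves, k):
    """Deterministic farthest-point initialization (no random seed needed)."""
    centroids = [curves[0]]
    while len(centroids) < k:
        far = max(curves, key=lambda c: min(sq_dist(c, m) for m in centroids))
        centroids.append(far)
    return centroids

def kmeans(curves, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(curves, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(c, centroids[j])) for c in curves
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_curve(members)
    return labels, centroids

# Toy per-patient risk over four gestational time bins (illustrative only).
raw_curves = [
    [0.01, 0.01, 0.01, 0.01],  # flat risk
    [0.02, 0.02, 0.02, 0.02],  # flat risk
    [0.30, 0.10, 0.02, 0.01],  # risk concentrated early
    [0.25, 0.12, 0.03, 0.02],  # risk concentrated early
    [0.01, 0.02, 0.10, 0.30],  # risk concentrated late
    [0.02, 0.03, 0.12, 0.25],  # risk concentrated late
]
curves = [normalize(c) for c in raw_curves]
labels, centroids = kmeans(curves, k=3)
print(labels)
```

In this toy setting the three recovered clusters correspond to flat, early-peaking, and late-peaking curves. Note that sum-normalization discards overall risk magnitude, so clusters here reflect curve shape only; whether the study's normalization behaves the same way is not stated in the abstract.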

Results: We demonstrate that DeepHit can effectively handle high-dimensional data and evolving risk hazards over time with performance similar to the Cox Proportional Hazards model; both models achieved an area under the curve (AUC) of 0.78. Beyond discrimination, the deep survival model outperformed the traditional methodology by identifying time-varying risk trajectories for preeclampsia, providing insights for early and individualized intervention. K-means clustering separated patients into low-risk, early-onset, and late-onset preeclampsia groups; notably, each group has distinct risk factors.
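As a reminder of what an AUC figure measures, the sketch below computes the binary ROC AUC as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half. This is the generic definition on toy data; the abstract does not say whether the study used this form or a time-dependent variant, and the function name is hypothetical.

```python
def roc_auc(outcomes, scores):
    """ROC AUC via pairwise comparison of case and non-case scores."""
    pos = [s for y, s in zip(outcomes, scores) if y == 1]
    neg = [s for y, s in zip(outcomes, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Toy data: 1 = developed the outcome, 0 = did not (illustrative only).
outcomes = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(outcomes, scores)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so two models with the same AUC rank patients about equally well overall even if the shapes of their predicted risk curves differ.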

Conclusion: This work demonstrates a novel application of deep survival analysis to time-varying prediction of preeclampsia risk. Our results highlight the advantage of deep survival models over Cox Proportional Hazards models in providing personalized risk trajectories and demonstrate the potential of deep survival models to generate interpretable and clinically meaningful applications in medicine.

Keywords: Cox proportional hazards model; Deep learning; K-means clustering; Neural networks; Preeclampsia; Survival analysis.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Deep Learning
  • Female
  • Humans
  • Neural Networks, Computer
  • Pre-Eclampsia* / mortality
  • Pregnancy
  • Proportional Hazards Models
  • Retrospective Studies
  • Risk Assessment / methods
  • Risk Factors
  • Survival Analysis