Enhancing Suicide Attempt Risk Prediction Models with Temporal Clinical Note Features

Kevin J Krause; Sharon E Davis; Zhijun Yin; Katherine M Schafer; Samuel Trent Rosenbloom; Colin G Walsh

doi:10.1055/a-2411-5796

Appl Clin Inform. 2024 Oct;15(5):1107-1120. doi: 10.1055/a-2411-5796. Epub 2024 Sep 9.

Authors

Kevin J Krause¹, Sharon E Davis¹, Zhijun Yin¹, Katherine M Schafer¹, Samuel Trent Rosenbloom¹, Colin G Walsh¹

Affiliation

¹ Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States.

PMID: 39251213
PMCID: PMC11655152 (available on 2025-12-18)
DOI: 10.1055/a-2411-5796

Abstract

Objectives: This study investigated the impact of enhancing a structured-data-based suicide attempt risk prediction model with temporal Concept Unique Identifiers (CUIs) derived from clinical notes. We examined how different temporal schemes, model types, and prediction ranges influenced the model's predictive performance, with the aim of better understanding how integrating temporal information and transforming clinical variables could enhance model predictions.

Methods: We identified modeling targets using diagnostic codes for suicide attempts within 30, 90, or 365 days following a temporally grouped visit cluster. Structured data included medications, diagnoses, procedures, and demographics, whereas unstructured data consisted of terms extracted with regular expressions from clinical notes. We compared models trained only on structured data (controls) to hybrid models trained on both structured and unstructured data. We used two temporalization schemes for clinical notes: fixed 90-day windows and flexible epochs. We trained random forests and hybrid long short-term memory (LSTM) neural networks and assessed them using area under the precision-recall curve (AUPRC) and area under the receiver operating characteristic curve (AUROC), with additional evaluation of sensitivity and positive predictive value at 95% specificity.
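To make the threshold-based metrics concrete, the following is a minimal sketch of how sensitivity and positive predictive value could be read off at a fixed specificity. It is not the authors' evaluation code: the function name `metrics_at_specificity`, the tie-handling rule (predict positive when the score is at or above the threshold), and the choice of the lowest qualifying threshold are all assumptions made for illustration.

```python
def metrics_at_specificity(y_true, y_score, target_specificity=0.95):
    """Return (threshold, sensitivity, ppv, specificity) at the lowest score
    threshold whose specificity is at least `target_specificity`.

    Hypothetical helper for illustration only. A case is predicted positive
    when its score is >= threshold. The loop is quadratic in the number of
    cases, which is fine for a sketch but not for millions of rows.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = sum(1 for y in y_true if y == 0)
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    # Raising the threshold can only raise specificity, so the first
    # threshold that meets the target also gives the highest sensitivity.
    for threshold in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
        specificity = (negatives - fp) / negatives
        if specificity >= target_specificity:
            sensitivity = tp / positives
            ppv = tp / (tp + fp) if (tp + fp) > 0 else 0.0
            return threshold, sensitivity, ppv, specificity

    # No observed score reaches the target: flag nobody as positive.
    return float("inf"), 0.0, 0.0, 1.0


if __name__ == "__main__":
    # Toy data (not from the study): 20 negatives and 2 positives.
    labels = [0] * 20 + [1, 1]
    scores = list(range(1, 21)) + [19.5, 5]
    print(metrics_at_specificity(labels, scores))
```

With an outcome this rare, positive predictive value at a fixed specificity is bounded mainly by prevalence, which is why it is reported alongside sensitivity rather than in place of it.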

Results: The training set included 2,364,183 visit clusters with 2,009 30-day suicide attempts, and the testing set contained 471,936 visit clusters with 480 suicide attempts. Models trained with temporal CUIs outperformed those trained with only structured data. The window-temporalized LSTM model achieved the highest AUPRC (0.056 ± 0.013) for the 30-day prediction range. Hybrid models generally showed better performance compared with controls across most metrics.
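The AUPRC values are easier to interpret against the outcome prevalence, because a classifier that ranks cases at random has an expected AUPRC roughly equal to the prevalence. The short calculation below uses only the counts reported above; it assumes that the 480 test-set attempts refer to the same 30-day range as the reported AUPRC, which the abstract does not state explicitly. The variable names are illustrative.

```python
# Counts taken from the Results paragraph.
TRAIN_CLUSTERS = 2_364_183
TRAIN_ATTEMPTS_30D = 2_009
TEST_CLUSTERS = 471_936
TEST_ATTEMPTS = 480  # assumed here to be 30-day attempts

BEST_AUPRC_30D = 0.056  # window-temporalized LSTM, 30-day range


def prevalence(events: int, total: int) -> float:
    """Fraction of visit clusters followed by a recorded suicide attempt."""
    return events / total


train_prev = prevalence(TRAIN_ATTEMPTS_30D, TRAIN_CLUSTERS)
test_prev = prevalence(TEST_ATTEMPTS, TEST_CLUSTERS)

# Ratio of the reported AUPRC to the chance-level baseline (the prevalence).
lift_over_chance = BEST_AUPRC_30D / test_prev

print(f"training prevalence: {train_prev:.5%}")
print(f"testing prevalence:  {test_prev:.5%}")
print(f"AUPRC relative to chance: {lift_over_chance:.1f}x")
```

Under that assumption, prevalence is below 0.1% in the training set and about 0.1% in the testing set, so an AUPRC of 0.056 is low in absolute terms but roughly 55 times the chance-level baseline.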

Conclusion: This study demonstrated that incorporating electronic health record-derived clinical note features enhanced suicide attempt risk prediction models, particularly with window-temporalized LSTM models. Our results underscored the value of unstructured data in suicidality prediction, aligning with previous findings. Future research should focus on integrating more sophisticated methods to continue improving prediction accuracy, which could enhance the effectiveness of future interventions.

MeSH terms

Female
Humans
Male
Neural Networks, Computer
Risk Assessment / methods
Suicide, Attempted* / statistics & numerical data
Time Factors

Grants and funding

This research has been supported by several funding bodies. The primary source of funding was the National Library of Medicine (NLM) T15 training grant (grant number: 2T15LM007450-20). Additional support came from the Evelyn Selby Stead Fund for Innovation, Vanderbilt University Medical Center, specifically grants R01 MH121455 and R01 MH116269. The Military Suicide Research Consortium also provided funding through grant W81XWH-10-2-0181. Finally, funding for the Research Derivative and BioVU Synthetic Derivative was provided by the National Center for Research Resources (grant number: UL1 RR024975/RR/NCRR). The funders had no role in study design, data collection and analysis, or manuscript preparation.