Recurrent Neural Networks for Early Detection of Heart Failure From Longitudinal Electronic Health Record Data: Implications for Temporal Modeling With Respect to Time Before Diagnosis, Data Density, Data Quantity, and Data Type

Robert Chen; Walter F Stewart; Jimeng Sun; Kenney Ng; Xiaowei Yan

doi:10.1161/CIRCOUTCOMES.118.005114

Circ Cardiovasc Qual Outcomes. 2019 Oct;12(10):e005114. doi: 10.1161/CIRCOUTCOMES.118.005114. Epub 2019 Oct 15.

Authors

Robert Chen¹ ², Walter F Stewart³, Jimeng Sun², Kenney Ng⁴, Xiaowei Yan¹

Affiliations

¹ Research, Sutter Health Research, Walnut Creek, CA (R.C., X.Y.).
² School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA (R.C., J.S.).
³ Step2Works, Orinda, CA (W.F.S.).
⁴ Center for Computational Health, IBM Research, T.J. Watson Research Center, Yorktown Heights, NY (K.N.).

Abstract

Background: We determined the impact of data volume and diversity and training conditions on recurrent neural network methods compared with traditional machine learning methods.

Methods and results: Using longitudinal electronic health record data, we assessed the relative performance of machine learning models trained to detect a future diagnosis of heart failure in primary care patients. Model performance was assessed in relation to data parameters defined by the combination of different data domains (data diversity), the number of patient records in the training data set (data quantity), the number of encounters per patient (data density), the prediction window length, and the observation window length (ie, the time period before the prediction window that is the source of features for prediction). Data on 4370 incident heart failure cases and 30 132 group-matched controls were used. Recurrent neural network model performance was superior under a variety of conditions that included (1) when data were less diverse (eg, a single data domain like medication or vital signs) given the same training size; (2) as data quantity increased; (3) as density increased; (4) as the observation window length increased; and (5) as the prediction window length decreased. When all data domains were used, the performance of recurrent neural network models increased in relation to the quantity of data used (ie, up to 100% of the data). When data are sparse (ie, fewer features or low dimension), model performance is lower, but a much smaller training set size is required to achieve optimal performance compared with conditions where data are more diverse and include more features.
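The relationship between the observation window, the prediction window, and data density can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that the prediction window ends at an index date (for example, the diagnosis date for a case), that events are stored as (date, code) pairs, and that one encounter corresponds to one distinct event date. The function names, window lengths, and event codes are hypothetical.

```python
from datetime import date, timedelta


def observation_window_events(events, index_date, prediction_days, observation_days):
    """Return the events that fall inside the observation window.

    Assumed layout (hypothetical): the prediction window covers the
    `prediction_days` immediately before `index_date`, and the observation
    window covers the `observation_days` immediately before the prediction
    window. Only observation-window events are used as features.
    """
    obs_end = index_date - timedelta(days=prediction_days)
    obs_start = obs_end - timedelta(days=observation_days)
    return [(d, code) for d, code in events if obs_start <= d < obs_end]


def encounter_density(events):
    """Count encounters, assuming one encounter per distinct event date."""
    return len({d for d, _ in events})


# Hypothetical patient record: (event date, event code) pairs.
patient_events = [
    (date(2015, 1, 10), "dx:hypertension"),  # before the observation window
    (date(2016, 3, 1), "rx:metformin"),      # inside the observation window
    (date(2016, 3, 1), "vital:bp"),          # same encounter date as above
    (date(2016, 12, 20), "lab:bnp"),         # inside the prediction window
]

features = observation_window_events(
    patient_events,
    index_date=date(2017, 1, 1),
    prediction_days=180,
    observation_days=365,
)
density = encounter_density(features)
```

In this sketch, lengthening `observation_days` admits more history as features, while lengthening `prediction_days` pushes the usable data further from the index date, which corresponds to conditions (4) and (5) above.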

Conclusions: Recurrent neural networks are effective for predicting a future diagnosis of heart failure given sufficient training set size. Model performance appears to continue to improve in direct relation to training set size.

Keywords: diagnosis; electronic health records; heart failure; machine learning; mortality.

Publication types

Comparative Study
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Alcohol Drinking / adverse effects
Alcohol Drinking / ethnology
California / epidemiology
Diagnosis, Computer-Assisted*
Early Diagnosis
Electronic Health Records*
Female
Heart Failure / diagnosis*
Heart Failure / ethnology
Heart Failure / physiopathology
Humans
Incidence
Longitudinal Studies
Machine Learning*
Male
Neural Networks, Computer*
Predictive Value of Tests
Primary Health Care
Reproducibility of Results
Risk Factors
Smoking / adverse effects
Smoking / ethnology
Time Factors
Vital Signs*
