Harnessing repeated measurements of predictor variables for clinical risk prediction: a review of existing methods

Lucy M Bull; Mark Lunt; Glen P Martin; Kimme Hyrich; Jamie C Sergeant

doi:10.1186/s41512-020-00078-z

Diagn Progn Res. 2020 Jul 9:4:9. doi: 10.1186/s41512-020-00078-z. eCollection 2020.

Authors

Lucy M Bull¹,², Mark Lunt¹, Glen P Martin³, Kimme Hyrich¹,⁴, Jamie C Sergeant¹,²

Affiliations

¹ Centre for Epidemiology Versus Arthritis, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK.
² Centre for Biostatistics, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK.
³ Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.
⁴ National Institute for Health Research Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK.

Abstract

Background: Clinical prediction models (CPMs) predict the risk of health outcomes for individual patients. The majority of existing CPMs only harness cross-sectional patient information. Incorporating repeated measurements, such as those stored in electronic health records, into CPMs may provide an opportunity to enhance their performance. However, the number and complexity of methodological approaches available could make it difficult for researchers to explore this opportunity. Our objective was to review the literature and summarise existing approaches for harnessing repeated measurements of predictor variables in CPMs, primarily to make this field more accessible for applied researchers.

Methods: MEDLINE, Embase and Web of Science were searched for articles reporting the development of a multivariable CPM for individual-level prediction of future binary or time-to-event outcomes and modelling repeated measurements of at least one predictor. Information was extracted on the following: the methodology used, its specific aim, reported advantages and limitations, and software available to apply the method.

Results: The search identified 217 relevant articles. Seven methodological frameworks were identified: time-dependent covariate modelling, generalised estimating equations, landmark analysis, two-stage modelling, joint modelling, trajectory classification and machine learning. Each of these frameworks satisfies at least one of three aims: (1) to better represent the predictor-outcome relationship over time, (2) to infer a covariate value at a pre-specified time and (3) to account for the effect of covariate change.

Conclusions: The applicability of identified methods depends on the motivation for including longitudinal information and the method's compatibility with the clinical context and available patient data, for both model development and risk estimation in practice.

Keywords: Clinical risk prediction; Dynamic prediction; Electronic health records; Joint models; Longitudinal data; Personalised medicine; Prediction models; Repeated observations; Survival analysis; Time-dependent covariates.

Publication types

Review

Grants and funding

DRF-2018-11-ST2-052/DH_/Department of Health/United Kingdom