Identifying Patients With High Data Completeness to Improve Validity of Comparative Effectiveness Research in Electronic Health Records Data

Clin Pharmacol Ther. 2018 May;103(5):899-905. doi: 10.1002/cpt.861. Epub 2017 Oct 10.

Abstract

Electronic health record (EHR)-discontinuity, i.e., having medical information recorded outside of the study EHR system, is associated with substantial information bias in EHR-based comparative effectiveness research (CER). We aimed to develop and validate a prediction model that identifies patients with high EHR-continuity in order to reduce this bias. Using data on 183,739 patients aged ≥65 years from the EHRs of two US provider networks, linked with Medicare claims data from 2007 to 2014, we quantified EHR-continuity as the mean proportion of encounters captured (MPEC) by the EHR system. We built a prediction model for MPEC using one EHR system as the training set and the other as the validation set. Patients in the top 20% of predicted EHR-continuity had 3.5- to 5.8-fold smaller misclassification of 40 CER-relevant variables than the remaining study population. The comorbidity profiles did not differ substantially by predicted EHR-continuity. These findings suggest that restricting CER to patients with high predicted EHR-continuity may confer a favorable trade-off between validity and generalizability.
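
As an illustration only, the sketch below shows one way a per-patient MPEC and the top-20% restriction could be computed. It is not the authors' implementation. The abstract does not give the exact formula, so several details here are assumptions: the proportion is computed separately per encounter type and then averaged, the linked claims are treated as the complete record, proportions are capped at 1, and the names mpec, top_fraction, and the "inpatient"/"outpatient" keys are hypothetical.

```python
from statistics import mean


def mpec(ehr_counts, claims_counts):
    """Mean proportion of encounters captured for one patient (illustrative).

    ehr_counts and claims_counts map an encounter type (hypothetical keys
    such as "inpatient" or "outpatient") to a count of encounters.
    Assumption: claims are the complete record, so the proportion captured
    for a type is EHR count / claims count, capped at 1.0.
    Types with no claims encounters are skipped.
    """
    proportions = [
        min(ehr_counts.get(kind, 0) / total, 1.0)
        for kind, total in claims_counts.items()
        if total > 0
    ]
    return mean(proportions) if proportions else None


def top_fraction(predicted, fraction=0.2):
    """Return the IDs of patients in the top `fraction` by predicted value."""
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return set(ranked[:keep])


# Hypothetical patient: 1 of 2 inpatient and 3 of 4 outpatient encounters
# appear in the EHR, so the two proportions are 0.5 and 0.75.
example_mpec = mpec(
    {"inpatient": 1, "outpatient": 3},
    {"inpatient": 2, "outpatient": 4},
)

# Hypothetical predicted continuity scores for five patients.
example_cohort = top_fraction(
    {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.3, "e": 0.2}
)
```

In a study following this design, the ranking would use the model's predicted MPEC rather than the observed value, since observed MPEC requires linked claims that are typically unavailable outside the training and validation data.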

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Comorbidity
  • Comparative Effectiveness Research / statistics & numerical data*
  • Electronic Health Records / statistics & numerical data*
  • Humans