Data electronically extracted from the electronic health record require validation

Lisa M Scheid; L Steven Brown; Christopher Clark; Charles R Rosenfeld

doi:10.1038/s41372-018-0311-8

J Perinatol. 2019 Mar;39(3):468-474. doi: 10.1038/s41372-018-0311-8. Epub 2019 Jan 24.

Authors

Lisa M Scheid¹ ², L Steven Brown³, Christopher Clark³, Charles R Rosenfeld⁴

Affiliations

¹ Department of Pediatrics, University of Texas Health Science Center at Houston, Houston, TX, USA.
² Department of Pediatrics, Division of Neonatal-Perinatal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA.
³ Parkland Health and Hospital System, Dallas, TX, USA.
⁴ Department of Pediatrics, Division of Neonatal-Perinatal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, USA. charles.rosenfeld@utsouthwestern.edu.

PMID: 30679823
DOI: 10.1038/s41372-018-0311-8

Abstract

Objectives: Determine sources of error in electronically extracted data from electronic health records.

Study design: Categorical and continuous variables related to early-onset neonatal hypoglycemia were preselected and electronically extracted from records of 100 randomly selected neonates within 3479 births with laboratory-proven early-onset hypoglycemia. The extraction language was written by an information technologist, and the extracted data were validated by blinded manual chart review. Agreement was assessed with the Kappa coefficient for categorical variables and with percent validity for continuous variables.
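The two agreement measures named in the study design can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that Kappa means Cohen's kappa between the electronically extracted value and the manual chart review value, and that "percent validity" means the share of extracted values that match chart review; the abstract does not define either term in detail. The function names `cohen_kappa` and `percent_validity`, the `tolerance` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def cohen_kappa(extracted, manual):
    """Cohen's kappa between two equally long lists of categorical labels."""
    if len(extracted) != len(manual) or not extracted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(extracted)
    # Observed agreement: fraction of records where both sources agree.
    observed = sum(a == b for a, b in zip(extracted, manual)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    freq_e, freq_m = Counter(extracted), Counter(manual)
    labels = freq_e.keys() | freq_m.keys()
    expected = sum(freq_e[k] * freq_m[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Both sources use a single identical label; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def percent_validity(extracted, manual, tolerance=0.0):
    """Percent of extracted numeric values within `tolerance` of chart review.

    Assumption: this definition is illustrative, not taken from the paper.
    """
    if len(extracted) != len(manual) or not extracted:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(abs(a - b) <= tolerance for a, b in zip(extracted, manual))
    return 100.0 * matches / len(extracted)


# Hypothetical data: 1 = condition coded present, 0 = absent.
coded_extracted = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
coded_manual = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohen_kappa(coded_extracted, coded_manual)

# Hypothetical birth weights in grams from each source.
weight_extracted = [3200, 2950, 4100, 3500]
weight_manual = [3200, 2950, 4010, 3500]
validity = percent_validity(weight_extracted, weight_manual)
```

Kappa discounts agreement expected by chance, so a variable that is rarely coded as present can show high raw agreement and still have a low Kappa.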

Results: 8/23 (35%) categorical variables had acceptable Kappa (0.81-1.00); 5/23 (22%) had fair-to-slight agreement (Kappa < 0.40). Notably, "hypoglycemia" had poor agreement (Kappa 0.16). In contrast, 6/8 continuous variables had validity ≥ 94%. After the extraction language was corrected, 6/9 variables were corrected and inter-rater validation improved. However, "hypoglycemia" was not corrected and remained an issue.

Conclusions: Data extraction without validation procedures, especially categorical variables using International Classification of Diseases-9 (ICD-9) codes, often results in incorrect data identification. Electronically extracted data must incorporate built-in validating processes.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Data Accuracy*
Electronic Health Records*
Humans
Infant, Newborn
Information Storage and Retrieval / methods*
International Classification of Diseases
Retrospective Studies