Design and validation of a FHIR-based EHR-driven phenotyping toolbox

Pascal S Brandt; Jennifer A Pacheco; Prakash Adekkanattu; Evan T Sholle; Sajjad Abedian; Daniel J Stone; David M Knaack; Jie Xu; Zhenxing Xu; Yifan Peng; Natalie C Benda; Fei Wang; Yuan Luo; Guoqian Jiang; Jyotishman Pathak; Luke V Rasmussen

doi:10.1093/jamia/ocac063

J Am Med Inform Assoc. 2022 Aug 16;29(9):1449-1460. doi: 10.1093/jamia/ocac063.

Affiliations

¹ Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA.
² Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.
³ Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA.
⁴ Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.

Abstract

Objectives: To develop and validate a standards-based phenotyping tool to author electronic health record (EHR)-based phenotype definitions and demonstrate execution of the definitions against heterogeneous clinical research data platforms.

Materials and methods: We developed an open-source, standards-compliant phenotyping tool, the PhEMA Workbench, which represents phenotypes using the Fast Healthcare Interoperability Resources (FHIR) and Clinical Quality Language (CQL) standards. We then demonstrated how this tool can be used to conduct EHR-based phenotyping, including phenotype authoring, execution, and validation. We validated the performance of the tool by executing a thrombotic event phenotype definition at 3 sites (Mayo Clinic [MC], Northwestern Medicine [NM], and Weill Cornell Medicine [WCM]) and used manual review to determine precision and recall.
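
To make the idea of a code-based phenotype criterion concrete, the following is a minimal Python sketch of one common building block: checking whether a FHIR Condition resource carries a code from a value set. It is not the PhEMA Workbench implementation and does not execute CQL. The value set contents, the `http://example.org/codes` system URL, and the function names are hypothetical placeholders; only the `resourceType`, `subject.reference`, and `code.coding` fields follow the standard FHIR Condition structure.

```python
from typing import Iterable

# Hypothetical value set of (system, code) pairs. These are placeholders,
# not codes from the published thrombotic event phenotype.
EXAMPLE_VALUE_SET = {
    ("http://example.org/codes", "EX-001"),
    ("http://example.org/codes", "EX-002"),
}


def condition_in_value_set(condition: dict, value_set: set) -> bool:
    """Return True if any coding on the Condition appears in the value set."""
    codings = condition.get("code", {}).get("coding", [])
    return any((c.get("system"), c.get("code")) in value_set for c in codings)


def patients_matching(conditions: Iterable[dict], value_set: set) -> set:
    """Collect patient references that have at least one matching Condition."""
    return {
        c["subject"]["reference"]
        for c in conditions
        if c.get("resourceType") == "Condition"
        and condition_in_value_set(c, value_set)
    }


# Illustrative input only: two made-up Condition resources.
conditions = [
    {
        "resourceType": "Condition",
        "subject": {"reference": "Patient/1"},
        "code": {"coding": [{"system": "http://example.org/codes", "code": "EX-001"}]},
    },
    {
        "resourceType": "Condition",
        "subject": {"reference": "Patient/2"},
        "code": {"coding": [{"system": "http://example.org/codes", "code": "OTHER"}]},
    },
]

cohort = patients_matching(conditions, EXAMPLE_VALUE_SET)
```

In this toy input, only the first Condition carries a code from the placeholder value set, so `cohort` contains `Patient/1` alone. A real phenotype definition would combine many such criteria with temporal and logical operators, which is the role CQL plays in the tool described here.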

Results: An initial version of the PhEMA Workbench has been released, which supports phenotype authoring, execution, and publishing to a shared phenotype definition repository. The resulting thrombotic event phenotype definition consisted of 11 CQL statements and 24 value sets containing a total of 834 codes. Technical validation showed satisfactory performance: NM and MC each had 100% precision and recall, and WCM had a precision of 95% and a recall of 84%.
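
For readers unfamiliar with the validation metrics, the sketch below shows how precision and recall can be computed by comparing the patients flagged by a phenotype against those confirmed by manual review. The function name and the patient identifiers are hypothetical, and the sample data are made up for illustration; they are not the study's data.

```python
def precision_recall(flagged: set, confirmed: set) -> tuple:
    """Compute (precision, recall) for a phenotype against a manual-review gold standard.

    flagged:   patient IDs the phenotype definition identified as cases
    confirmed: patient IDs that manual review confirmed as true cases
    """
    true_positives = len(flagged & confirmed)
    # Precision: fraction of flagged patients that are true cases.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: fraction of true cases that the phenotype flagged.
    recall = true_positives / len(confirmed) if confirmed else 0.0
    return precision, recall


# Made-up example: 4 flagged patients, 4 confirmed cases, 3 in common.
flagged = {"p1", "p2", "p3", "p4"}
confirmed = {"p1", "p2", "p3", "p5"}
precision, recall = precision_recall(flagged, confirmed)
```

With this toy input, three of the four flagged patients are true cases and three of the four true cases were flagged, so both metrics equal 0.75.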

Conclusions: We demonstrate that the PhEMA Workbench can facilitate EHR-driven phenotype definition, execution, and phenotype sharing in heterogeneous clinical research data environments. A phenotype definition that integrates with existing standards-compliant systems, together with the use of a formal representation, facilitates automation and can decrease the potential for human error.

Keywords: CQL; EHR-driven phenotyping; FHIR; cohort identification.

Publication types

Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural

MeSH terms

Electronic Health Records*
Humans
Language
Phenotype
Polyhydroxyethyl Methacrylate*

Substances

Polyhydroxyethyl Methacrylate