Developing a common data model approach for DISCOVER CKD: A retrospective, global cohort of real-world patients with chronic kidney disease

Supriya Kumar; Matthew Arnold; Glen James; Rema Padman

doi:10.1371/journal.pone.0274131


PLoS One. 2022 Sep 29;17(9):e0274131. doi: 10.1371/journal.pone.0274131. eCollection 2022.

Authors

Supriya Kumar¹, Matthew Arnold², Glen James³, Rema Padman⁴

Affiliations

¹ Real World Evidence Data and Analytics, BioPharmaceuticals Medical, AstraZeneca, Gaithersburg, MD, United States of America.
² Real World Evidence Data and Analytics, BioPharmaceuticals Medical, AstraZeneca, Cambridge, United Kingdom.
³ Formerly Cardiovascular, Renal, Metabolism & Epidemiology, BioPharmaceuticals Medical, AstraZeneca, Cambridge, United Kingdom.
⁴ Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, PA, United States of America.

Abstract

Objectives: To describe a flexible common data model (CDM) approach that can be efficiently tailored to study-specific needs to facilitate pooled patient-level analysis and aggregated/meta-analysis of routinely collected retrospective patient data from disparate data sources; and to detail the application of this CDM approach to the DISCOVER CKD retrospective cohort, a longitudinal database of routinely collected (secondary) patient data of individuals with chronic kidney disease (CKD).

Methods: The flexible CDM approach incorporated three independent, exchangeable components that preceded data mapping and data model implementation: (1) standardized code lists (unifying medical events from different coding systems); (2) laboratory unit harmonization tables; and (3) base cohort definitions. Events were not mapped code-to-code between coding vocabularies; instead, for each data source, code lists of labels were curated at the entity/event level. A study team of epidemiologists, clinicians, informaticists, and data scientists was involved in validating each component.
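As an illustration only, the three components described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the source names, the local code `LOCAL-CKD3`, the entity label `ckd_stage_3`, the function names, and the cohort rule are all hypothetical. The only real-world values used are the ICD-10 code N18.3 (chronic kidney disease, stage 3) and the standard creatinine conversion of 1 mg/dL = 88.4 µmol/L.

```python
# Component 1 (hypothetical): per-source code lists curated at the entity/event
# level. Codes are never mapped to each other; each source simply labels its
# own codes with a shared entity name.
CODE_LISTS = {
    "source_a": {"ckd_stage_3": {"N18.3"}},       # ICD-10 coded source
    "source_b": {"ckd_stage_3": {"LOCAL-CKD3"}},  # placeholder local vocabulary
}

# Component 2 (hypothetical): unit harmonization table giving the factor that
# converts each (analyte, unit) pair to a single target unit (umol/L here).
UNIT_FACTORS = {
    ("creatinine", "umol/L"): 1.0,
    ("creatinine", "mg/dL"): 88.4,
}


def label_event(source, code):
    """Return the shared entity label for a source-specific code, or None."""
    for entity, codes in CODE_LISTS[source].items():
        if code in codes:
            return entity
    return None


def harmonize_lab(analyte, value, unit):
    """Convert a laboratory value to the target unit for that analyte."""
    return value * UNIT_FACTORS[(analyte, unit)]


def in_base_cohort(source, patient_codes):
    """Component 3 (assumed rule): include a patient who has at least one
    event labeled 'ckd_stage_3' in their source's code list."""
    return any(label_event(source, c) == "ckd_stage_3" for c in patient_codes)
```

Because the components are independent, a study could in principle swap in a different code list or cohort rule without touching the unit table, which is the kind of exchangeability the approach describes.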

Results: Applying the CDM to the DISCOVER CKD retrospective cohort, secondary data from 1,857,593 patients with CKD were harmonized from five data sources, across three countries, into a discrete database for rapid real-world evidence generation.

Conclusions: This flexible CDM approach facilitates evidence generation from real-world data within the DISCOVER CKD retrospective cohort, providing novel insights into the epidemiology of CKD that may expedite improvements in diagnosis, prognosis, early intervention, and disease management. The adaptable architecture of this CDM approach ensures scalable, fast, and efficient application within other therapy areas to facilitate the combined analysis of different types of secondary data from multiple, heterogeneous sources.

MeSH terms

Cohort Studies
Databases, Factual
Disease Management
Humans
Renal Insufficiency, Chronic* / diagnosis
Renal Insufficiency, Chronic* / epidemiology
Retrospective Studies

Grants and funding

This manuscript, including medical writing and editorial support, was funded by AstraZeneca. The sponsor was involved in the study design and collection, analysis and interpretation of data, as well as data checking of information provided in the manuscript. SK is an employee and stockholder of AstraZeneca. MA is an employee of AstraZeneca. GJ was an employee of AstraZeneca at the time of the study. Ultimate responsibility for opinions, conclusions and data interpretation lies with the authors.