Comparative effectiveness research (CER) provides evidence on the relative effectiveness and risks of different treatment options and informs decisions made by healthcare providers, payers, and pharmaceutical companies. CER data come from retrospective analyses as well as prospective clinical trials. Here, we describe the development of a text-mining pipeline, based on natural language processing (NLP), that extracts key information from three trial data sources: NIH ClinicalTrials.gov, the WHO International Clinical Trials Registry Platform (ICTRP), and Citeline Trialtrove. The pipeline uses tailored terminologies to produce an integrated, structured output that captures any trial in which a pharmaceutical product of interest is compared with another therapy. The timely alerts generated by this system provide the earliest and most complete picture of emerging clinical research.
Copyright © 2016 Elsevier Ltd. All rights reserved.