From narrative descriptions to MedDRA: automagically encoding adverse drug reactions

Carlo Combi; Margherita Zorzi; Gabriele Pozzani; Ugo Moretti; Elena Arzenton

doi:10.1016/j.jbi.2018.07.001

J Biomed Inform. 2018 Aug:84:184-199. doi: 10.1016/j.jbi.2018.07.001. Epub 2018 Jul 4.

Authors

Carlo Combi¹, Margherita Zorzi², Gabriele Pozzani³, Ugo Moretti⁴, Elena Arzenton⁵

Affiliations

¹ Department of Computer Science, University of Verona, Italy. Electronic address: carlo.combi@univr.it.
² Department of Computer Science, University of Verona, Italy. Electronic address: margherita.zorzi@univr.it.
³ Department of Diagnostics and Public Health, University of Verona, Italy. Electronic address: gabriele.pozzani@univr.it.
⁴ Department of Diagnostics and Public Health, University of Verona, Italy. Electronic address: ugo.moretti@univr.it.
⁵ Department of Diagnostics and Public Health, University of Verona, Italy. Electronic address: elena.arzenton@univr.it.

PMID: 29981491

Abstract

Context: The collection of narrative spontaneous reports is an irreplaceable source for the prompt detection of suspected adverse drug reactions (ADRs). In this task, qualified domain experts manually review a huge number of narrative descriptions and then encode the texts according to the MedDRA standard terminology. The manual annotation of narrative documents with medical terminology is a subtle and expensive task, and increasingly so as the number of reports grows day by day.

Objectives: Natural Language Processing (NLP) applications can support the work of people responsible for pharmacovigilance. Our objective is to develop NLP algorithms and tools for the detection of ADR clinical terminology. Efficient applications can concretely improve the quality of the experts' reviews. NLP software can quickly analyze narrative texts and offer an encoding (i.e., a list of MedDRA terms) that the expert has to review and validate.

Methods: MagiCoder, an NLP algorithm, is proposed for the automatic encoding of free-text descriptions into MedDRA terms. The MagiCoder procedure is efficient in terms of computational complexity. We tested MagiCoder through several experiments. In the first one, we tested it on a large dataset of about 4500 manually reviewed reports, by performing an automated comparison between the human and MagiCoder encodings. Moreover, we tested MagiCoder on a set of about 1800 reports, manually reviewed from scratch by domain experts, who also compared the automatic solutions with the gold reference standard. We also provide two initial experiments with reports written in English, giving initial evidence of the robustness of MagiCoder with respect to a change of language.
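
To make the task concrete, the following is a minimal sketch of what encoding a free-text description into dictionary terms can look like. It is not the MagiCoder procedure, which the abstract does not detail; it is a naive bag-of-words lookup. The term codes (`T001`, ...), the three-entry dictionary `TOY_TERMS`, and the function names `tokenize` and `encode` are all hypothetical and are not real MedDRA content.

```python
import re
from typing import Dict, List

# Hypothetical toy dictionary: codes and terms are illustrative only,
# NOT real MedDRA codes or an excerpt of the MedDRA terminology.
TOY_TERMS: Dict[str, str] = {
    "T001": "headache",
    "T002": "nausea",
    "T003": "skin rash",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def encode(text: str, terms: Dict[str, str]) -> List[str]:
    """Return the sorted codes of all terms whose words all occur in the text.

    Assumption: word order and adjacency are ignored, so "rash on the skin"
    matches the term "skin rash". A real system would need far more care
    (stemming, synonyms, negation, ranking of overlapping terms).
    """
    words = set(tokenize(text))
    matches = [
        code
        for code, term in terms.items()
        if all(word in words for word in tokenize(term))
    ]
    return sorted(matches)


proposed = encode("Patient reported severe headache and a rash on the skin.", TOY_TERMS)
print(proposed)
```

In the workflow the abstract describes, a list like `proposed` would be a suggestion for the expert to review and validate, not a final encoding.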

Results: For the current base version of MagiCoder, we measured an average recall and precision of 86.9% and 91.8%, respectively.
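
As an illustration of how average recall and precision can be computed when comparing an automatic encoding with a reference encoding, here is a minimal sketch. It assumes per-report averaging over sets of term codes and a convention for empty sets; the abstract does not state which averaging scheme the authors used, and the function names and sample data below are hypothetical.

```python
from typing import List, Set, Tuple


def precision_recall(gold: Set[str], predicted: Set[str]) -> Tuple[float, float]:
    """Precision and recall of a predicted term set against a gold term set."""
    true_positives = len(gold & predicted)
    # Assumed convention: an empty denominator yields 1.0.
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


def average_precision_recall(
    reports: List[Tuple[Set[str], Set[str]]]
) -> Tuple[float, float]:
    """Average per-report precision and recall over (gold, predicted) pairs."""
    if not reports:
        raise ValueError("at least one report is required")
    scores = [precision_recall(gold, predicted) for gold, predicted in reports]
    avg_precision = sum(p for p, _ in scores) / len(scores)
    avg_recall = sum(r for _, r in scores) / len(scores)
    return avg_precision, avg_recall


# Made-up example data: each pair is (expert encoding, automatic encoding).
sample_reports = [
    ({"headache", "nausea"}, {"headache"}),
    ({"rash"}, {"rash", "fever"}),
]
print(average_precision_recall(sample_reports))  # (0.75, 0.75)
```

In the first made-up report the automatic encoding misses one term (precision 1.0, recall 0.5); in the second it proposes one spurious term (precision 0.5, recall 1.0), so both averages are 0.75.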

Conclusions: From a practical point of view, MagiCoder reduces the time required for encoding ADR reports. Pharmacologists only have to review and validate the MedDRA terms proposed by the application, instead of choosing the right terms among the 70K low-level terms of MedDRA. This improvement in the efficiency of pharmacologists' work also has a relevant impact on the quality of the subsequent data analysis. We developed MagiCoder for the Italian pharmacovigilance language. However, our proposal is based on a general approach that depends on neither the language considered nor the term dictionary.

Keywords: Adverse drug reactions; Healthcare informatics; Natural language processing; Pharmacovigilance; Term identification.

MeSH terms

Adverse Drug Reaction Reporting Systems*
Algorithms
Computer Systems
Data Mining / methods*
Decision Support Systems, Clinical
Drug-Related Side Effects and Adverse Reactions
False Positive Reactions
Humans
Italy
Language
Narration
Natural Language Processing
Pattern Recognition, Automated
Pharmacovigilance*
Reproducibility of Results
Software