Extracting Sexual Trauma Mentions from Electronic Medical Notes Using Natural Language Processing

Stud Health Technol Inform. 2017:245:351-355.

Abstract

A patient's history of sexual trauma is clinically relevant to healthcare providers because survivors face adverse health-related outcomes. This paper describes a method for identifying mentions of sexual trauma within the free text of electronic medical notes. An information extraction pipeline based on natural language processing was developed and scaled to handle the large corpus of electronic medical notes from US Veterans Health Administration medical facilities used in this study. Using a domain-specific lexicon developed for this purpose, the tool identified sexual trauma mentions and created a snippet around every asserted mention. All snippets were evaluated by trained human reviewers. An overall positive predictive value (PPV) of 0.90 for identifying sexual trauma mentions in the free text and a PPV of 0.71 at the patient level are reported. The metrics are higher for records from female patients.
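The lexicon-and-snippet approach and the two PPV levels described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the lexicon terms, the negation cues, the character windows, and the function names are all hypothetical, and the rule that a patient counts as a true positive when at least one of their snippets is confirmed by a reviewer is an assumption made only for illustration.

```python
import re

# Hypothetical lexicon and negation cues; the paper's actual lexicon is not given here.
LEXICON = ["military sexual trauma", "sexual trauma", "sexual assault", "rape"]
NEGATION_CUES = ["denies", "no history of", "negative for"]


def build_pattern(terms):
    """Compile one case-insensitive pattern, trying longer terms first."""
    ordered = sorted(terms, key=len, reverse=True)
    body = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


def extract_snippets(note, pattern, window=40, negation_scope=30):
    """Return a snippet for each lexicon match that is not preceded by a negation cue."""
    snippets = []
    for m in pattern.finditer(note):
        preceding = note[max(0, m.start() - negation_scope):m.start()].lower()
        if any(cue in preceding for cue in NEGATION_CUES):
            continue  # treat as a negated (non-asserted) mention
        start = max(0, m.start() - window)
        end = min(len(note), m.end() + window)
        snippets.append({"term": m.group(0).lower(), "snippet": note[start:end]})
    return snippets


def ppv(true_pos, false_pos):
    """Positive predictive value: TP / (TP + FP)."""
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")


def mention_and_patient_ppv(reviews):
    """reviews: list of (patient_id, reviewer_confirmed) pairs, one per snippet.

    Assumption: a patient is a true positive if any of their snippets is confirmed.
    """
    tp = sum(1 for _, ok in reviews if ok)
    fp = len(reviews) - tp
    by_patient = {}
    for pid, ok in reviews:
        by_patient[pid] = by_patient.get(pid, False) or ok
    p_tp = sum(1 for ok in by_patient.values() if ok)
    p_fp = len(by_patient) - p_tp
    return ppv(tp, fp), ppv(p_tp, p_fp)


note = "Patient reports history of sexual assault in 2004. Denies rape."
snips = extract_snippets(note, build_pattern(LEXICON))
reviews = [("A", True), ("A", True), ("A", True), ("B", False)]
mention_ppv, patient_ppv = mention_and_patient_ppv(reviews)
```

In this toy data the three confirmed snippets all belong to one patient, so the mention-level PPV (0.75) exceeds the patient-level PPV (0.5). That is one way the two levels can diverge; it is not a claim about the cause of the difference reported in the study.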

Keywords: Information Retrieval; Natural Language Processing; Trauma and Stressor Related Disorders.

MeSH terms

  • Electronic Health Records*
  • Female
  • Humans
  • Information Storage and Retrieval
  • Natural Language Processing*