Screening pregnant women for suicidal behavior in electronic medical records: diagnostic codes vs. clinical notes processed by natural language processing

Qiu-Yue Zhong; Elizabeth W Karlson; Bizu Gelaye; Sean Finan; Paul Avillach; Jordan W Smoller; Tianxi Cai; Michelle A Williams

doi:10.1186/s12911-018-0617-7


BMC Med Inform Decis Mak. 2018 May 29;18(1):30. doi: 10.1186/s12911-018-0617-7.

Authors

Qiu-Yue Zhong¹, Elizabeth W Karlson², Bizu Gelaye³, Sean Finan⁴, Paul Avillach³,⁴,⁵, Jordan W Smoller³,⁶, Tianxi Cai⁷, Michelle A Williams³

Affiliations

¹ Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA. qyzhong@mail.harvard.edu.
² Department of Medicine, Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
³ Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
⁴ Children's Hospital Informatics Program, Boston Children's Hospital, Boston, MA, USA.
⁵ Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
⁶ Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
⁷ Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Abstract

Background: We compared the performance of structured diagnostic codes vs. natural language processing (NLP) of unstructured text for screening for suicidal behavior among pregnant women in electronic medical records (EMRs).

Methods: Women aged 10-64 years with at least one diagnostic code related to pregnancy or delivery (N = 275,843) from Partners HealthCare were included as our "datamart." Diagnostic codes related to suicidal behavior were applied to the datamart to screen women for suicidal behavior. Among women without any diagnostic codes related to suicidal behavior (n = 273,410), 5880 women were randomly sampled, of whom 1120 had at least one mention of terms related to suicidal behavior in clinical notes. NLP was then used to process clinical notes for the 1120 women. Chart reviews were performed for subsamples of women.

Results: Using diagnostic codes, 196 pregnant women were screened positive for suicidal behavior, among whom 149 (76%) had confirmed suicidal behavior by chart review. Using NLP among those without diagnostic codes, 486 pregnant women were screened positive for suicidal behavior, among whom 146 (30%) had confirmed suicidal behavior by chart review.
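The percentages reported in the Results can be reproduced from the counts given there. The sketch below is illustrative only: the function name `confirmed_fraction` is our own, and it assumes each percentage is the share of screen-positive women whose suicidal behavior was confirmed by chart review (a quantity often called the positive predictive value, a label the abstract itself does not use).

```python
def confirmed_fraction(confirmed: int, screened_positive: int) -> float:
    """Share of screen-positive women confirmed by chart review."""
    if screened_positive <= 0:
        raise ValueError("screened_positive must be a positive count")
    if not 0 <= confirmed <= screened_positive:
        raise ValueError("confirmed must lie between 0 and screened_positive")
    return confirmed / screened_positive


# Counts taken directly from the Results paragraph.
codes_fraction = confirmed_fraction(confirmed=149, screened_positive=196)
nlp_fraction = confirmed_fraction(confirmed=146, screened_positive=486)

print(f"Diagnostic codes: {codes_fraction:.0%}")  # 76%
print(f"NLP (no codes):   {nlp_fraction:.0%}")    # 30%
```

Both routes confirmed a similar number of women (149 vs. 146), but the NLP route required reviewing many more screen-positive charts to find them.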

Conclusions: The use of NLP substantially improves the sensitivity of screening for suicidal behavior in EMRs. However, the prevalence of confirmed suicidal behavior was lower among women who did not have diagnostic codes for suicidal behavior but screened positive by NLP. NLP should be used together with diagnostic codes in future EMR-based phenotyping studies of suicidal behavior.

Keywords: Clinical notes; Diagnostic codes; Electronic medical records; Natural language processing; Pregnancy; Screening; Suicidal behavior.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Adolescent
Adult
Child
Electronic Health Records / statistics & numerical data*
Female
Humans
Massachusetts / epidemiology
Middle Aged
Natural Language Processing*
Pregnancy
Pregnancy Complications / diagnosis*
Registries / statistics & numerical data*
Suicide, Attempted*
Young Adult
