Using natural language processing to identify the status of homelessness and housing instability among serious illness patients from clinical notes in an integrated healthcare system

JAMIA Open. 2023 Sep 22;6(3):ooad082. doi: 10.1093/jamiaopen/ooad082. eCollection 2023 Oct.

Abstract

Background: Efficiently identifying the social risks of patients with serious illnesses (SIs) is the critical first step in providing patient-centered and value-driven care for this medically vulnerable population.

Objective: To apply an existing natural language processing (NLP) algorithm that identifies patients who are homeless/at risk of homelessness to a SI population, and to further refine it for that population.

Methods: Patients diagnosed with SI between 2019 and 2020 were identified from the Kaiser Permanente Southern California electronic health record using an adapted list of diagnosis codes from the Center to Advance Palliative Care. Clinical notes associated with medical encounters within 6 months before and after the diagnosis date were processed by a previously developed NLP algorithm to identify patients who were homeless/at risk of homelessness. To improve generalizability to the SI population, the algorithm was refined through multiple iterations of chart review and adjudication. The updated algorithm was then applied to the SI population.
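The note-selection window described in the Methods can be illustrated with a minimal Python sketch. It is not the study's implementation: the helper names `shift_months` and `in_window` are hypothetical, and treating the window as whole calendar months with inclusive boundaries is an assumption, since the abstract does not say how "6 months" was computed.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months (assumed convention),
    clamping the day to the last day of the target month."""
    idx = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(idx, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def in_window(note_date: date, diagnosis_date: date, months: int = 6) -> bool:
    """True if the note falls within `months` before or after the diagnosis
    date. Inclusive boundaries are an assumption of this sketch."""
    start = shift_months(diagnosis_date, -months)
    end = shift_months(diagnosis_date, months)
    return start <= note_date <= end


# Hypothetical dates, for illustration only.
diagnosis = date(2019, 8, 31)
notes = [date(2019, 3, 1), date(2020, 3, 1)]
eligible = [n for n in notes if in_window(n, diagnosis)]
```

Only notes passing such a filter would be handed to the NLP algorithm; a day-count window (for example, 183 days) would be an equally plausible reading of the Methods.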

Results: Among 206 993 patients with a SI diagnosis, 1737 (0.84%) were identified as homeless/at risk of homelessness. These patients were more likely to be male (51.1%), aged 45-64 years (44.7%), and to have one or more emergency visits (65.8%) within a year of their diagnosis date. Validation of the updated algorithm yielded a sensitivity of 100.0% and a positive predictive value of 93.8%.
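For readers less familiar with the validation metrics, the following Python sketch shows how sensitivity and positive predictive value are conventionally computed from chart-review counts, and checks the reported proportion of identified patients. The function names and the true-positive, false-positive, and false-negative counts are hypothetical; the abstract does not report the validation sample's counts.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly homeless/at-risk patients that the algorithm flagged."""
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Share of algorithm-flagged patients confirmed by chart review."""
    return tp / (tp + fp)


# Reported in the abstract: 1737 of 206 993 SI patients were identified.
identified_pct = 100 * 1737 / 206993

# Hypothetical chart-review counts, NOT taken from the study.
tp, fp, fn = 90, 10, 5
example_sensitivity = sensitivity(tp, fn)
example_ppv = positive_predictive_value(tp, fp)
```

A sensitivity of 100.0% corresponds to zero false negatives in the validation sample, while a positive predictive value of 93.8% means that roughly 6 in 100 flagged patients were not confirmed on review.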

Conclusions: The improved NLP algorithm effectively identified patients with SI who were homeless/at risk of homelessness and can be used to target interventions for this vulnerable group.

Keywords: electronic health record; homeless; natural language processing; residential instability; serious illness.