Detection of Gastrointestinal Bleeding With Large Language Models to Aid Quality Improvement and Appropriate Reimbursement

Gastroenterology. 2025 Jan;168(1):111-120.e4. doi: 10.1053/j.gastro.2024.09.014. Epub 2024 Sep 18.

Abstract

Background & aims: Early identification and accurate characterization of overt gastrointestinal bleeding (GIB) create opportunities to optimize patient management and ensure appropriately risk-adjusted coding for claims-based quality measures and reimbursement. Recent advances in generative artificial intelligence, particularly large language models (LLMs), create opportunities to support accurate identification of clinical conditions. In this study, we present the first LLM-based pipeline for identification of overt GIB in the electronic health record (EHR). We demonstrate 2 clinically relevant applications: automated detection of recurrent bleeding and appropriate reimbursement coding for patients with GIB.

Methods: The LLM-based pipeline was developed on 17,712 nursing notes from 1108 patients who were hospitalized with acute GIB and underwent endoscopy in the hospital from 2014 to 2023. The pipeline was used to train an EHR-based machine learning model for detection of recurrent bleeding on 546 patients presenting to 2 hospitals, and the model was externally validated on 562 patients presenting to 4 different hospitals. The pipeline was also used to develop an algorithm for appropriate reimbursement coding on 7956 patients who underwent endoscopy in the hospital from 2019 to 2023.

Results: The LLM-based pipeline accurately detected melena (positive predictive value, 0.972; sensitivity, 0.900), hematochezia (positive predictive value, 0.900; sensitivity, 0.908), and hematemesis (positive predictive value, 0.859; sensitivity, 0.932). The EHR-based machine learning model identified recurrent bleeding with an area under the curve of 0.986, a sensitivity of 98.4%, and a specificity of 97.5%. The reimbursement coding algorithm yielded an average per-patient reimbursement increase of $1299 to $3247, for a total difference of $697,460 to $1,743,649.
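
For readers less familiar with the metrics reported above, the following minimal Python sketch shows how positive predictive value, sensitivity, and specificity are computed from confusion-matrix counts. The class name, function names, and the counts themselves are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of labeled notes (or patients) against a reference standard."""
    tp: int  # bleeding present, flagged by the pipeline
    fp: int  # bleeding absent, flagged by the pipeline
    fn: int  # bleeding present, missed by the pipeline
    tn: int  # bleeding absent, not flagged


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of flagged cases that truly have the finding."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cases that the pipeline flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-cases that the pipeline leaves unflagged."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts for illustration only (not taken from the study).
counts = ConfusionCounts(tp=90, fp=10, fn=10, tn=890)

print(f"PPV:         {positive_predictive_value(counts):.3f}")
print(f"Sensitivity: {sensitivity(counts):.3f}")
print(f"Specificity: {specificity(counts):.3f}")
```

Positive predictive value depends on how common the finding is in the evaluated notes, whereas sensitivity and specificity do not, which is one reason the abstract reports both a predictive value and a sensitivity for each bleeding symptom.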

Conclusions: An LLM-based pipeline can robustly detect overt GIB in the EHR with clinically relevant applications in detection of recurrent bleeding and appropriate reimbursement coding.

Keywords: Acute Gastrointestinal Bleeding; Generative Artificial Intelligence; Large Language Models; Natural Language Processing; Quality Improvement.

MeSH terms

  • Aged
  • Algorithms
  • Electronic Health Records*
  • Endoscopy, Gastrointestinal / economics
  • Endoscopy, Gastrointestinal / standards
  • Female
  • Gastrointestinal Hemorrhage* / diagnosis
  • Gastrointestinal Hemorrhage* / economics
  • Gastrointestinal Hemorrhage* / etiology
  • Gastrointestinal Hemorrhage* / therapy
  • Humans
  • Insurance, Health, Reimbursement
  • Machine Learning
  • Male
  • Middle Aged
  • Natural Language Processing
  • Predictive Value of Tests
  • Quality Improvement* / economics
  • Recurrence
  • Reproducibility of Results