Predicting decompression surgery by applying multimodal deep learning to patients' structured and unstructured health data

BMC Med Inform Decis Mak. 2023 Jan 6;23(1):2. doi: 10.1186/s12911-022-02096-x.

Abstract

Background: Low back pain (LBP) is a common condition made up of a variety of anatomic and clinical subtypes. Lumbar disc herniation (LDH) and lumbar spinal stenosis (LSS) are two subtypes highly associated with LBP. Patients with LDH/LSS often start with non-surgical treatments and, if those are not effective, go on to have decompression surgery. However, recommending surgery is complicated because the outcome may depend on the patient's health characteristics. We developed a deep learning (DL) model to predict decompression surgery for patients with LDH/LSS.

Materials and method: We used datasets of 8387 and 8620 patients from a prospective study that collected data from four healthcare systems to predict early surgery (within 2 months) and late surgery (within 12 months after a 2-month gap), respectively. We developed a DL model that uses patients' demographics, diagnosis and procedure codes, drug names, and diagnostic imaging reports to predict surgery. For each prediction task, we evaluated the model's performance using classical and generalizability evaluation. For classical evaluation, we split the data into training (80%) and testing (20%) sets. For generalizability evaluation, we split the data based on the healthcare system. We used the area under the curve (AUC) to assess performance for each evaluation. We compared results to a benchmark model (i.e., LASSO logistic regression).
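
The two evaluation schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names (auc, classical_split, site_splits) are hypothetical, the random seed is arbitrary, and holding out one healthcare system at a time is only one plausible reading of "split the data based on the healthcare system". AUC is computed as the fraction of (positive, negative) patient pairs in which the positive patient receives the higher score, with ties counted as one half.

```python
import random
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve as a pairwise ranking statistic (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classical_split(n, test_frac=0.2, seed=0):
    """Random split of indices 0..n-1 into training and testing sets (80/20 by default)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    return idx[n_test:], idx[:n_test]


def site_splits(sites):
    """Yield (held_out_site, train_idx, test_idx), holding out one site at a time.

    Assumption: leave-one-site-out; the abstract does not specify the exact scheme.
    """
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for s, ix in by_site.items() if s != held_out for i in ix]
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    # Classical evaluation: 80/20 split sized like the early-surgery dataset.
    train_idx, test_idx = classical_split(8387)
    print(len(train_idx), len(test_idx))

    # Generalizability evaluation on a toy site assignment (hypothetical labels).
    for site, tr, te in site_splits(["A", "B", "A", "C", "B"]):
        print(site, sorted(tr), te)

    # AUC on a toy set of labels and predicted scores.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice each split would be used to fit both the DL model and the LASSO benchmark on the training indices, score the testing indices, and compare the resulting AUC values.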

Results: For classical performance, the DL model outperformed the benchmark model for early surgery with an AUC of 0.725 compared to 0.597. For late surgery, the DL model outperformed the benchmark model with an AUC of 0.655 compared to 0.635. For generalizability performance, the DL model outperformed the benchmark model for early surgery. For late surgery, the benchmark model outperformed the DL model.

Conclusions: For early surgery, the DL model was preferred under both classical and generalizability evaluation. However, for late surgery, the benchmark and DL models had comparable performance. Depending on the prediction task, the balance of performance may shift between DL and a conventional ML method. As a result, thorough assessment is needed to quantify the value of DL, a relatively computationally expensive, time-consuming and less interpretable method.

Keywords: Classification; Decompression surgery; Deep learning; Generalizability; Lower back pain; Lumbar disc herniation; Lumbar spinal stenosis; Machine learning; Multimodal; Prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Decompression, Surgical / adverse effects
  • Decompression, Surgical / methods
  • Deep Learning*
  • Humans
  • Intervertebral Disc Displacement* / surgery
  • Low Back Pain* / complications
  • Low Back Pain* / diagnosis
  • Low Back Pain* / surgery
  • Lumbar Vertebrae / surgery
  • Prospective Studies
  • Retrospective Studies
  • Spinal Stenosis* / surgery
  • Treatment Outcome