Fully automated deep learning for knee alignment assessment in lower extremity radiographs: a cross-sectional diagnostic study

Skeletal Radiol. 2022 Jun;51(6):1249-1259. doi: 10.1007/s00256-021-03948-9. Epub 2021 Nov 13.

Abstract

Objectives: Knee alignment and leg length discrepancy are currently measured manually from standing long-leg radiographs (LLR), a process that is both time-consuming and poorly reproducible. The aim was to assess the performance of a commercially available AI software by comparing its outputs with manually performed measurements.

Materials and methods: The AI was trained on over 15,000 radiographs to measure various clinical angles and lengths from LLRs. We performed a retrospective single-center analysis of 295 LLRs obtained between 2015 and 2020 from male and female patients older than 18 years. AI and expert measurements were performed independently. Kellgren-Lawrence score and reading time were assessed. All measurements were compared, and non-inferiority, mean absolute deviation (sMAD), and intraclass correlation coefficient (ICC) were calculated.
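As a rough illustration of the two agreement statistics named above, the following minimal Python sketch computes a plain mean absolute deviation between two sets of readings and a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). The abstract does not state which ICC form the authors used or how the "s" in sMAD is defined, so both choices here are assumptions; the function names and the sample values are hypothetical and are not study data.

```python
from statistics import mean


def mean_absolute_deviation(a, b):
    """Mean of |a_i - b_i| over paired measurements (e.g., AI vs. mean observer)."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(x - y) for x, y in zip(a, b))


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per radiograph, one column per rater.
    Assumed form only; the abstract does not specify the ICC variant.
    """
    n = len(ratings)          # number of subjects (radiographs)
    k = len(ratings[0])       # number of raters (e.g., AI and observer)
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical angle readings in degrees, for illustration only
    ai = [178.2, 181.5, 175.0, 183.1]
    observer = [178.0, 181.9, 175.6, 182.7]
    print(mean_absolute_deviation(ai, observer))
    print(icc_2_1([list(pair) for pair in zip(ai, observer)]))
```

Because ICC(2,1) measures absolute agreement, a constant offset between AI and observer lowers it even when the two sets of readings are perfectly correlated; a consistency-type ICC would ignore such an offset.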

Results: A total of 295 LLRs from 284 patients (mean age, 65 years (18; 90); 97 (34.2%) men) were analyzed. The AI model produced outputs on 98.0% of the LLRs. Manual annotations were considered 100% accurate. For each measurement, the divergence of the AI output from the manual measurement was calculated, resulting in an overall accuracy of 89.2%. AI vs. mean observer revealed an sMAD between 0.39 and 2.19° for angles and between 1.45 and 5.00 mm for lengths. AI showed good reliability for all lengths and angles (ICC ≥ 0.87). Non-inferiority testing of AI against the mean observer revealed an equivalence index (γ) between 0.54 and 3.03° for angles and between -0.70 and 1.95 mm for lengths. On average, AI was 130 s faster than clinicians.

Conclusion: Automated knee alignment and length measurements produced with an AI tool are reproducible and accurate, and they save time compared with manually acquired measurements.

Keywords: Artificial intelligence; Big data; Knee alignment; Long-leg radiographs; Standardization.

MeSH terms

  • Aged
  • Cross-Sectional Studies
  • Deep Learning*
  • Female
  • Humans
  • Lower Extremity
  • Male
  • Reproducibility of Results
  • Retrospective Studies