Computer-Aided Detection AI Reduces Interreader Variability in Grading Hip Abnormalities With MRI

Radhika Tibrewala; Eugene Ozhinsky; Rutwik Shah; Io Flament; Kay Crossley; Ramya Srinivasan; Richard Souza; Thomas M Link; Valentina Pedoia; Sharmila Majumdar

doi:10.1002/jmri.27164

J Magn Reson Imaging. 2020 Oct;52(4):1163-1172. doi: 10.1002/jmri.27164. Epub 2020 Apr 15.

Affiliations

¹ Department of Radiology and Biomedical Imaging, University of California, San Francisco, California, USA.
² La Trobe Sport and Exercise Medicine Research Centre, College of Science, Health and Engineering, La Trobe University, Melbourne, Victoria, Australia.
³ Department of Physical Therapy and Rehabilitation Science, University of California San Francisco, San Francisco, California, USA.

Abstract

Background: Accurate interpretation of hip MRI is time-intensive and difficult, prone to inter- and intrareviewer variability, and lacks a universally accepted grading scale to evaluate morphological abnormalities.

Purpose: To 1) develop and evaluate a deep-learning-based model for binary classification of hip osteoarthritis (OA) morphological abnormalities on MR images, and 2) develop an artificial intelligence (AI)-based assist tool to determine whether using the model predictions improves interreader agreement in hip grading.

Study type: Retrospective study aimed to evaluate a technical development.

Population: A total of 764 MRI volumes (364 patients) obtained from two studies (242 patients from LASEM [FORCe] and 122 patients from UCSF), split 65%/25%/10% into training, validation, and test sets for network training.
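As an illustration of the 65/25/10 split described above, the following is a minimal sketch, not the authors' procedure. It assumes a shuffled split at the volume level with a fixed seed; the function name `split_indices`, the seed, and the integer rounding rule are all hypothetical choices, since the abstract does not state how the split was drawn.

```python
import random


def split_indices(n_items, train_pct=65, val_pct=25, seed=0):
    """Shuffle item indices and cut them into train/validation/test parts.

    Assumption: items are split independently (per volume). The remaining
    percentage (100 - train_pct - val_pct) goes to the test set.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    train_end = n_items * train_pct // 100
    val_end = n_items * (train_pct + val_pct) // 100
    return indices[:train_end], indices[train_end:val_end], indices[val_end:]


# 764 is the number of MRI volumes reported in the abstract.
train, val, test = split_indices(764)
print(len(train), len(val), len(test))
```

The exact subset sizes depend on the rounding rule assumed here and are not reported in the abstract. When several volumes come from the same patient, splitting by patient rather than by volume avoids leakage between sets; the abstract does not say which approach was used.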

Field strength/sequence: 3T MRI, 2D T₂ FSE, PD SPAIR.

Assessment: Automatic binary classification of cartilage lesions, bone marrow edema-like lesions, and subchondral cyst-like lesions using the MRNet, interreader agreement before and after using network predictions.

Statistical tests: Receiver operating characteristic (ROC) curve, area under curve (AUC), specificity and sensitivity, and balanced accuracy.
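The statistics listed above follow standard definitions. The sketch below shows how they can be computed, assuming labels coded as 0/1 and both classes present in the data; the function names are hypothetical and this is not the authors' code. The AUC is computed with the pairwise (Mann-Whitney) formulation, which is one common way to obtain it.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return (sens + spec) / 2


def roc_auc(y_true, scores):
    """Probability that a positive case scores above a negative case.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.4, 0.7]
preds = [1 if s >= 0.5 else 0 for s in scores]
print(sensitivity_specificity(labels, preds))
print(balanced_accuracy(labels, preds))
print(roc_auc(labels, scores))
```

Balanced accuracy weights both classes equally, which makes it less sensitive to class imbalance than plain accuracy.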

Results: For cartilage lesions, bone marrow edema-like lesions, and subchondral cyst-like lesions, the AUCs were 0.80 (95% confidence interval [CI] 0.65, 0.95), 0.84 (95% CI 0.67, 1.00), and 0.77 (95% CI 0.66, 0.85), respectively. The sensitivity and specificity of the radiologist for binary classification were 0.79 (95% CI 0.65, 0.93) and 0.80 (95% CI 0.59, 1.02); 0.40 (95% CI -0.02, 0.83) and 0.72 (95% CI 0.59, 0.86); and 0.75 (95% CI 0.45, 1.05) and 0.88 (95% CI 0.77, 0.98), respectively. The interreader balanced accuracy increased from 53%, 71%, and 56% to 60%, 73%, and 68%, respectively, after using the network predictions and saliency maps.

Data conclusion: We have shown that a deep-learning approach achieved high performance in clinical classification tasks on hip MR images, and that using the predictions from the deep-learning model improved the interreader agreement in all pathologies.

Level of evidence: 3. Technical efficacy stage: 1. J. Magn. Reson. Imaging 2020;52:1163-1172.

Keywords: MRI; cartilage; deep learning; detection; hip abnormality; osteoarthritis.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Artificial Intelligence*
Computers
Humans
Image Interpretation, Computer-Assisted*
Magnetic Resonance Imaging
Reproducibility of Results
Retrospective Studies
