Performance of Deep Learning and Genitourinary Radiologists in Detection of Prostate Cancer Using 3-T Multiparametric Magnetic Resonance Imaging

J Magn Reson Imaging. 2021 Aug;54(2):474-483. doi: 10.1002/jmri.27595. Epub 2021 Mar 12.

Abstract

Background: Several deep learning-based techniques have been developed for prostate cancer (PCa) detection using multiparametric magnetic resonance imaging (mpMRI), but few of them have been rigorously evaluated relative to radiologists' performance or whole-mount histopathology (WMHP).

Purpose: To compare the performance of a previously proposed deep learning algorithm, FocalNet, and expert radiologists in the detection of PCa on mpMRI with WMHP as the reference.

Study type: Retrospective, single-center study.

Subjects: A total of 553 patients (development cohort: 427 patients; evaluation cohort: 126 patients) who underwent 3-T mpMRI prior to radical prostatectomy from October 2010 to February 2018.

Field strength/sequence: 3-T, T2-weighted imaging and diffusion-weighted imaging.

Assessment: FocalNet was trained on the development cohort to predict PCa locations by detection points, with a confidence value for each point, on the evaluation cohort. Four fellowship-trained genitourinary (GU) radiologists independently evaluated the evaluation cohort to detect suspicious PCa foci, annotate detection point locations, and assign a five-point suspicion score (1: least suspicious, 5: most suspicious) for each annotated detection point. The PCa detection performance of FocalNet and the radiologists was evaluated by the lesion detection sensitivity vs. the number of false-positive detections at different thresholds on suspicion scores. Clinically significant lesions were defined as Gleason Group (GG) ≥ 2 or pathological size ≥ 10 mm. Index lesions were defined as those with the highest GG, with the largest pathological size as the secondary criterion.
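The sensitivity vs. false-positive evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Detection` record, the function name, and the toy inputs are hypothetical, and it assumes each detection point has already been matched to a WMHP lesion (or to none), since the abstract does not state the matching rule. Reporting false positives as a per-patient average is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    """One detection point (hypothetical record layout)."""
    patient_id: str
    score: int                # suspicion score, 1 (least) to 5 (most)
    lesion_id: Optional[str]  # matched reference lesion, or None if false positive


def sensitivity_vs_false_positives(
    detections: Iterable[Detection],
    lesion_ids: Iterable[str],
    n_patients: int,
    thresholds: Tuple[int, ...] = (1, 2, 3, 4, 5),
) -> List[Tuple[int, float, float]]:
    """Return (threshold, sensitivity, false positives per patient) triples.

    A lesion counts as detected if at least one retained detection matches it.
    Detections matched to lesions outside `lesion_ids` are ignored here
    (counted as neither true nor false positives); this is an assumption.
    """
    detections = list(detections)
    targets = set(lesion_ids)
    curve = []
    for t in thresholds:
        kept = [d for d in detections if d.score >= t]
        detected = {d.lesion_id for d in kept if d.lesion_id in targets}
        false_pos = sum(1 for d in kept if d.lesion_id is None)
        curve.append((t, len(detected) / len(targets), false_pos / n_patients))
    return curve
```

Raising the threshold trades sensitivity for fewer false positives, which is why a reader and an algorithm are compared across the whole range of thresholds rather than at a single operating point.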

Statistical tests: Bootstrap hypothesis test for the detection sensitivity between radiologists and FocalNet.
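As an illustration of what a bootstrap hypothesis test for a sensitivity difference can look like, the Python sketch below resamples patients with replacement and compares the observed difference with a bootstrap distribution re-centered on zero. The abstract does not specify the resampling unit, the test statistic, or the number of replicates, so all of these choices, along with the function name and input layout, are assumptions rather than the study's procedure.

```python
import random
from typing import List, Sequence, Tuple

# Per-patient record (hypothetical layout):
# (number of reference lesions, lesions detected by reader A, lesions detected by reader B)
PatientCounts = Tuple[int, int, int]


def bootstrap_sensitivity_test(
    per_patient: Sequence[PatientCounts],
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed sensitivity difference A - B, two-sided p-value)."""
    rng = random.Random(seed)

    def sensitivity_diff(sample: Sequence[PatientCounts]) -> float:
        total = sum(n for n, _, _ in sample)
        if total == 0:
            return 0.0
        hits_a = sum(a for _, a, _ in sample)
        hits_b = sum(b for _, _, b in sample)
        return (hits_a - hits_b) / total

    observed = sensitivity_diff(per_patient)
    n = len(per_patient)

    # Resample whole patients so lesions from the same patient stay together.
    boot: List[float] = []
    for _ in range(n_boot):
        sample = [per_patient[rng.randrange(n)] for _ in range(n)]
        boot.append(sensitivity_diff(sample))

    # Shift the bootstrap distribution to have mean `observed` removed,
    # approximating the null of no difference, then count replicates
    # at least as extreme as the observed difference.
    extreme = sum(1 for d in boot if abs(d - observed) >= abs(observed))
    p_value = (extreme + 1) / (n_boot + 1)
    return observed, p_value
```

Resampling at the patient level keeps both readers' results paired on the same cases, which matters because FocalNet and the radiologists were evaluated on the same evaluation cohort.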

Results: For the overall differential detection sensitivity, FocalNet was 5.1% and 4.7% below the radiologists for clinically significant and index lesions, respectively; however, the differences were not statistically significant (P = 0.413 and P = 0.282, respectively).

Data conclusion: FocalNet achieved slightly lower PCa detection performance than GU radiologists, but the difference was not statistically significant. Compared with radiologists, FocalNet demonstrated similar detection performance in a highly sensitive setting (suspicion score ≥ 1) or a highly specific setting (suspicion score = 5), but lower performance in between.

Level of evidence: 3

Technical efficacy stage: 2

Keywords: automatic cancer detection; deep learning; multiparametric MRI; prostate cancer.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning*
  • Humans
  • Magnetic Resonance Imaging
  • Male
  • Multiparametric Magnetic Resonance Imaging*
  • Prostatic Neoplasms* / diagnostic imaging
  • Radiologists
  • Retrospective Studies