Background and purpose: Advances in artificial intelligence have stimulated a new generation of autosegmentation algorithms; however, clinical evaluations of these tools are lacking. This study assesses the clinical utility of deep learning-based autosegmentation for MR-based prostate radiotherapy planning.
Materials and methods: Data were collected prospectively for patients undergoing prostate-only radiation at our institution from June to December 2019. Geometric indices (volumetric Dice-Sørensen Coefficient, VDSC; surface Dice-Sørensen Coefficient, SDSC; added path length, APL) were used to compare automated contours with final contours. Physicians reported contouring time and rated autocontours on 3-point protocol deviation scales. Descriptive statistics and univariable analyses were used to evaluate relationships among these metrics.
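As an illustration of two of the geometric indices named above, the following minimal Python sketch computes a volumetric Dice-Sørensen coefficient and a simplified added path length on a toy 2D example. It is not the study's implementation: the function names (`volumetric_dice`, `added_path_length`, `edge`), the representation of contours as sets of pixel coordinates, the 4-connected edge definition, and the pixel-counting form of APL are all assumptions made for illustration. SDSC, which additionally requires a surface tolerance, is not shown.

```python
def volumetric_dice(auto, final):
    """Volumetric Dice-Sørensen coefficient between two sets of pixel/voxel coordinates."""
    if not auto and not final:
        return 1.0  # assumed convention: two empty contours agree perfectly
    return 2 * len(auto & final) / (len(auto) + len(final))


def edge(region):
    """Pixels of a region with at least one 4-connected neighbour outside it."""
    return {
        (x, y)
        for (x, y) in region
        if any(
            n not in region
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
        )
    }


def added_path_length(auto_edge, final_edge, spacing_mm=1.0):
    """Simplified APL: final-contour edge pixels absent from the autocontour edge,
    scaled by an assumed isotropic pixel spacing in millimetres."""
    return len(final_edge - auto_edge) * spacing_mm


# Toy example: the autocontour is a 4x4 square; the final contour is the
# same square shifted by one column.
auto = {(x, y) for x in range(4) for y in range(4)}
final = {(x, y) for x in range(1, 5) for y in range(4)}

vdsc = volumetric_dice(auto, final)
apl = added_path_length(edge(auto), edge(final))
print(vdsc)  # 0.75
print(apl)   # 6.0
```

In this toy case the two squares overlap in 12 of 16 pixels each, giving a VDSC of 0.75, while 6 edge pixels of the final contour would have to be drawn anew. The two indices capture different things: VDSC summarizes volume overlap, whereas APL approximates the amount of manual editing.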
Results: Among 173 patients, 85% received SBRT. The CTV was available for 167 (97%), with median VDSC, SDSC, and APL for the CTV (prostate and SV) of 0.89 (IQR 0.83-0.95), 0.91 (IQR 0.75-0.96), and 1801 mm (IQR 1140-2703), respectively. Physicians completed surveys for 43/55 patients (response rate, RR, 78%). Of the autocontours, 33% (14/43) required major "clinically significant" edits. Physicians spent a median of 28 min contouring (IQR 20-30), representing a 12-minute (30%) time savings compared to historic controls (median 40 min, IQR 25-68, n = 21, p < 0.01). Geometric indices correlated weakly with contouring time and had no relationship with quality scores.
Conclusion: Deep learning-based autosegmentation was implemented successfully and improved efficiency. Major "clinically significant" edits were required in a minority of cases and did not correlate with geometric indices. APL was supported as a clinically meaningful quantitative metric. Efforts are needed to educate physicians and generate consensus among them, and to develop mechanisms to flag cases for quality assurance.
Keywords: Deep learning; Program evaluation; Prostatic neoplasms; Radiation oncology; Radiologic technology.
Copyright © 2021 Elsevier B.V. All rights reserved.