Multi-vendor robustness analysis of a commercial artificial intelligence system for breast cancer detection

J Med Imaging (Bellingham). 2023 Sep;10(5):051807. doi: 10.1117/1.JMI.10.5.051807. Epub 2023 Apr 18.

Abstract

Purpose: Population-based screening programs for the early detection of breast cancer have significantly reduced mortality in women, but they are resource intensive in terms of time, cost, and workload. They also have limitations, mainly due to the use of 2D imaging techniques, which may cause tissue overlap, and to interobserver variability. Artificial intelligence (AI) systems may be a valuable tool to assist radiologists in reading mammograms and classifying them according to the malignancy of the detected lesions. However, several factors can influence the outcome of a mammogram and thus also the detection capability of an AI system. The aim of our work is to analyze the robustness of the diagnostic ability of an AI system designed for breast cancer detection.

Approach: Mammograms from a population-based screening program were scored with the AI system. Sensitivity and specificity, summarized by the area under the receiver operating characteristic (ROC) curve, were obtained as a function of the mammography unit manufacturer, demographic characteristics, and several factors that may affect the image quality (age, breast thickness and density, compression applied, beam quality, and delivered dose).
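As an illustration of this kind of stratified analysis, the following minimal Python sketch computes the AUC with a confidence interval separately for each subgroup (for example, each manufacturer). It is not the authors' code: the abstract does not state how the confidence intervals or significance tests were computed, so the percentile bootstrap used here is an assumption, and the function names (`auc`, `bootstrap_ci`, `auc_by_group`), the group labels, and the toy data are all hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive case outscores a
    random negative case (Mann-Whitney formulation; ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("need at least one positive and one negative case")
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < len(ys):
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def auc_by_group(scores, labels, groups):
    """Return {group: (auc, (ci_lower, ci_upper))} for each subgroup."""
    result = {}
    for g in sorted(set(groups)):
        s = [x for x, k in zip(scores, groups) if k == g]
        y = [x for x, k in zip(labels, groups) if k == g]
        result[g] = (auc(s, y), bootstrap_ci(s, y))
    return result


if __name__ == "__main__":
    # Synthetic toy data for illustration only; not study data.
    scores = [0.9, 0.8, 0.4, 0.2, 0.7, 0.6, 0.3, 0.1]
    labels = [1, 1, 0, 0, 1, 0, 1, 0]
    groups = ["vendor_A"] * 4 + ["vendor_B"] * 4  # hypothetical vendors
    for name, (value, (lo, hi)) in auc_by_group(scores, labels, groups).items():
        print(f"{name}: AUC = {value:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Overlapping subgroup intervals only suggest, and do not prove, the absence of a dependence; a formal comparison of AUCs would need a dedicated test, which this sketch does not implement.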

Results: The area under the ROC curve (AUC) of the AI scores was 0.92 (95% confidence interval = 0.89 to 0.95). It showed no dependence on any of the parameters considered, as the differences in AUC between the different value intervals of each parameter were not statistically significant.

Conclusion: The results suggest that the AI system analyzed in our work has a robust diagnostic capability and that its accuracy is independent of the studied parameters.

Keywords: area under the curve; artificial intelligence; breast cancer; mammography; receiver operating characteristic curve; screening.