Evaluating and enhancing the robustness of vision transformers against adversarial attacks in medical imaging

Med Biol Eng Comput. 2024 Oct 25. doi: 10.1007/s11517-024-03226-5. Online ahead of print.

Abstract

Deep neural networks (DNNs) have demonstrated exceptional performance in medical image analysis. However, recent studies have uncovered significant vulnerabilities in DNN models, particularly their susceptibility to adversarial attacks that manipulate these models into making inaccurate predictions. Vision Transformers (ViTs), despite their advanced capabilities in medical imaging tasks, have not been thoroughly evaluated for their robustness against such attacks in this domain. This study addresses this research gap by conducting an extensive analysis of various adversarial attacks on ViTs specifically within medical imaging contexts. We explore adversarial training as a potential defense mechanism and assess the resilience of ViT models under state-of-the-art adversarial attacks and defense strategies using publicly available benchmark medical image datasets. Our findings reveal that ViTs are vulnerable to adversarial attacks even with minimal perturbations, although adversarial training significantly enhances their robustness, achieving over 80% classification accuracy. Additionally, we perform a comparative analysis with state-of-the-art convolutional neural network models, highlighting the unique strengths and weaknesses of ViTs in handling adversarial threats. This research advances the understanding of ViT robustness in medical imaging and provides insights into the practical deployment of ViTs in real-world scenarios.
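
To make "minimal perturbations" concrete, the following is a small illustrative sketch of a gradient-sign attack (FGSM). The abstract does not name the attacks evaluated, so FGSM is an assumption chosen only because it is a common baseline. The model is a toy logistic-regression classifier rather than a ViT, and the weights, input values, and perturbation budget `eps` are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = clip(x + eps * sign(dL/dx), 0, 1).

    For binary cross-entropy with a logistic model, dL/dx = (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]  # -1, 0, or +1 per input value
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, sign)]


# Hypothetical weights and a 4-"pixel" input scaled to [0, 1]; not from the paper.
w = [2.0, -3.0, 1.5, -1.0]
b = 0.0
x = [0.6, 0.3, 0.5, 0.4]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.1)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)

print("clean prediction:", int(p_clean > 0.5))
print("adversarial prediction:", int(p_adv > 0.5))
```

With these toy values the predicted class flips from 1 to 0 even though no input value moves by more than 0.1. Adversarial training, the defense the abstract explores, would in a sketch like this add such perturbed inputs with their correct labels to the training data.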

Keywords: Adversarial attacks; Adversarial defense; Medical image classification; Vision transformer.

Grants and funding