Acute ischemic stroke (AIS) remains a global health challenge, leading to long-term functional disabilities without timely intervention. Spatio-temporal (4D) Computed Tomography Perfusion (CTP) imaging is crucial for diagnosing and treating AIS due to its ability to rapidly assess the ischemic core and penumbra. Although traditionally used to assess acute tissue status in clinical settings, 4D CTP has also been explored in research for predicting stroke tissue outcomes. However, its potential for predicting functional outcomes, especially in combination with clinical metadata, remains unexplored. Thus, this work aims to develop and evaluate a novel multimodal deep learning model for predicting functional outcomes (specifically, 90-day modified Rankin Scale) in AIS patients by combining 4D CTP and clinical metadata. To achieve this, an intermediate fusion strategy with a cross-attention mechanism is introduced to enable a selective focus on the most relevant features and patterns from both modalities. Evaluated on a dataset comprising 70 AIS patients who underwent endovascular mechanical thrombectomy, the proposed model achieves an accuracy (ACC) of 0.77, outperforming conventional late fusion strategies (ACC = 0.73) and unimodal models based on either 4D CTP (ACC = 0.61) or clinical metadata (ACC = 0.71). The results demonstrate the superior capability of the proposed model to leverage complex inter-modal relationships, emphasizing the value of advanced multimodal fusion techniques for predicting functional stroke outcomes.
Keywords: Cross-attention; Multimodal learning; Outcome prediction; Stroke.
Copyright © 2024 The Authors. Published by Elsevier B.V. All rights reserved.