Background: Respiratory motion during radiotherapy (RT) may reduce the therapeutic effect and increase the dose received by organs at risk. This can be addressed by real-time tracking, which currently requires respiratory motion prediction to compensate for system latency in RT systems. Deep learning has been considered for predicting future images in image-guided adaptive RT systems.
Purpose: This study proposed a modified generative adversarial network (GAN) for predicting cine-MR images in real time.
Methods: Sagittal cine magnetic resonance (cine-MR) images of 15 patients with liver cancer who received RT were collected. Each patient's image series contained 300 frames, and each series was divided into training, validation, and test sets. Each set was further divided into samples using a sliding window of size 10 and a stride of 1. The proposed network is a pix2pix GAN whose generator is replaced by a convolutional long short-term memory (ConvLSTM) network. A five-frame cine-MR image series was input into the network, which predicted the next five frames. The proposed network was compared with three advanced networks: ConvLSTM, Eidetic 3D LSTM (E3D-LSTM), and SwinLSTM. Personalized models were trained for each patient. The peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), visual information fidelity (VIF), Pearson correlation coefficient (Pearson corr), and respiratory motion accuracy of the predicted images were used to evaluate the methods.
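The windowing and two of the image metrics described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`make_windows`, `psnr`, `pearson_corr`), the assumption that frames are scaled to [0, 1], and the random stand-in data are all hypothetical. The abstract does not state how the 300 frames were split into training, validation, and test sets, so the sketch windows a whole series.

```python
import numpy as np


def make_windows(series, window=10, stride=1, n_input=5):
    """Slice a (T, H, W) series into overlapping clips of `window` frames.

    Each clip is split into the first `n_input` frames (network input)
    and the remaining frames (prediction target).
    """
    starts = range(0, series.shape[0] - window + 1, stride)
    clips = np.stack([series[s:s + window] for s in starts])
    return clips[:, :n_input], clips[:, n_input:]


def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB; `data_range` is an assumed intensity range."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def pearson_corr(pred, target):
    """Pearson correlation coefficient between two images, computed over all pixels."""
    return float(np.corrcoef(pred.ravel(), target.ravel())[0, 1])


# Stand-in data: 300 random 16x16 frames in place of a real cine-MR series.
rng = np.random.default_rng(0)
series = rng.random((300, 16, 16), dtype=np.float32)
inputs, targets = make_windows(series)
print(inputs.shape, targets.shape)
```

With a window of 10 and a stride of 1, a series of T frames yields T - 9 samples, each pairing five input frames with the five frames that follow them. The metrics would then be evaluated per predicted time step, comparing each predicted frame with the corresponding target frame.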
Results: The proposed network demonstrated the best performance among the four networks across various indicators. The proposed method provided better SSIM values than ConvLSTM at time steps 1, 2, 3, and 4, and outperformed E3D-LSTM at all time steps. In terms of the VIF, the proposed method outperformed E3D-LSTM at all time steps and SwinLSTM at time steps 2, 3, 4, and 5. The proposed method was not significantly different from the other methods in terms of PSNR values, except that it outperformed E3D-LSTM at time step 1. In terms of the Pearson corr, the proposed method consistently achieved better values, especially in the high-frequency components. The proposed method provided low average landmark tracking errors at time steps 4 and 5 (2.42 ± 0.91 and 2.44 ± 0.96 mm, respectively).
Conclusions: The GAN-ConvLSTM network can generate high-acutance cine-MR images in real time and predict respiratory motion more accurately than the compared networks.
Keywords: cine‐MR images; convolutional long short‐term memory; deep learning; generative adversarial network; radiotherapy; respiratory motion prediction.
© 2025 American Association of Physicists in Medicine.