Predicting the risk category of thymoma with machine learning-based computed tomography radiomics signatures and their between-imaging phase differences

Sci Rep. 2024 Aug 19;14(1):19215. doi: 10.1038/s41598-024-69735-3.

Abstract

The aim of this study was to develop a method for predicting high- and low-risk thymoma based on medical imaging and stacked ensemble learning. A total of 126 patients with thymoma and 5 patients with thymic carcinoma treated at our institution, comprising 65 low-risk and 66 high-risk patients, were retrospectively recruited. Of these, 78 patients formed the training cohort and the remaining 53 formed the validation cohort. We extracted 1702 features from each of the patients' arterial-, venous-, and plain-phase images. Pairwise subtraction of these features yielded 1702 difference features for each of the arterial-venous, arterial-plain, and venous-plain pairs. The Mann‒Whitney U test, least absolute shrinkage and selection operator (LASSO), and SelectKBest methods were used to select the best features from the training set. Six basic imaging models were built with stacked ensemble learning, in which three machine learning algorithms (XGBoost, multilayer perceptron (MLP), and random forest) were combined by XGBoost. The XGBoost algorithm was then applied to the six basic imaging models to construct a combined radiomic model. Finally, the radiomic model was combined with clinical information to create a nomogram that could easily be used in clinical practice to predict the thymoma risk category. The areas under the curve (AUCs) of the combined radiomic model in the training and validation cohorts were 0.999 (95% CI 0.988-1.000) and 0.967 (95% CI 0.916-1.000), respectively, while those of the nomogram were 0.999 (95% CI 0.996-1.000) and 0.983 (95% CI 0.990-1.000). This study describes the application of CT-based radiomics in thymoma patients and proposes a nomogram for predicting the risk category of this disease, which could support clinical decision-making for affected patients.
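The pairwise subtraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `phase_difference_features`, the toy values, and the sign convention (first phase minus second) are assumptions made for illustration; the abstract does not state the direction of subtraction, and the real feature vectors have 1702 entries per phase rather than three.

```python
from itertools import combinations

# Phase order chosen so the pairs come out as arterial-venous,
# arterial-plain, venous-plain, matching the order in the abstract.
PHASES = ("arterial", "venous", "plain")


def phase_difference_features(features_by_phase):
    """Return element-wise difference vectors for every pair of phases.

    features_by_phase maps a phase name to a list of feature values for
    one patient. All lists must share the same length and feature order.
    Assumption: each difference is (first phase) - (second phase).
    """
    lengths = {len(features_by_phase[p]) for p in PHASES}
    if len(lengths) != 1:
        raise ValueError("all phases must provide the same number of features")

    diffs = {}
    for first, second in combinations(PHASES, 2):
        diffs[f"{first}-{second}"] = [
            x - y
            for x, y in zip(features_by_phase[first], features_by_phase[second])
        ]
    return diffs


# Toy patient with three hypothetical feature values per phase.
patient = {
    "arterial": [5.0, 2.0, 9.0],
    "venous": [4.0, 2.5, 7.0],
    "plain": [1.0, 2.0, 3.0],
}
diffs = phase_difference_features(patient)
```

Each of the three resulting vectors has the same length as the per-phase input, so three phase feature sets plus three difference feature sets give six feature sets in total, consistent with the six basic imaging models mentioned in the abstract.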

Keywords: CT; Machine learning; Thymoma.

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Female
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Nomograms
  • Radiomics
  • Retrospective Studies
  • Risk Assessment / methods
  • Thymoma* / diagnostic imaging
  • Thymoma* / pathology
  • Thymus Neoplasms* / diagnostic imaging
  • Thymus Neoplasms* / pathology
  • Tomography, X-Ray Computed* / methods