Predicting Breast Cancer Relapse from Histopathological Images with Ensemble Machine Learning Models

Curr Oncol. 2024 Oct 24;31(11):6577-6597. doi: 10.3390/curroncol31110486.

Abstract

Relapse and metastasis occur in 30-40% of breast cancer patients, even after targeted treatments such as trastuzumab for HER2-positive breast cancer. Accurate individual prognosis is essential for determining appropriate adjuvant treatment and early intervention. This study aims to improve relapse and metastasis prediction using a framework that combines machine learning (ML) and ensemble learning (EL) techniques. The framework is evaluated on data from The Cancer Genome Atlas (TCGA) covering 123 HER2-positive breast cancer patients. Our two-stage experimental approach first applied six base ML models (support vector machine, logistic regression, decision tree, random forest, adaptive boosting, and extreme gradient boosting) and then combined these models using weighted averaging, soft voting, and hard voting. The weighted averaging ensemble achieved improved performance, with 88.46% accuracy, 89.74% precision, 94.59% sensitivity, 73.33% specificity, a 92.11% F-value, a Matthews correlation coefficient of 71.07%, and an AUC of 0.903. This framework enables accurate prediction of relapse and metastasis in HER2-positive breast cancer patients from H&E images and clinical data, thereby supporting better treatment decisions.
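The three ensembling rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-model probabilities, the weights, the 0.5 threshold, and the tie-breaking rule are all assumptions made for illustration, since the abstract does not specify how weights were chosen or how ties were handled.

```python
from typing import Dict, List

# Hypothetical positive-class (relapse) probabilities from the six base
# models for three made-up patients. These numbers are illustrative only.
probs: Dict[str, List[float]] = {
    "svm":      [0.90, 0.40, 0.20],
    "logreg":   [0.80, 0.60, 0.30],
    "tree":     [1.00, 0.00, 0.00],
    "rf":       [0.70, 0.55, 0.10],
    "adaboost": [0.60, 0.45, 0.40],
    "xgboost":  [0.95, 0.70, 0.20],
}

# Hypothetical per-model weights (for example, validation scores).
weights: Dict[str, float] = {
    "svm": 0.80, "logreg": 0.78, "tree": 0.65,
    "rf": 0.85, "adaboost": 0.75, "xgboost": 0.88,
}


def soft_vote(probs: Dict[str, List[float]]) -> List[float]:
    """Soft voting: unweighted mean of the models' probabilities."""
    n_models = len(probs)
    return [sum(col) / n_models for col in zip(*probs.values())]


def weighted_average(probs: Dict[str, List[float]],
                     weights: Dict[str, float]) -> List[float]:
    """Weighted averaging: mean of probabilities, weighted per model."""
    total = sum(weights[name] for name in probs)
    return [
        sum(weights[name] * p for name, p in zip(probs, col)) / total
        for col in zip(*probs.values())
    ]


def hard_vote(probs: Dict[str, List[float]],
              threshold: float = 0.5) -> List[int]:
    """Hard voting: each model casts a 0/1 vote; a strict majority wins.

    With an even number of models a tie is possible; this sketch
    resolves ties to the negative class (an arbitrary assumption).
    """
    majority = len(probs) / 2
    return [
        int(sum(p >= threshold for p in col) > majority)
        for col in zip(*probs.values())
    ]


def to_labels(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Threshold averaged probabilities into 0/1 predictions."""
    return [int(s >= threshold) for s in scores]


if __name__ == "__main__":
    print("soft voting:       ", soft_vote(probs))
    print("weighted averaging:", weighted_average(probs, weights))
    print("hard voting:       ", hard_vote(probs))
```

Soft voting is the special case of weighted averaging in which every model has the same weight. Hard voting discards the probabilities and counts only thresholded votes, so a confident model and a marginal one contribute equally.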

Keywords: breast cancer relapse; cancer diagnosis; ensemble learning; histopathological images; machine learning.

MeSH terms

  • Breast Neoplasms* / pathology
  • Female
  • Humans
  • Machine Learning*
  • Middle Aged
  • Neoplasm Recurrence, Local*

Grants and funding

This research received no external funding.