A machine learning ensemble approach for 5- and 10-year breast cancer invasive disease event classification

PLoS One. 2022 Sep 19;17(9):e0274691. doi: 10.1371/journal.pone.0274691. eCollection 2022.

Abstract

Designing targeted treatments for breast cancer patients after primary tumor removal is necessary to prevent the occurrence of invasive disease events (IDEs), such as recurrence, metastasis, contralateral and second tumors, over time. However, due to the molecular heterogeneity of this disease, predicting the outcome and efficacy of adjuvant therapy is challenging. A novel ensemble machine learning classification approach was developed to produce prognostic predictions of the occurrence of breast cancer IDEs at both 5 and 10 years. The method is based on voting among multiple models to give a final prediction for each individual patient. Promising results were achieved on a cohort of 529 patients, whose data, related to primary breast cancer, were provided by Istituto Tumori "Giovanni Paolo II" in Bari, Italy. The proposed approach greatly improves on the performance of the original baseline model, i.e., the model without voting, reaching median AUC values of 77.1% and 76.3% for IDE prediction at 5 and 10 years, respectively. Finally, because the voting procedure returns an explanation of the IDE prediction for each individual patient, the approach supports more intelligible decisions and thus greater acceptability in clinical practice.
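The voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the base models, the voting rule, or the tie-breaking policy, so the simple majority rule, the tie-break toward IDE, the function name `majority_vote`, and the example votes below are all assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(votes):
    """Combine binary model votes (1 = IDE, 0 = no IDE) for one patient.

    Returns the final label and the vote tally, so the tally can serve
    as a simple per-patient account of how the decision was reached.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    tally = Counter(votes)
    # Assumption: ties resolve to 1 (IDE); the source does not state a rule.
    label = 1 if tally[1] >= tally[0] else 0
    return label, {"IDE": tally[1], "no_IDE": tally[0]}


# Hypothetical votes from five models for three patients (not study data).
model_votes = {
    "patient_A": [1, 1, 0, 1, 0],
    "patient_B": [0, 0, 0, 1, 0],
    "patient_C": [1, 0, 1, 0, 1],
}

results = {pid: majority_vote(v) for pid, v in model_votes.items()}
for pid, (label, tally) in results.items():
    print(pid, label, tally)
```

Returning the tally alongside the label mirrors, in a very reduced form, the point that a voting procedure can expose how each individual prediction was formed.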

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms* / pathology
  • Combined Modality Therapy
  • Female
  • Humans
  • Italy
  • Machine Learning

Grants and funding

This work was supported by funding from the Italian Ministry of Health, Ricerca Finalizzata 2018 deliberation n.812/2020.