Background: Predicting the risk of complications is critical in metabolic and bariatric surgery (MBS).
Objectives: To develop machine learning (ML) models to predict serious postoperative complications of MBS and evaluate racial fairness of the models.
Setting: Metabolic and Bariatric Surgery Accreditation and Quality Improvement Program (MBSAQIP) national database, United States.
Methods: We developed logistic regression, random forest (RF), gradient-boosted tree (GBT), and XGBoost models using the MBSAQIP Participant Use Data File from 2016 to 2020. To address class imbalance, we randomly undersampled the complication-negative class to match the size of the complication-positive class. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), precision, recall, and F1 score. Fairness across White and non-White patient groups was assessed using the equal opportunity difference and disparate impact metrics.
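As an illustration of the undersampling and fairness steps described in the Methods, the following is a minimal Python sketch, not the authors' implementation. It assumes binary NumPy arrays with the complication-positive class coded as 1 and forming the minority class, a boolean `privileged` mask (taken here to mean White patients), and the common convention of computing both fairness metrics as unprivileged relative to privileged. All function and variable names are hypothetical.

```python
import numpy as np

def undersample_majority(X, y, seed=0):
    """Randomly keep as many negatives (y == 0) as there are positives (y == 1).

    Assumes positives are the minority class; otherwise sampling without
    replacement would fail.
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    keep_neg = rng.choice(neg_idx, size=len(pos_idx), replace=False)
    idx = np.sort(np.concatenate([pos_idx, keep_neg]))
    return X[idx], y[idx]

def equal_opportunity_difference(y_true, y_pred, privileged):
    """Recall (TPR) of the unprivileged group minus recall of the privileged group."""
    def tpr(group_mask):
        positives = group_mask & (y_true == 1)
        return y_pred[positives].mean()
    return tpr(~privileged) - tpr(privileged)

def disparate_impact(y_pred, privileged):
    """Ratio of positive-prediction rates, unprivileged over privileged."""
    return y_pred[~privileged].mean() / y_pred[privileged].mean()

# Toy data (fabricated for illustration only, not from MBSAQIP).
X = np.arange(20).reshape(10, 2)
y = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
Xb, yb = undersample_majority(X, y)  # balanced: 2 positives, 2 negatives

y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 0])
privileged = np.array([True, True, True, True, False, False, False, False])

eod = equal_opportunity_difference(y_true, y_pred, privileged)
di = disparate_impact(y_pred, privileged)
```

On this toy data the equal opportunity difference is 0.5 and the disparate impact is 1.0. Under this convention, values closer to 0 and 1, respectively, indicate closer parity between groups.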
Results: A total of 40,858 patients were included after undersampling the complication-negative class. The XGBoost model was the best-performing model in terms of AUROC; however, the difference was not statistically significant. While the F1 score and precision did not vary significantly across models, the RF model exhibited higher recall than the logistic regression model. Surgery type was the most important feature for predicting complications, followed by operative time. The logistic regression model had the best fairness metrics for race.
Conclusions: The XGBoost model achieved the highest AUROC, although the difference was not statistically significant. The RF model may be useful when recall is the primary concern. Undersampling of the privileged group may improve the fairness of boosted tree models.
Keywords: Complication; Machine learning; Metabolic and bariatric surgery; Roux-en-Y gastric bypass; Sleeve gastrectomy.
Copyright © 2024 American Society for Metabolic and Bariatric Surgery. Published by Elsevier Inc. All rights reserved.