Objective: This study introduces the complete blood count (CBC), a standard prenatal screening test, as a biomarker for diagnosing preeclampsia with severe features (sPE), employing machine learning models.
Methods: We used a boosting machine learning model fed with synthetic data generated through a new methodology called DAS (Data Augmentation and Smoothing). Using data from a Brazilian study including 132 pregnant women, we generated 3,552 synthetic samples for model training. To improve interpretability, we also provided a ridge regression model.
Results: Our boosting model obtained an AUROC of 0.90±0.10, sensitivity of 0.95, and specificity of 0.79 to differentiate sPE and non-PE pregnant women, using CBC parameters of neutrophils count, mean corpuscular hemoglobin (MCH), and the aggregate index of systemic inflammation (AISI). In addition, we provided a ridge regression equation using the same three CBC parameters, which is fully interpretable and achieved an AUROC of 0.79±0.10 to differentiate the both groups. Moreover, we also showed that a monocyte count lower than yielded a sensitivity of 0.71 and specificity of 0.72.
Conclusion: Our study showed that ML-powered CBC could be used as a biomarker for sPE diagnosis support. In addition, we showed that a low monocyte count alone could be an indicator of sPE.
Significance: Although preeclampsia has been extensively studied, no laboratory biomarker with favorable cost-effectiveness has been proposed. Using artificial intelligence, we proposed to use the CBC, a low-cost, fast, and well-spread blood test, as a biomarker for sPE.
Keywords: Artificial intelligence; Complete blood count; Data augmentation; Data-centric; Machine learning; Preeclampsia; Pregnancy; Synthetic data.
© 2024. The Author(s).