Complete blood count as a biomarker for preeclampsia with severe features diagnosis: a machine learning approach

BMC Pregnancy Childbirth. 2024 Oct 1;24(1):628. doi: 10.1186/s12884-024-06821-4.

Abstract

Objective: This study introduces the complete blood count (CBC), a standard prenatal screening test, as a biomarker for diagnosing preeclampsia with severe features (sPE), employing machine learning models.

Methods: We trained a boosting machine learning model on synthetic data generated through a new methodology called DAS (Data Augmentation and Smoothing). Using data from a Brazilian study of 132 pregnant women, we generated 3,552 synthetic samples for model training. To improve interpretability, we also provided a ridge regression model.

Results: Our boosting model obtained an AUROC of 0.90±0.10, sensitivity of 0.95, and specificity of 0.79 in differentiating sPE from non-PE pregnant women, using three CBC parameters: neutrophil count, mean corpuscular hemoglobin (MCH), and the aggregate index of systemic inflammation (AISI). In addition, we provided a ridge regression equation using the same three CBC parameters, which is fully interpretable and achieved an AUROC of 0.79±0.10 in differentiating the two groups. Moreover, we showed that a monocyte count lower than 490/mm³ yielded a sensitivity of 0.71 and specificity of 0.72.
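To make the quantities in the Results concrete, here is a minimal Python sketch of two of them: the AISI and the sensitivity and specificity of a single-cutoff rule such as "monocyte count below 490/mm³". The abstract does not state the AISI formula, so the sketch assumes the commonly cited definition (neutrophils × monocytes × platelets / lymphocytes). The function names and the toy monocyte values are hypothetical and invented purely for illustration; they are not study data and do not reproduce the reported 0.71 and 0.72.

```python
from typing import Sequence, Tuple


def aisi(neutrophils: float, monocytes: float,
         platelets: float, lymphocytes: float) -> float:
    """AISI under the assumed definition (N * M * P) / L.

    All four counts must be expressed in the same unit.
    """
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils * monocytes * platelets / lymphocytes


def sensitivity_specificity(values: Sequence[float],
                            is_spe: Sequence[bool],
                            cutoff: float) -> Tuple[float, float]:
    """Score the rule 'value < cutoff predicts sPE' against true labels."""
    tp = fn = tn = fp = 0
    for value, positive in zip(values, is_spe):
        predicted_spe = value < cutoff
        if positive and predicted_spe:
            tp += 1
        elif positive:
            fn += 1
        elif predicted_spe:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one sPE and one non-PE sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort (monocytes per mm^3); NOT data from the study.
toy_monocytes = [300, 450, 520, 480, 600, 700, 400, 650]
toy_is_spe = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(toy_monocytes, toy_is_spe, cutoff=490)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the cutoff in such a rule generally trades sensitivity for specificity, which is why a single-parameter threshold is reported alongside, rather than in place of, the multi-parameter models.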

Conclusion: Our study showed that ML-powered CBC could be used as a biomarker for sPE diagnosis support. In addition, we showed that a low monocyte count alone could be an indicator of sPE.

Significance: Although preeclampsia has been extensively studied, no laboratory biomarker with favorable cost-effectiveness has been proposed. Using artificial intelligence, we propose the CBC, a low-cost, fast, and widely available blood test, as a biomarker for sPE.

Keywords: Artificial intelligence; Complete blood count; Data augmentation; Data-centric; Machine learning; Preeclampsia; Pregnancy; Synthetic data.

MeSH terms

  • Adult
  • Biomarkers* / blood
  • Blood Cell Count / methods
  • Brazil
  • Female
  • Humans
  • Machine Learning*
  • Pre-Eclampsia* / blood
  • Pre-Eclampsia* / diagnosis
  • Pregnancy
  • Prenatal Diagnosis / methods
  • ROC Curve
  • Sensitivity and Specificity
  • Severity of Illness Index

Substances

  • Biomarkers