Gestational diabetes mellitus (GDM), a common perinatal disease, is associated with increased risks of adverse maternal and neonatal outcomes. We aimed to establish GDM risk prediction models that can be widely used in the first trimester, using four different methods: a score-scaled model derived from a meta-analysis of 42 studies, a logistic regression model, and two machine learning models (decision tree and random forest algorithms). The score-scaled model (seven variables) was established using the meta-analysis and a stratified cohort of 1075 pregnant Chinese women from the Northwest Women's and Children's Hospital (NWCH) and showed an area under the curve (AUC) of 0.772. The logistic regression model (seven variables) was established and validated using the same cohort and showed AUCs of 0.799 and 0.834 for the training and validation sets, respectively. The other two models were established using the decision tree (DT) and random forest (RF) algorithms and showed AUCs of 0.825 and 0.823, respectively, for the training set and 0.816 and 0.827, respectively, for the validation set. Validation of the developed models in a cohort from a different period suggested good performance. The score-scaled model, the logistic regression model, and the two machine learning models could all be used to identify pregnant women at high risk of GDM using common clinical indicators, so that interventions can be sought promptly.
Keywords: early pregnancy; gestational diabetes mellitus; prediction models; risk factors.
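For readers less familiar with the AUC figures reported above, the following is a minimal sketch of how an AUC can be computed from predicted risks and observed outcomes. It is not the authors' code: the function name `auc` and the toy labels and scores are hypothetical and purely illustrative. The AUC is taken here as the probability that a randomly chosen GDM case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half.

```python
def auc(labels, scores):
    """Return the AUC as the fraction of (case, non-case) pairs ranked correctly.

    labels: 1 for a GDM case, 0 for a non-case (hypothetical encoding).
    scores: predicted risk for each woman, higher meaning higher risk.
    Tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the NWCH cohort.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.30, 0.50, 0.90]
print(auc(labels, scores))  # 0.8125
```

In the toy data, 13 of the 16 case/non-case pairs are ranked correctly, giving 0.8125. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.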