Objective: To compare the accuracy of five classification systems for interpreting electronic fetal monitoring (EFM) in predicting neonatal status at birth, as determined by umbilical cord arterial pH.
Methods: Ninety-seven cardiotocography traces were retrospectively interpreted according to five classification systems for EFM: Dublin Fetal Heart Rate Monitoring Trial (DFHRMT), Royal College of Obstetricians and Gynaecologists (RCOG), Society of Obstetricians and Gynaecologists of Canada (SOGC), National Institute of Child Health and Human Development (NICHD), and Parer & Ikeda. For each classification system, sensitivity, specificity, and positive and negative predictive values were calculated. The capacity of the classifications to predict neonatal pH was also evaluated using receiver-operating characteristic (ROC) curves. Agreement between the five systems was estimated using the weighted kappa statistic.
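The accuracy measures and the agreement statistic named above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' analysis code: the function names `diagnostic_metrics` and `weighted_kappa` are hypothetical, the counts are invented rather than taken from the study, and linear weights are assumed because the weighting scheme is not stated here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    "Positive" means a trace classified as pathological; the reference
    standard is low umbilical cord arterial pH.
    """
    return {
        "sensitivity": tp / (tp + fn),  # low-pH cases flagged as pathological
        "specificity": tn / (tn + fp),  # normal-pH cases not flagged
        "ppv": tp / (tp + fp),          # flagged traces that truly had low pH
        "npv": tn / (tn + fn),          # unflagged traces that truly had normal pH
    }


def weighted_kappa(table):
    """Cohen's weighted kappa for a k x k agreement table of counts.

    Assumes ordered categories and linear disagreement weights
    w_ij = |i - j| / (k - 1); the weighting is an assumption.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0  # weighted disagreement actually observed
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            observed += w * table[i][j]
            expected += w * row_totals[i] * col_totals[j] / n
    return 1.0 - observed / expected


# Invented counts, for illustration only (not the study's data).
print(diagnostic_metrics(tp=8, fp=30, fn=2, tn=60))

# Invented 3 x 3 table: rows are system A's categories, columns system B's.
print(weighted_kappa([[20, 5, 0], [3, 40, 4], [0, 2, 23]]))
```

With three ordered categories (for example normal, intermediate, pathological), the linear weights penalize a normal-versus-pathological disagreement twice as heavily as a disagreement between adjacent categories.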
Results: With pH ≤7.15 as the cutoff for low pH, sensitivity and specificity were 100% and 18% (DFHRMT); 100% and 15% (RCOG); 88% and 37% (SOGC); 67% and 92% (NICHD); and 55% and 67% (Parer & Ikeda). The ROC curves showed that all classifications analyzed had a low discriminative capacity for predicting umbilical artery pH ≤7.15. Excellent agreement was observed between DFHRMT and RCOG (weighted κ value: 0.860).
Conclusions: The Parer & Ikeda and NICHD classifications had the highest specificity in detecting umbilical cord arterial pH ≤7.15. The high specificity of the NICHD classification is undermined by a high percentage of "intermediate" traces (80%). The Parer & Ikeda classification was the best at classifying as pathological only the traces of fetuses that are truly at risk of acidemia, thus avoiding unnecessary intervention. It also showed the best trade-off between sensitivity and specificity and the lowest rate of traces considered "intermediate."