Purpose: Epidemiologic studies often conflate the strength of association with predictive accuracy and build classification models based on arbitrarily selected probability cutoffs without considering the cost of misclassification. We illustrated these common pitfalls by building association, prediction, and classification models using birthweight as an exposure and child mortality and child anthropometric failure as outcomes.
Methods: Nationally representative samples of 188,819 and 164,113 children aged less than 5 years across India were used for our analysis of mortality and anthropometric failure, respectively. We assessed outcomes of neonatal, postneonatal, and child mortality as well as stunting, wasting, and underweight. Birthweight was the main exposure. We used adjusted and unadjusted logistic regression models to evaluate association strength, univariable and multivariable logistic regression models trained on 80% of the data using 10-fold cross-validation to evaluate predictive power, and classification models across a series of possible misclassification cost scenarios to evaluate classification accuracy.
Results: Birthweight was strongly associated with five of six outcomes (P < .001), and associations were robust to covariate adjustment. Prediction models evaluated on the test set showed that birthweight was a poor discriminator of all outcomes (area under the curve < 0.62), and that adding birthweight to a multivariable model did not meaningfully improve discrimination. Prediction models for anthropometric failure showed substantially better calibration than prediction models for mortality. Depending on the ratio of false positive (FP) cost to false negative (FN) cost, the probability cutoff that minimized total misclassification cost ranged from 0.116 (cost ratio = 7:93) to 0.706 (cost ratio = 4:1), the true positive rate (TPR) ranged from 0.999 to 0.004, and the positive predictive value (PPV) ranged from 0.355 to 0.867.
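The dependence of the cutoff on the FP:FN cost ratio can be illustrated with a minimal sketch. It is not the study's code: the function name `best_cutoff`, the grid of candidate cutoffs, and the eight toy observations are all assumptions made for illustration, and total cost is taken to be simply (FP cost × number of FPs) + (FN cost × number of FNs).

```python
def best_cutoff(y_true, y_prob, cost_fp, cost_fn, cutoffs=None):
    """Scan candidate cutoffs and return (cutoff, total_cost, tpr, ppv)
    for the first cutoff that minimizes cost_fp * FP + cost_fn * FN.

    Hypothetical helper for illustration only.
    """
    if cutoffs is None:
        # Assumed grid: 0.000, 0.001, ..., 1.000
        cutoffs = [i / 1000 for i in range(1001)]
    best = None
    for c in cutoffs:
        tp = fp = fn = 0
        for y, p in zip(y_true, y_prob):
            predicted_positive = p >= c
            if predicted_positive and y == 1:
                tp += 1
            elif predicted_positive and y == 0:
                fp += 1
            elif not predicted_positive and y == 1:
                fn += 1
        cost = cost_fp * fp + cost_fn * fn
        if best is None or cost < best[1]:
            tpr = tp / (tp + fn) if (tp + fn) else float("nan")
            ppv = tp / (tp + fp) if (tp + fp) else float("nan")
            best = (c, cost, tpr, ppv)
    return best


# Toy data (made up): observed outcomes and predicted probabilities.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.10, 0.20, 0.30, 0.35, 0.60, 0.65, 0.80, 0.90]

# FNs far costlier than FPs (FP:FN = 7:93) versus FPs costlier (FP:FN = 4:1).
low = best_cutoff(y_true, y_prob, cost_fp=7, cost_fn=93)
high = best_cutoff(y_true, y_prob, cost_fp=4, cost_fn=1)

for label, (cutoff, cost, tpr, ppv) in (("7:93", low), ("4:1", high)):
    print(f"FP:FN = {label}: cutoff={cutoff:.3f}, cost={cost}, "
          f"TPR={tpr:.3f}, PPV={ppv:.3f}")
```

On the toy data, raising the relative cost of FPs moves the cost-minimizing cutoff upward, trading TPR for PPV, which is the same direction of change reported above. The toy values are not comparable to the study's estimates.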
Conclusions: Although birthweight is strongly associated with mortality and anthropometric failure, it is a poor predictor of child health outcomes, highlighting that strong associations do not imply predictive power. We recommend that (1) future research focus on building predictive models for anthropometric failure given their clinical relevance in diagnosing individual cases, and that (2) studies that build classifiers report performance metrics across a range of cutoffs to account for variation in the cost of FPs and FNs.
Keywords: Anthropometric failure; Association; Birthweight; Child health; Classification; Mortality; Odds ratio; Prediction; Stunting; Underweight; Wasting.
Copyright © 2020 Elsevier Inc. All rights reserved.