A prognostic model is well calibrated when it accurately predicts event rates. Calibration is first assessed by testing goodness of fit on the development dataset. All existing tests and graphical tools designed for this purpose suffer from several drawbacks, related mainly to the subgrouping of observations or to heavy dependence on arbitrary parameters. We propose a statistical test and a graphical method to assess the goodness of fit of logistic regression models, obtained by extending similar techniques developed for external validation. We computed the distribution of the underlying statistic analytically and verified it numerically. Simulations on a set of realistic scenarios show that this test and the well-known Hosmer-Lemeshow approach have similar type I error rates. The main advantage of the new approach is that the relationship between model predictions and outcome rates across the range of probabilities can be represented in the calibration belt plot, together with its statistical confidence. Because it makes deviations from the perfect fit readily visible, this graphical tool is designed to identify, during model development, poorly modeled variables that call for further investigation. We illustrate this with an example based on real data.
Keywords: calibration test; dichotomous outcome models; goodness of fit; graphical methods; logistic regression models.
Copyright © 2015 John Wiley & Sons, Ltd.