Computing the polytomous discrimination index

Stat Med. 2021 Jul 20;40(16):3667-3681. doi: 10.1002/sim.8991. Epub 2021 Apr 18.

Abstract

Polytomous regression models generalize logistic models to the case of a categorical outcome variable with more than two distinct categories. These models are currently used in clinical research, and it is essential to measure their ability to distinguish between the categories of the outcome. In 2012, Van Calster et al. proposed the polytomous discrimination index (PDI) as an extension of the binary discrimination c-statistic to unordered polytomous regression. The PDI is a summary of the simultaneous discrimination between all outcome categories. Previous implementations of the PDI are not capable of running on "big data." This article shows that the PDI formula can be manipulated to depend only on the distributions of the predicted probabilities evaluated for each outcome category and within each observed level of the outcome, which substantially improves the computation time. We present a SAS macro and an R function that can rapidly evaluate the PDI and its components. The routines are evaluated on several simulated datasets, varying the number of outcome categories and the size of the data, and on two large real-world administrative health datasets. We compare the PDI with two other discrimination indices, the M-index and the hypervolume under the manifold (HUM), on simulated examples. We describe situations where the PDI and HUM, indices based on multiple comparisons, are superior to the M-index, an index based on pairwise comparisons, at detecting predictions that are no different from random selection or that are erroneous due to incorrect ranking.
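
The reformulation described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than in SAS or R and is not the authors' code. The function name `pdi`, its arguments, and the 0-based category coding are assumptions made for illustration, and ties between predicted probabilities are ignored (a tied comparison counts as a failure), which a full implementation would need to handle. For each category i, the sketch takes the cases observed in category i and, for every other observed category j, counts how many cases have a smaller predicted probability of category i. This uses only the sorted predicted probabilities within each observed level, so no sets of cases are enumerated.

```python
import numpy as np

def pdi(probs, labels):
    """Illustrative PDI sketch (hypothetical helper, ties ignored).

    probs  : (n, K) array of predicted probabilities, one column per category
    labels : length-n integer array of observed categories coded 0..K-1
    Returns the overall index and the list of category-specific components.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    K = probs.shape[1]
    components = []
    for i in range(K):
        # Predicted probability of category i for cases observed in category i
        p_own = probs[labels == i, i]
        # Running product over the other categories of the fraction of their
        # cases with a strictly smaller predicted probability of category i
        frac = np.ones_like(p_own)
        for j in range(K):
            if j == i:
                continue
            p_other = np.sort(probs[labels == j, i])
            # side="left" counts the elements strictly below each value
            frac *= np.searchsorted(p_other, p_own, side="left") / p_other.size
        components.append(float(frac.mean()))
    return float(np.mean(components)), components

# Two-category toy example: here each component reduces to the c-statistic
value, parts = pdi(
    [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4], [0.2, 0.8]],
    [0, 0, 1, 1],
)
print(value, parts)
```

Because each category needs only sorts and binary searches, the cost of this sketch grows roughly as K squared times n log n, instead of with the product of the category sizes that enumerating every set of K cases would require.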

Keywords: R function; SAS macro; discrimination; polytomous discrimination index; polytomous regression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Logistic Models*

Grants and funding