A comparison of multiple imputation strategies for handling missing data in multi-item scales: Guidance for longitudinal studies

Stat Med. 2021 Sep 20;40(21):4660-4674. doi: 10.1002/sim.9088. Epub 2021 Jun 8.

Abstract

Medical research often involves using multi-item scales to assess individual characteristics, disease severity, and other health-related outcomes. It is common to observe missing data in the scale scores, due to missing data in one or more items that make up that score. Multiple imputation (MI) is a popular method for handling missing data. However, it is not clear how best to use MI in the context of scale scores, particularly when they are assessed at multiple waves of data collection, resulting in large numbers of items. The aim of this article is to provide practical advice on how to impute missing values in a repeatedly measured multi-item scale using MI when inference on the scale score is of interest. We evaluated the performance of five MI strategies for imputing missing data at either the item or scale level using simulated data and a case study based on four waves of the Longitudinal Study of Australian Children (LSAC). MI was implemented using both multivariate normal imputation and fully conditional specification, with two rules for calculating the scale score. A complete case analysis was also performed for comparison. Based on our results, we caution against the use of an MI strategy that does not include the scale score in the imputation model(s) when the scale score is required for analysis.
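
The abstract does not say which two rules were used to calculate the scale score. As a minimal sketch, assuming two rules that are common in practice (a complete-items sum and a prorated sum when at least half of the items are observed), the difference between them can be illustrated as follows. The function names, the 0.5 threshold, and the example item values are hypothetical and are not taken from the article.

```python
from statistics import mean

def scale_score_complete(items):
    """Assumed rule A: sum the items only when every item is observed."""
    if any(v is None for v in items):
        return None  # score is missing if any item is missing
    return sum(items)

def scale_score_prorated(items, min_fraction=0.5):
    """Assumed rule B: prorate from the observed items when enough are present."""
    observed = [v for v in items if v is not None]
    if not items or len(observed) / len(items) < min_fraction:
        return None  # too few observed items to prorate
    # Mean of observed items rescaled to the full number of items.
    return mean(observed) * len(items)

# Hypothetical five-item scale at one wave, with the second item missing.
wave = [3, None, 4, 2, 5]
print(scale_score_complete(wave))  # None
print(scale_score_prorated(wave))  # 17.5
```

Under the stricter rule the score is missing and would itself need to be imputed, whereas the prorated rule yields a score from the observed items, which is one reason the choice of rule can interact with whether imputation is done at the item or scale level.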

Keywords: longitudinal study; missing data; multi-item scale; multiple imputation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Australia
  • Child
  • Computer Simulation
  • Data Collection
  • Humans
  • Longitudinal Studies
  • Research Design*