Objectives: To investigate whether complete case logistic regression gives a biased estimate of the exposure odds ratio (OR) when missingness depends on a continuous outcome but a binary version of that outcome is used in the analysis, and to examine whether any such bias can be reduced by including a misclassified form of the incomplete outcome as an auxiliary variable in multiple imputation (MI).
Study design and setting: Analytical investigation, simulation study, and data from a UK cohort.
Results: There was bias in the exposure OR when the probability of being a complete case was independently associated with the exposure and the (continuous) outcome, but the bias was generally small unless the association with the outcome was strong. Where the exposure and the (continuous) outcome interacted in their effect on this probability, the bias was large, particularly at high levels of missing data. Including the auxiliary variable in MI gave important reductions in bias when the auxiliary variable had high sensitivity and specificity.
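The missingness mechanisms described in the Results can be illustrated with a minimal simulation sketch. It is not the study's simulation design. The binary exposure, the dichotomization threshold, the coefficient values, and the names `simulate`, `gamma_x`, `gamma_y` and `gamma_xy` are all assumptions made for illustration. With a binary exposure and no covariates, the OR from logistic regression equals the cross-product ratio of the 2x2 table, so no model-fitting library is needed.

```python
import numpy as np

def odds_ratio(x, y):
    """Cross-product OR of the 2x2 table of binary x and binary y.

    For a single binary exposure this equals exp(beta) from an
    unadjusted logistic regression of y on x.
    """
    a = np.sum((x == 1) & (y == 1))
    b = np.sum((x == 1) & (y == 0))
    c = np.sum((x == 0) & (y == 1))
    d = np.sum((x == 0) & (y == 0))
    return (a * d) / (b * c)

def simulate(n=200_000, beta_x=0.5, gamma_x=-0.5, gamma_y=-0.5,
             gamma_xy=0.0, threshold=1.0, seed=0):
    """Return (full-data OR, complete-case OR, proportion complete).

    All parameter values are illustrative assumptions:
      gamma_x  : log-odds effect of exposure on being a complete case
      gamma_y  : log-odds effect of the continuous outcome
      gamma_xy : exposure-by-outcome interaction in the missingness model
    """
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, n)                  # binary exposure
    y_cont = beta_x * x + rng.normal(size=n)     # underlying continuous outcome
    y_bin = (y_cont > threshold).astype(int)     # binary version used in analysis

    # Probability of being a complete case depends on the CONTINUOUS outcome
    lin = 1.0 + gamma_x * x + gamma_y * y_cont + gamma_xy * x * y_cont
    p_complete = 1.0 / (1.0 + np.exp(-lin))
    complete = rng.random(n) < p_complete

    full_or = odds_ratio(x, y_bin)
    cc_or = odds_ratio(x[complete], y_bin[complete])
    return full_or, cc_or, complete.mean()

if __name__ == "__main__":
    scenarios = {
        "exposure only": dict(gamma_y=0.0, gamma_xy=0.0),
        "exposure + outcome (independent)": dict(gamma_y=-0.5, gamma_xy=0.0),
        "exposure x outcome interaction": dict(gamma_y=0.0, gamma_xy=-1.0),
    }
    for label, kwargs in scenarios.items():
        full_or, cc_or, frac = simulate(**kwargs)
        print(f"{label}: full OR={full_or:.2f}, "
              f"complete-case OR={cc_or:.2f}, complete={frac:.0%}")
```

Comparing the full-data and complete-case ORs across the three scenarios shows how the bias responds to each mechanism. Increasing the magnitude of `gamma_y` or `gamma_xy` strengthens the dependence of missingness on the continuous outcome.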
Conclusion: The robustness of logistic regression to missing data is not maintained when the outcome is a binary version of an underlying continuous measure, but the bias will be small unless the association between the continuous outcome and missingness is strong.
Keywords: ALSPAC; Auxiliary variable; Complete case analysis; Logistic regression; Missing data; Multiple imputation.
Copyright © 2022 The Author(s). Published by Elsevier Inc. All rights reserved.