Epidemiological studies of Alzheimer's disease and dementia are often two-phase studies comprising a screening phase and a clinical assessment phase. It is common to interview a relative of the subject at each of these phases to obtain information about the subject's exposure to risk factors. This can result in misclassification error when assessing risk factors, as the relative's two responses often differ. The problem is especially acute for risk factors involving lifestyle and family history, which cannot be confirmed using the subject's medical records. A naive analysis using data from each phase separately would give two different estimates of the odds ratio, and both estimates could be biased. In this paper, we extend the estimation methods adjusting for misclassification developed by Liu and Liang to data collected through two-phase sampling. We first use a latent class analysis and the EM algorithm to estimate the misclassification parameters. We then derive the maximum pseudo-likelihood estimators, conditional on the misclassification parameters, to estimate the odds ratios while accounting for the complex sampling design. We propose the jackknife estimator for the variances. We apply this method to data collected in the Indianapolis-Ibadan Dementia Study to estimate the odds ratio for smoking, adjusted for misclassification error.
Copyright 2000 John Wiley & Sons, Ltd.
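
As a rough illustration of the jackknife idea mentioned above, the following Python sketch computes a delete-one jackknife variance for a log odds ratio. It is a simplified stand-in under stated assumptions, not the estimator derived in the paper: it ignores the two-phase sampling design and the misclassification adjustment, and treats the records as a simple random sample. The function names `log_odds_ratio` and `jackknife_variance`, the 0.5 cell correction, and the example counts are all hypothetical choices made for this sketch.

```python
import math


def log_odds_ratio(records):
    """Log odds ratio from (exposed, case) 0/1 pairs.

    Assumption for this sketch: 0.5 is added to every cell so the
    ratio stays finite even if a cell is empty.
    """
    a = b = c = d = 0.5
    for exposed, case in records:
        if exposed and case:
            a += 1  # exposed cases
        elif exposed:
            b += 1  # exposed non-cases
        elif case:
            c += 1  # unexposed cases
        else:
            d += 1  # unexposed non-cases
    return math.log((a * d) / (b * c))


def jackknife_variance(records, estimator):
    """Delete-one jackknife variance of estimator(records)."""
    n = len(records)
    # Recompute the estimate n times, leaving out one record each time.
    leave_one_out = [estimator(records[:i] + records[i + 1:]) for i in range(n)]
    mean = sum(leave_one_out) / n
    # Standard delete-one jackknife formula: (n - 1)/n * sum of squared deviations.
    return (n - 1) / n * sum((t - mean) ** 2 for t in leave_one_out)


# Made-up counts, for illustration only: 30 exposed cases, 20 exposed
# non-cases, 15 unexposed cases, 35 unexposed non-cases.
records = [(1, 1)] * 30 + [(1, 0)] * 20 + [(0, 1)] * 15 + [(0, 0)] * 35
variance = jackknife_variance(records, log_odds_ratio)
```

In a two-phase design the resampling would typically need to respect the sampling strata and weights, so the leave-one-out step here should be read only as the basic mechanism.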