Generalized Linear Models with Covariate Measurement Error and Zero-Inflated Surrogates

Mathematics (Basel). 2024 Jan;12(2):309. doi: 10.3390/math12020309. Epub 2024 Jan 17.

Abstract

Epidemiological studies often face exposure measurement error when estimating an exposure-disease association. A surrogate variable may be available for the true, unobserved exposure. However, surrogate variables are frequently zero-inflated. For example, many nutrient or physical activity measures may take a zero value (or a low detectable value) in a subgroup of individuals. In this paper, we investigate regression analysis when the observed surrogate may be zero for some individuals in the study cohort. A naive regression calibration that does not account for the probability mass of the surrogate at 0 (or at a low detectable value) will be biased. We develop a regression calibration estimator that typically has smaller bias than the naive regression calibration estimator. We also propose an expected estimating equation estimator that is consistent under the zero-inflated surrogate regression model. Extensive simulations show that the proposed estimator performs well in terms of bias correction. The methods are applied to a physical activity intervention study.

Keywords: 62E20; 62F10; 62J12; measurement error; surrogate; zero-inflated data.
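
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the estimator developed in the paper. It rests on several simplifying assumptions that do not come from the source: a linear outcome model, a normally distributed true exposure, a surrogate recorded as 0 with a fixed probability independent of the exposure, a known error variance for the naive correction, and validation data in which the true exposure is observed (here, for simplicity, the whole simulated sample). All parameter values and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
beta0, beta1 = 1.0, 2.0                  # illustrative true coefficients
mu_x, sd_x, sd_u, p_zero = 3.0, 1.0, 0.5, 0.3   # illustrative settings

x = rng.normal(mu_x, sd_x, n)            # true exposure (unobserved in practice)
nonzero = rng.random(n) >= p_zero        # False -> surrogate recorded as 0
w = np.where(nonzero, x + rng.normal(0.0, sd_u, n), 0.0)  # zero-inflated surrogate
y = beta0 + beta1 * x + rng.normal(0.0, 1.0, n)           # linear outcome model


def slope(pred, resp):
    """Least-squares slope of resp on a single predictor."""
    return np.cov(pred, resp)[0, 1] / np.var(pred, ddof=1)


# 1) No correction: regress the outcome on the surrogate directly.
b_uncorrected = slope(w, y)

# 2) Naive calibration: pretend W = X + U (classical error, known sd_u)
#    and ignore the point mass at 0 when computing the reliability ratio.
var_w = np.var(w, ddof=1)
reliability = (var_w - sd_u**2) / var_w
b_naive_rc = b_uncorrected / reliability

# 3) Two-part calibration: model E[X | W] separately for W == 0 and W != 0.
#    Assumption: X is available to fit the calibration model.
is_zero = w == 0.0
x_hat = np.empty(n)
x_hat[is_zero] = x[is_zero].mean()                     # E[X | W = 0]
c1, c0 = np.polyfit(w[~is_zero], x[~is_zero], 1)       # slope, intercept
x_hat[~is_zero] = c0 + c1 * w[~is_zero]                # E[X | W], W != 0
b_two_part = slope(x_hat, y)

print("true slope:           ", beta1)
print("uncorrected:          ", b_uncorrected)
print("naive calibration:    ", b_naive_rc)
print("two-part calibration: ", b_two_part)
```

Under these assumptions, the first two slopes should fall well below the true value, because the zeros inflate the variance of the surrogate without carrying information about the exposure. The two-part calibration should land close to the true slope. A non-linear outcome model, or a zero probability that depends on the exposure, would call for the more careful treatment the paper describes.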