The sandwich variance estimator of generalized estimating equations (GEE) may not perform well when the number of independent clusters is small. This can jeopardize the validity of the robust Wald test, inflating the type I error rate and pushing the coverage probability of the corresponding confidence interval below the nominal level. Here, we investigate the small-sample performance of the robust score test for correlated data and propose several modifications to improve it. In a simulation study, we compare the robust score test with the robust Wald test for correlated Bernoulli and correlated Poisson data. The simulations confirm that, for small samples, the robust Wald test is too liberal whereas the robust score test is too conservative. To explain this puzzling difference in the operating characteristics of the two tests, we consider their application to two special cases, one-sample and two-sample comparisons, which in turn motivates some modifications to the robust score test. One modification, a simple adjustment of the usual robust score statistic by a factor of J/(J - 1) (where J is the number of clusters), reduces the conservativeness of the robust (generalized) score test. Simulation studies mimicking group-randomized clinical trials with binary and count responses indicate that this modification can improve small-sample performance relative to the generalized score and Wald tests, giving test sizes closer to the nominal level. Finally, we demonstrate the utility of our proposal by applying it to a group-randomized clinical trial, Trying Alternative Cafeteria Options in Schools (TACOS).
Copyright (c) 2005 John Wiley & Sons, Ltd.
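As a rough illustration of the J/(J - 1) adjustment described above, the following Python sketch computes a robust score statistic and its adjusted version in a deliberately simplified one-sample setting. It assumes an intercept-only mean model with an identity link and an independence working correlation, so that each cluster's score contribution is the sum of its residuals under the null mean. The function names, the toy data, and the reduction to this special case are assumptions made for illustration and are not taken from the paper.

```python
import math


def robust_score_one_sample(clusters, mu0):
    """Return (score, adjusted) for H0: mean = mu0.

    Simplifying assumptions (illustrative, not the paper's general method):
    intercept-only model, identity link, independence working correlation.
    `clusters` is a list of lists, one inner list of responses per cluster.
    """
    n_clusters = len(clusters)  # J in the abstract's notation
    if n_clusters < 2:
        raise ValueError("need at least two clusters for the J/(J - 1) factor")

    # Cluster-level score contributions evaluated under the null mean.
    u = [sum(y - mu0 for y in cluster) for cluster in clusters]

    total = sum(u)
    # Empirical (sandwich-type) variance of the total score under H0.
    meat = sum(x * x for x in u)
    if meat == 0:
        raise ValueError("all cluster scores are zero; statistic undefined")

    score = total ** 2 / meat
    adjusted = score * n_clusters / (n_clusters - 1)
    return score, adjusted


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Toy binary data for four clusters (hypothetical values).
toy_clusters = [[1, 0, 1], [1, 1, 1], [0, 1, 1], [1, 1, 0]]
score, adjusted = robust_score_one_sample(toy_clusters, mu0=0.5)
print(score, chi2_1df_pvalue(score))
print(adjusted, chi2_1df_pvalue(adjusted))
```

In this simplified setting the unadjusted statistic can never exceed J, by the Cauchy-Schwarz inequality, which gives some intuition for why a score-type test may be conservative when J is small; multiplying by J/(J - 1) enlarges the statistic slightly and so makes rejection somewhat easier.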