Confidence intervals for candidate gene effects and environmental factors in population-based association studies of families

Ann Hum Genet. 2007 Jul;71(Pt 4):421-32. doi: 10.1111/j.1469-1809.2007.00350.x. Epub 2007 Mar 7.

Abstract

Complex diseases are influenced by both genetic and environmental factors. Studies of individuals or of families can be used to examine the association of genetic factors, such as candidate genes, and other risk factors with the presence or absence of complex disorders. If families are investigated, whether or not they are randomly ascertained, possible familial correlation among observations must be considered. We have compared two statistical approaches for analyzing correlated binary data from randomly ascertained nuclear families. The generalized estimating equations approach (GEE) can be used to adjust for familial correlation. The relationship between covariates and the response is modelled, and the correlations among family members are treated as nuisance parameters. For comparison, we have proposed two strategies from a hierarchical nonparametric bootstrap approach. One strategy (S1) samples family units, preserving the structure and correlation within each family. A second and novel strategy (S2) also samples family units but then randomly samples offspring with replacement in each family. We applied the methods to data from a study of cardiovascular disease, and followed up with a simulation study in which family data were generated from an underlying multifactorial genetic model. Although the bootstrap approach was more computationally demanding, it outperformed the GEE in terms of confidence interval coverage probabilities for all sample sizes considered.
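The two resampling strategies described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The family layout (a dict with "parents" and "offspring" lists of 0/1 disease indicators), the function names, the prevalence statistic, and the percentile-type interval are all assumptions made for the example. The abstract does not state which bootstrap interval was used, and in the study the statistic would be a fitted regression coefficient rather than a simple proportion.

```python
import random

def resample_families(families, rng):
    """S1 (sketch): draw whole families with replacement; members are untouched."""
    return [rng.choice(families) for _ in families]

def resample_families_and_offspring(families, rng):
    """S2 (sketch): draw families with replacement, then redraw each
    selected family's offspring with replacement; parents are kept as-is."""
    replicate = []
    for fam in resample_families(families, rng):
        kids = fam["offspring"]
        new_kids = [rng.choice(kids) for _ in kids]  # same sibship size
        replicate.append({"parents": fam["parents"], "offspring": new_kids})
    return replicate

def prevalence(families):
    """Placeholder statistic (assumption): proportion of affected individuals."""
    values = [v for fam in families for v in fam["parents"] + fam["offspring"]]
    return sum(values) / len(values)

def percentile_interval(families, statistic, resampler,
                        n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed interval type)."""
    rng = random.Random(seed)
    estimates = sorted(statistic(resampler(families, rng)) for _ in range(n_boot))
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy data, invented for illustration only.
toy_families = [
    {"parents": [0, 1], "offspring": [1, 0, 1]},
    {"parents": [0, 0], "offspring": [0, 0]},
    {"parents": [1, 1], "offspring": [1, 1, 0, 1]},
    {"parents": [0, 1], "offspring": [0]},
]
ci_s1 = percentile_interval(toy_families, prevalence, resample_families)
ci_s2 = percentile_interval(toy_families, prevalence, resample_families_and_offspring)
```

Both strategies treat the family as the sampling unit, so dependence between relatives is carried into every replicate. S2 adds a second level of resampling among siblings, which is what makes the scheme hierarchical.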

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation*
  • Environment*
  • Gene Frequency
  • Genotype
  • Humans
  • Models, Genetic
  • Models, Statistical*
  • Pedigree*
  • Polymorphism, Genetic