A finite mixture distribution model for data collected from twins

Twin Res. 2003 Jun;6(3):235-9. doi: 10.1375/136905203765693898.

Abstract

Most analyses of data collected from a classical twin study of monozygotic (MZ) and dizygotic (DZ) twins assume that zygosity has been diagnosed without error. However, large scale surveys frequently resort to questionnaire-based methods of diagnosis which classify twins as MZ or DZ with less than perfect accuracy. This article describes a mixture distribution approach to the analysis of twin data when zygosity is not perfectly diagnosed. Estimates of diagnostic accuracy are used to weight the likelihood of the data according to the probability that any given pair is either MZ or DZ. The performance of this method is compared to fully accurate diagnosis, and to the analysis of samples that include some misclassified pairs. Conventional analysis of samples containing misclassified pairs yields biased estimates of variance components, such that additive genetic variance (A) is underestimated while common environment (C) and specific environment (E) components are overestimated. The bias is non-trivial; for 10% misclassification, true values of Additive genetic: Common environment: Specific Environment variance components of.6:.2:.2 are estimated as.48:.29:.23, respectively. The mixture distribution yields unbiased estimates, while showing relatively little loss of statistical precision for misclassification rates of 15% or less. The method is shown to perform quite well even when no information on zygosity is available, and may be applied when pair-specific estimates of zygosity probabilities are available.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Humans
  • Models, Statistical*
  • Research Design
  • Twin Studies as Topic*
  • Twins*