Considerations for subgroups and phenocopies in complex disease genetics

PLoS One. 2013 Aug 20;8(8):e71614. doi: 10.1371/journal.pone.0071614. eCollection 2013.

Abstract

The genetic variants identified so far as associated with complex disease cannot fully explain heritability. This may be partly due to more complicated patterns of predisposition than previously suspected. Diseases such as multiple sclerosis (MS) may involve multiple disease-causing mechanisms, each comprising several elements. We describe how the effect of subgroups can be calculated using the standard association measure, the odds ratio, which is then manipulated to provide a formula for the true underlying association present within the subgroup. This formula is sensitive to the initial minor allele frequencies in both the cases and the subgroup of patients. The methodology is then extended to the χ² statistic for two related scenarios. First, to determine the true χ² when phenocopies or disease subtypes reduce association and are reclassified as controls when calculating statistics. Here, the χ² is given by (1 + σ(a + b)/(c + d))/(1 - σ), or (1 + σ)/(1 - σ) for equal numbers of cases and controls. Second, when subgroups corresponding to heterogeneity mask the true effect size, but no reclassification is made. Here, the proportional increase in total sample size required to attain the same χ² statistic as the subgroup is given as γ = (1 - σ/2)/((1 - σ)(1 - σc/(a + c))(1 - σd/(b + d))), and a Python script to calculate and plot this value is provided at kirc.se. Practical examples show how, in a study of modest size (1000 cases and 1000 controls), a non-significant SNP may exceed genome-wide significance when corresponding to a subgroup of 20% of cases, and may occur in heterozygous form in all cases. This methodology may explain the modest association found in diseases such as MS, in which heterogeneity confounds straightforward measurement of association.
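
The two closed-form expressions quoted in the abstract can be evaluated directly. The following is a minimal sketch in Python, not the authors' script from kirc.se. The abstract does not define σ, a, b, c or d, so the code treats them as plain numeric inputs and transcribes the expressions as written. The function names `chi2_reclassified` and `gamma_sample_size` are hypothetical. The only interpretation assumed is the one the abstract itself implies: setting a + b equal to c + d reduces the first expression to (1 + σ)/(1 - σ).

```python
def chi2_reclassified(sigma, a, b, c, d):
    """Evaluate (1 + sigma*(a + b)/(c + d)) / (1 - sigma) as quoted in the abstract.

    Assumption: sigma, a, b, c, d are used exactly as in the abstract's
    expression; their definitions are given in the full paper, not here.
    """
    if not 0 <= sigma < 1:
        raise ValueError("sigma must lie in [0, 1) for the expression to be defined")
    return (1 + sigma * (a + b) / (c + d)) / (1 - sigma)


def gamma_sample_size(sigma, a, b, c, d):
    """Evaluate gamma = (1 - sigma/2) /
    ((1 - sigma) * (1 - sigma*c/(a + c)) * (1 - sigma*d/(b + d))).
    """
    if not 0 <= sigma < 1:
        raise ValueError("sigma must lie in [0, 1) for the expression to be defined")
    denominator = (
        (1 - sigma)
        * (1 - sigma * c / (a + c))
        * (1 - sigma * d / (b + d))
    )
    return (1 - sigma / 2) / denominator


if __name__ == "__main__":
    # Illustrative inputs only; these are not values taken from the paper.
    sigma = 0.2
    a, b, c, d = 300, 700, 250, 750
    print("first expression:", chi2_reclassified(sigma, a, b, c, d))
    print("gamma:", gamma_sample_size(sigma, a, b, c, d))
```

With σ = 0 both expressions evaluate to 1, which gives a quick sanity check on the transcription.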

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Chi-Square Distribution
  • Disease / genetics*
  • Gene Frequency
  • Genetic Heterogeneity
  • Genetic Predisposition to Disease*
  • Humans
  • Models, Genetic
  • Odds Ratio
  • Phenotype
  • Sample Size

Grants and funding

This work was supported by grants from the Karolinska Institutets stiftelse and the Swedish Research Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.