A mixture model approach in gene-gene and gene-environmental interactions for binary phenotypes

J Biopharm Stat. 2008;18(6):1150-77. doi: 10.1080/10543400802369038.

Abstract

In translational research, a genetic association study of a binary outcome has a twofold aim: to test whether genetic/environmental variables or their combinations are associated with a clinical phenotype, and to determine how those combinations are grouped to predict the phenotype (i.e., which combinations have a similarly distributed phenotype, and which have differently distributed phenotypes). The second part of this aim has high clinical appeal, because it can directly facilitate clinical decisions. Although traditional logistic regression can detect gene-gene or gene-environmental interaction effects on binary phenotypes, it cannot decisively determine how genotype combinations are grouped to predict the phenotype. Our proposed mixture model approach is valuable in this context. It concurrently detects main and interaction effects of genetic and environmental variables through a likelihood ratio test (LRT) and conducts phenotype cluster analysis based on genetic and environmental variable combinations. The theoretical distribution of the proposed mixture model's likelihood ratio test is robust not only to small sample sizes but also to unequal sample sizes across genotype and environmental subgroups. Because hypothesis testing is carried out through a likelihood ratio test, p-values can be calculated with a fast algorithm. Extensive simulation studies demonstrate that the mixture model, the overall test in logistic regression, and Monte Carlo-based logic regression consistently have the highest power to detect multi-way gene/environmental combinations. The mixture model approach also has the highest probability of recovering the true partition in the simulation studies. Its applications are exemplified in interim data analyses for two cancer studies.
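To make the idea of testing a grouping of genotype/environment combinations concrete, the following is a minimal sketch of a generic binomial likelihood ratio test. It is not the mixture model proposed in the paper. It assumes a fixed, user-supplied split of the combinations into two groups and uses a chi-square reference distribution with one degree of freedom, which is an assumption of this sketch and not the theoretical distribution derived by the authors. The function names (`binom_loglik`, `lrt_two_groups`), the combination labels, and the counts are all hypothetical.

```python
import math


def binom_loglik(responders, total):
    """Maximized binomial log-likelihood (binomial coefficient omitted)."""
    if total == 0:
        return 0.0
    p = responders / total
    loglik = 0.0
    if responders > 0:
        loglik += responders * math.log(p)
    if responders < total:
        loglik += (total - responders) * math.log(1.0 - p)
    return loglik


def lrt_two_groups(counts, partition):
    """Likelihood ratio test of one shared response rate against two group rates.

    counts:    dict mapping combination label -> (responders, total)
    partition: dict mapping combination label -> 0 or 1 (hypothetical grouping)
    Returns (statistic, p_value). The p-value uses a chi-square reference
    with 1 degree of freedom, which is an assumption of this sketch.
    """
    pooled_responders = sum(r for r, _ in counts.values())
    pooled_total = sum(n for _, n in counts.values())
    loglik_null = binom_loglik(pooled_responders, pooled_total)

    loglik_alt = 0.0
    for group in (0, 1):
        r = sum(counts[c][0] for c in counts if partition[c] == group)
        n = sum(counts[c][1] for c in counts if partition[c] == group)
        loglik_alt += binom_loglik(r, n)

    # Guard against tiny negative values caused by floating-point rounding.
    statistic = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # Chi-square(1) upper tail: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical toy counts: (responders, total) per genotype/exposure combination.
counts = {
    "AA/exposed": (8, 10),
    "AA/unexposed": (7, 10),
    "Aa/exposed": (2, 10),
    "Aa/unexposed": (3, 10),
}

# Grouping by genotype separates high and low response rates.
by_genotype = {c: 1 if c.startswith("AA") else 0 for c in counts}
stat_genotype, p_genotype = lrt_two_groups(counts, by_genotype)

# Grouping by exposure yields identical pooled rates in both groups.
by_exposure = {c: 1 if c.endswith("/exposed") else 0 for c in counts}
stat_exposure, p_exposure = lrt_two_groups(counts, by_exposure)

print(f"genotype split: LRT = {stat_genotype:.3f}, p = {p_genotype:.4f}")
print(f"exposure split: LRT = {stat_exposure:.3f}, p = {p_exposure:.4f}")
```

In this toy example the genotype split groups combinations with similar response rates, so it produces a large statistic, while the exposure split produces a statistic of essentially zero. The approach described in the abstract goes further by identifying the grouping itself as part of the analysis, which this sketch does not attempt.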

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Antineoplastic Agents / pharmacokinetics
  • Antineoplastic Agents / therapeutic use
  • Breast Neoplasms / drug therapy
  • Breast Neoplasms / genetics
  • Breast Neoplasms / metabolism
  • Cluster Analysis
  • Computer Simulation
  • Docetaxel
  • Environment*
  • Female
  • Gene Expression Regulation, Neoplastic
  • Genotype*
  • Humans
  • Likelihood Functions
  • Logistic Models
  • Male
  • Models, Genetic*
  • Models, Statistical*
  • Monte Carlo Method
  • Pharmacogenetics / statistics & numerical data*
  • Phenotype*
  • Prostatic Neoplasms / drug therapy
  • Prostatic Neoplasms / genetics
  • Prostatic Neoplasms / metabolism
  • Reproducibility of Results
  • Tamoxifen / adverse effects
  • Tamoxifen / pharmacokinetics
  • Taxoids / pharmacokinetics
  • Taxoids / therapeutic use
  • Treatment Outcome

Substances

  • Antineoplastic Agents
  • Taxoids
  • Tamoxifen
  • Docetaxel