Discovery of complex pathways from observational data

Stat Med. 2010 Aug 30;29(19):1998-2011. doi: 10.1002/sim.3962.

Abstract

Unraveling complex interactions has been a challenge in epidemiologic research. We introduce a pathway modeling framework that discovers plausible pathways from observational data, and allows estimation of both the net effect of the pathway and the types of interactions occurring among genetic or environmental risk factors. Each discovered pathway structure links combinations of observed variables through intermediate latent nodes to a final node, the outcome. Biologic knowledge can be readily applied in this framework as a prior on pathway structure to give preference to more biologically plausible models, thereby providing more precise estimation of Bayes factors for pathways of greatest interest by Markov chain Monte Carlo (MCMC) methods. Data were simulated for binary inputs of which only a subset was involved in different pathway topologies. Our algorithm was then used to recover the pathway from the simulated data. The posterior distributions of inputs, pairwise and higher-order interactions, and topologies were obtained by MCMC methods. The evidence in favor of a particular pathway or interaction was summarized using Bayes factors. Our method can correctly identify the risk factors and interactions involved in the simulated pathway. We apply our framework to an asthma case-control data set with polymorphisms in 12 genes.
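The simulation setup and the Bayes factor summary described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the number of inputs, the AND-type latent node, and the effect sizes are all assumptions chosen for illustration. The Bayes factor is computed with the standard definition, posterior odds divided by prior odds, where the posterior probability would in practice be estimated from the MCMC output (for example, as the fraction of sampled structures that contain a given input or interaction).

```python
import numpy as np


def simulate_pathway_data(n_subjects=1000, n_inputs=6, seed=0):
    """Simulate binary inputs where only a subset drives a binary outcome.

    Hypothetical topology (an assumption, not taken from the paper):
    inputs 0 and 1 combine through an AND-type latent node, input 2 acts
    directly, and the remaining inputs are unrelated to the outcome.
    """
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_subjects, n_inputs))

    # Latent node: active only when both of its parent inputs are present.
    latent = X[:, 0] & X[:, 1]

    # Outcome generated from a logistic model with illustrative effect sizes.
    logit = -1.0 + 2.0 * latent + 1.0 * X[:, 2]
    prob = 1.0 / (1.0 + np.exp(-logit))
    y = rng.binomial(1, prob)
    return X, y


def bayes_factor(posterior_prob, prior_prob):
    """Bayes factor as the ratio of posterior odds to prior odds."""
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


X, y = simulate_pathway_data()
# Example: a feature with prior probability 0.5 that appears in 90% of
# sampled structures has posterior odds 9 times its prior odds.
bf = bayes_factor(posterior_prob=0.9, prior_prob=0.5)
```

Under this definition, a Bayes factor above 1 indicates that the data increased support for the feature relative to the prior, so an informative biologic prior changes the prior odds without being counted as evidence from the data.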

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Asthma / etiology
  • Asthma / genetics
  • Bayes Theorem
  • Computational Biology / methods
  • Computer Simulation
  • Environmental Exposure / adverse effects
  • Environmental Exposure / statistics & numerical data
  • Epidemiologic Research Design
  • Genetic Predisposition to Disease / epidemiology
  • Humans
  • Logistic Models
  • Markov Chains*
  • Models, Statistical
  • Monte Carlo Method*
  • Observer Variation
  • Risk Factors