A marginal mixture model for selecting differentially expressed genes across two types of tissue samples

Int J Biostat. 2008 Oct 9;4(1):Article 20. doi: 10.2202/1557-4679.1093.

Abstract

Bayesian hierarchical models that characterize the distributions of (transformed) gene profiles have proven very useful and flexible for selecting genes that are differentially expressed across different types of tissue samples (e.g. Lo and Gottardo, 2007). However, these models assume that the marginal mean and variance are the same for different gene clusters and for different tissue types. Moreover, it is not easy to determine which of the many competing Bayesian hierarchical models provides the best fit for a specific microarray data set. To address these two issues, we propose a marginal mixture model that directly models the marginal distribution of transformed gene profiles. Specifically, we approximate the marginal distribution of transformed gene profiles by a three-component mixture of multivariate Normal distributions. Each component has a marginal mean vector and covariance matrix with the same structure as in the Bayesian hierarchical models, but the parameter values can differ between components. Based on the proposed model, we derive a method to select genes that are differentially expressed across two types of tissue samples. The derived gene selection method performs well on a real microarray data set. On simulated microarray data sets generated from three different mixture models, it consistently outperforms several other gene selection methods, as measured by class agreement indices.
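As an illustration of how a three-component mixture of multivariate Normal distributions can be used to assign genes to clusters, the sketch below computes posterior membership probabilities for one gene profile and picks the most probable component. It is not the method derived in the article. The component labels, the parameter values, the diagonal covariance matrices, and the function names (`log_diag_normal`, `posterior_probs`, `classify`) are all assumptions made for this example; the article specifies its own mean and covariance structures and estimates the parameters from data.

```python
import math

def log_diag_normal(x, mean, var):
    """Log-density of a multivariate Normal with DIAGONAL covariance.

    Diagonal covariance is a simplifying assumption of this sketch,
    not the covariance structure used in the article.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def posterior_probs(x, weights, means, variances):
    """Posterior probability that profile x belongs to each component."""
    log_terms = [
        math.log(w) + log_diag_normal(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - top) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical parameters: each profile holds 2 samples of tissue type 1
# followed by 2 samples of tissue type 2. Labels and values are illustrative.
LABELS = ["higher in type 1", "lower in type 1", "not differentially expressed"]
WEIGHTS = [0.1, 0.1, 0.8]
MEANS = [
    [2.0, 2.0, 0.0, 0.0],
    [-2.0, -2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
VARIANCES = [[1.0] * 4, [1.0] * 4, [1.0] * 4]

def classify(x):
    """Return the most probable component label and all posteriors."""
    probs = posterior_probs(x, WEIGHTS, MEANS, VARIANCES)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

label, probs = classify([2.1, 1.9, 0.1, -0.1])
print(label, [round(p, 3) for p in probs])
```

In practice the mixture weights, means, and covariances would be estimated from the full data set (for example with an EM-type algorithm) rather than fixed by hand as they are here.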

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Biostatistics / methods*
  • Humans
  • Leukemia, Myeloid, Acute / genetics
  • Models, Statistical*
  • Multigene Family
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / genetics
  • Transcriptome / statistics & numerical data*