A finite mixture model for X-chromosome association with an emphasis on microbiome data analysis

Genet Epidemiol. 2019 Jun;43(4):427-439. doi: 10.1002/gepi.22190. Epub 2019 Jan 18.

Abstract

Analysis of the X chromosome has been largely neglected in genetic studies, mainly because of complex underlying biological mechanisms. On the other hand, the study of human microbiome data (typically over-dispersed counts with an excess of zeros) has generated great interest recently because of advancements in next-generation sequencing technologies. We propose a novel approach to infer the association between host genetic variants on the X chromosome and microbiome data. The method accounts for random X-chromosome inactivation (XCI), skewed (or nonrandom) XCI (XCI-S), and escape of XCI (XCI-E). The inference is performed through a finite mixture model (FMM), in which an indicator variable denoting the "true" biological mechanism is treated as missing data. An expectation-maximization algorithm on zero-inflated and two-part models is implemented to estimate genetic effects. We compare the performance of the FMM with strategies that assume a single mechanism (XCI or XCI-E) for all subjects and with alternative approaches. Briefly, an XCI mechanism codes males' genotypes as homozygous females, whereas under XCI-E, males are treated as heterozygous females. Through comprehensive simulations, we evaluate hypothesis tests based on a computationally efficient score statistic. In summary, the FMM yields reduced bias and commensurate power compared to XCI, XCI-E, and alternative strategies while maintaining adequate Type 1 error control. The proposed method has far-reaching applications. In particular, we illustrate its usage on a large-scale human microbiome study, the Genetic, Environmental and Microbial (GEM) project, to test for genetic association on the X chromosome.
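
The two fixed-mechanism coding strategies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `code_genotype`, the additive 0/1/2 coding for females, and the use of risk-allele counts as input are assumptions made for illustration; the abstract states only that males are coded as homozygous females under XCI and as heterozygous females under XCI-E.

```python
def code_genotype(sex, allele_count, mechanism):
    """Return an additive genotype code for a biallelic X-linked variant.

    sex          -- "F" or "M"
    allele_count -- copies of the risk allele (females: 0, 1, 2; males: 0, 1)
    mechanism    -- "XCI" or "XCI-E" (assumed identical for all subjects)

    Assumption: females are coded additively as 0/1/2 under both mechanisms.
    """
    if mechanism not in ("XCI", "XCI-E"):
        raise ValueError("mechanism must be 'XCI' or 'XCI-E'")
    if sex == "F":
        if allele_count not in (0, 1, 2):
            raise ValueError("female allele count must be 0, 1, or 2")
        return allele_count
    if sex == "M":
        if allele_count not in (0, 1):
            raise ValueError("male allele count must be 0 or 1")
        # XCI: a male carrier is coded like a homozygous female (2).
        # XCI-E: a male carrier is coded like a heterozygous female (1).
        return 2 * allele_count if mechanism == "XCI" else allele_count
    raise ValueError("sex must be 'F' or 'M'")


if __name__ == "__main__":
    for mech in ("XCI", "XCI-E"):
        print(mech, [code_genotype("M", a, mech) for a in (0, 1)])
```

The FMM described in the abstract does not fix one of these codings in advance; it treats the mechanism indicator as missing data, which this sketch does not attempt to reproduce.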

Keywords: X-chromosome association; finite mixture models; microbiome data; random/escape/skewed X-chromosome inactivation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Human, X* / genetics
  • Cohort Studies
  • Crohn Disease / epidemiology
  • Crohn Disease / genetics
  • Crohn Disease / microbiology
  • Data Analysis
  • Family
  • Female
  • Finite Element Analysis
  • Gene-Environment Interaction
  • Genetic Association Studies / statistics & numerical data
  • Genotype
  • Heterozygote
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Male
  • Microbiota / genetics*
  • Models, Genetic
  • X Chromosome Inactivation