A statistical framework for haplotype block inference

J Bioinform Comput Biol. 2005 Oct;3(5):1021-38. doi: 10.1142/s021972000500151x.

Abstract

Recent work has suggested that haplotype blocks are transmitted from parents to offspring, which has created interest in inferring block structure and block length. The motivation is that well-characterized haplotype blocks should make it easier to quickly map all the genes associated with human diseases. To study haplotype block inference systematically, we propose a statistical framework. In this framework, optimal haplotype block partitioning is formulated as a statistical model selection problem; missing data can be handled in a standard statistical way; population strata can be incorporated; inference and hypothesis testing on block structure can be performed; and prior knowledge, if present, can be incorporated to perform Bayesian inference. The algorithm runs in time linear in the number of loci, whereas many such approaches are NP-hard. We illustrate applications of our method to both simulated and real data sets.
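
As a rough illustration of what block partitioning as model selection can look like, the Python sketch below scores each candidate block with a BIC-style penalized multinomial likelihood and chooses the boundaries by dynamic programming. It is not the authors' algorithm, which the abstract does not describe in detail. The scoring criterion, the function names `block_cost` and `partition_blocks`, the `max_len` cap on block length, and the toy data are all assumptions made for illustration. With `max_len` held fixed, the number of candidate blocks scored grows linearly with the number of loci.

```python
import math
from collections import Counter


def block_cost(haps, start, end):
    """BIC-style cost of treating loci [start, end) as a single block.

    Assumption: block haplotypes follow a multinomial distribution whose
    categories are the distinct haplotypes observed within the block.
    """
    n = len(haps)
    counts = Counter(h[start:end] for h in haps)
    # Maximized multinomial log-likelihood of the observed block haplotypes.
    loglik = sum(c * math.log(c / n) for c in counts.values())
    # One free frequency per distinct haplotype, minus the sum-to-one constraint.
    n_params = len(counts) - 1
    return -2.0 * loglik + n_params * math.log(n)


def partition_blocks(haps, max_len=10):
    """Return (blocks, total_cost) minimizing the summed block costs.

    Blocks are half-open (start, end) locus intervals. max_len is a
    hypothetical cap on block length that bounds the work done per locus.
    """
    n_loci = len(haps[0])
    best = [0.0] + [math.inf] * n_loci  # best[e]: minimal cost of loci [0, e)
    back = [0] * (n_loci + 1)           # back[e]: start of the last block
    for end in range(1, n_loci + 1):
        for start in range(max(0, end - max_len), end):
            cost = best[start] + block_cost(haps, start, end)
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Walk the back-pointers to recover the block boundaries.
    blocks = []
    end = n_loci
    while end > 0:
        blocks.append((back[end], end))
        end = back[end]
    return blocks[::-1], best[n_loci]


# Toy data (made up): loci 0-1 are perfectly correlated, as are loci 2-3,
# while the two pairs vary independently of each other.
toy_haps = ["0000", "0000", "0011", "0011", "1100", "1100", "1111", "1111"]
toy_blocks, toy_cost = partition_blocks(toy_haps, max_len=4)
print(toy_blocks)  # [(0, 2), (2, 4)]
```

On the toy data the two-block partition wins because merging all four loci needs three frequency parameters without any gain in likelihood, while splitting into single loci pays the likelihood cost twice for perfectly correlated sites. A Bayesian variant in the spirit of the abstract would replace the penalty term with a log prior over partitions; that substitution is likewise an assumption, not something the abstract specifies.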

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Chromosome Mapping / methods*
  • DNA Mutational Analysis / methods*
  • Data Interpretation, Statistical
  • Expressed Sequence Tags
  • Genetic Variation / genetics*
  • Genome, Human
  • Haplotypes / genetics*
  • Humans
  • Likelihood Functions
  • Models, Genetic*
  • Models, Statistical
  • Polymorphism, Single Nucleotide / genetics*
  • Selection, Genetic
  • Sequence Analysis, DNA / methods*