HIV Haplotype Inference Using a Propagating Dirichlet Process Mixture Model

IEEE/ACM Trans Comput Biol Bioinform. 2014 Jan-Feb;11(1):182-91. doi: 10.1109/TCBB.2013.145.

Abstract

This paper presents a new computational technique for identifying HIV haplotypes. HIV tends to generate many potentially drug-resistant mutants within an infected patient, and identifying these mutants is important for efficient drug administration. To identify the mutants, we analyze short deep-sequencing data called reads. From a statistical perspective, the analysis of such data can be regarded as a nonstandard clustering problem because pairwise similarity measures are missing for non-overlapping reads. To overcome this problem, we propagate a Dirichlet Process Mixture Model by sequentially updating the prior information from successive local analyses. The model is verified using both simulated and real sequencing data.
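The clustering difficulty described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it replaces posterior sampling with a greedy, deterministic assignment, and every name and parameter in it (`overlap_mismatches`, `cluster_reads`, `alpha`, `error_rate`, the uniform 0.25 base probability) is a hypothetical choice made for illustration. It shows two things: reads that share no positions have no defined similarity, and cluster counts and consensus bases learned from earlier positions can serve as prior information for reads further along the genome.

```python
import math


def overlap_mismatches(start_a, seq_a, start_b, seq_b):
    """Return (mismatches, overlap_length), or None if the reads share no positions."""
    lo = max(start_a, start_b)
    hi = min(start_a + len(seq_a), start_b + len(seq_b))
    if hi <= lo:
        return None  # non-overlapping reads: pairwise similarity is undefined
    mism = sum(seq_a[i - start_a] != seq_b[i - start_b] for i in range(lo, hi))
    return mism, hi - lo


def cluster_reads(reads, alpha=1.0, error_rate=0.01):
    """Greedy stand-in for a Dirichlet-process-style clustering of reads.

    reads: list of (start, sequence) tuples.
    Reads are visited in order of start position. Each cluster stores its
    read count and the bases seen so far; both are carried forward and act
    as the prior for later reads (an assumed, simplified form of propagation).
    """
    clusters = []  # each cluster: {"bases": {position: base}, "count": int}
    labels = [None] * len(reads)
    for idx in sorted(range(len(reads)), key=lambda i: reads[i]):
        start, seq = reads[idx]
        # Score for opening a new cluster: concentration alpha, uniform bases.
        best, best_score = None, math.log(alpha) + len(seq) * math.log(0.25)
        for k, c in enumerate(clusters):
            score = math.log(c["count"])  # larger clusters are favoured a priori
            for i, base in enumerate(seq):
                known = c["bases"].get(start + i)
                if known is None:
                    score += math.log(0.25)  # position not yet covered
                elif known == base:
                    score += math.log(1 - error_rate)
                else:
                    score += math.log(error_rate / 3)
            if score > best_score:
                best, best_score = k, score
        if best is None:
            clusters.append({"bases": {}, "count": 0})
            best = len(clusters) - 1
        chosen = clusters[best]
        chosen["count"] += 1
        for i, base in enumerate(seq):
            chosen["bases"].setdefault(start + i, base)  # extend the consensus
        labels[idx] = best
    return labels, clusters


# Toy data: two made-up haplotypes that differ at positions 1, 5 and 9.
hap_a, hap_b = "ACGTACGTACGT", "AGGTAGGTAGGT"
reads = [(s, hap[s:s + 6]) for hap in (hap_a, hap_b) for s in (0, 4, 6)]
labels, clusters = cluster_reads(reads)
```

In this toy input the reads starting at positions 0 and 6 do not overlap, so `overlap_mismatches` returns `None` for them. They can only be linked through the read starting at position 4, which is the role the propagated cluster information plays in the sketch.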

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • HIV Infections / virology*
  • HIV-1 / genetics*
  • Haplotypes / genetics*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Sequence Analysis, DNA / methods