TB-Lineage: an online tool for classification and analysis of strains of Mycobacterium tuberculosis complex

Infect Genet Evol. 2012 Jun;12(4):789-97. doi: 10.1016/j.meegid.2012.02.010. Epub 2012 Mar 3.

Abstract

This paper formulates a set of rules to classify genotypes of the Mycobacterium tuberculosis complex (MTBC) into major lineages using spoligotypes and MIRU-VNTR results. The rules synthesize prior literature that characterizes lineages by spacer deletions and by variation in the number of repeats at locus MIRU24 (alias VNTR2687). A tool that efficiently and accurately implements this rule base is freely available at http://tbinsight.cs.rpi.edu/run_tb_lineage.html. When MIRU24 data are not available, the system instead uses predictions made by a Naïve Bayes classifier based on spoligotype data. The website also provides a tool that generates spoligoforests to visualize the genetic diversity and relatedness of genotypes and their associated lineages. A detailed analysis is presented of these tools applied to a dataset collected by the CDC, consisting of 3198 distinct spoligotypes and 5430 distinct MIRU-VNTR types from 37,066 clinical isolates. The tools were also tested on four other independent datasets. Automated classification is >99% accurate using both spoligotypes and MIRU24, and >95% accurate using spoligotypes alone. This online rule-based classification technique, in conjunction with genotype visualization, provides a practical tool that supports surveillance of TB transmission trends and molecular epidemiological studies.
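The fallback described in the abstract (use rules when MIRU24 is available, otherwise a Naïve Bayes prediction from the spoligotype) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the actual rule base, the classifier's features, or how the two are combined in TB-Lineage. The 8-spacer patterns, the lineage labels "A" and "B", the function names, and `toy_rule` are hypothetical; real spoligotypes record presence or absence of 43 spacers.

```python
import math
from collections import defaultdict

def train_bernoulli_nb(samples):
    """Fit a Bernoulli Naive Bayes model.

    samples: list of (spacer_bits, label), where spacer_bits is a string of
    '0'/'1' characters marking absence/presence of each spacer.
    """
    n_spacers = len(samples[0][0])
    present = defaultdict(lambda: [0] * n_spacers)  # per-label spacer-present counts
    totals = defaultdict(int)                       # per-label sample counts
    for bits, label in samples:
        totals[label] += 1
        for i, b in enumerate(bits):
            present[label][i] += b == "1"
    model = {}
    for label, total in totals.items():
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = [(present[label][i] + 1) / (total + 2) for i in range(n_spacers)]
        model[label] = (math.log(total / len(samples)), probs)
    return model

def predict_nb(model, bits):
    """Return the label with the highest log-posterior for this spacer pattern."""
    def score(label):
        log_prior, probs = model[label]
        return log_prior + sum(
            math.log(p if b == "1" else 1.0 - p) for p, b in zip(probs, bits)
        )
    return max(model, key=score)

def classify(bits, miru24, rule, model):
    """Apply the rule when MIRU24 is known; otherwise fall back to Naive Bayes."""
    if miru24 is not None:
        label = rule(bits, miru24)
        if label is not None:
            return label
    return predict_nb(model, bits)

# Hypothetical toy training set (8 spacers, two made-up lineage labels).
TOY_SAMPLES = [
    ("11110000", "A"),
    ("11100000", "A"),
    ("00001111", "B"),
    ("00011111", "B"),
]
model = train_bernoulli_nb(TOY_SAMPLES)

def toy_rule(bits, miru24):
    # Arbitrary placeholder rule, NOT taken from the paper's rule base.
    return "B" if miru24 > 1 else None

with_miru24 = classify("11110001", 2, toy_rule, model)        # rule decides
without_miru24 = classify("11110001", None, toy_rule, model)  # Naive Bayes decides
```

In this sketch the rule takes precedence whenever it returns a label, and the probabilistic classifier is consulted only when MIRU24 is missing or the rule abstains; that ordering is a design choice of the example, not a documented property of the tool.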

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bacterial Typing Techniques
  • Computational Biology / methods
  • DNA, Bacterial
  • Genotype
  • Humans
  • Internet
  • Minisatellite Repeats
  • Mycobacterium tuberculosis / classification*
  • Mycobacterium tuberculosis / genetics*
  • Phylogeny
  • Software*
  • Tuberculosis / epidemiology
  • Tuberculosis / transmission

Substances

  • DNA, Bacterial