PhyloGenie: automated phylome generation and analysis

Nucleic Acids Res. 2004 Sep 30;32(17):5231-8. doi: 10.1093/nar/gkh867. Print 2004.

Abstract

Phylogenetic reconstruction is the method of choice to determine the homologous relationships between sequences. Difficulties in producing high-quality alignments, which are the basis of good trees, and in automating the analysis of trees have unfortunately limited the use of phylogenetic reconstruction methods to individual genes or gene families. Because of the large number of sequences involved, phylogenetic analyses of proteomes preclude manual steps and therefore require a high degree of automation in sequence selection, alignment, phylogenetic inference and analysis of the resulting set of trees. We present a set of programs that automates the steps from seed sequence to phylogeny, and a utility to extract all phylogenies that match specific topological constraints from a database of trees. Two example applications are provided that show the type of questions that can be answered by phylome analysis. The generation and analysis of the Thermoplasma acidophilum phylome, with regard to lateral gene transfer between Thermoplasmata and Sulfolobus, showed best BLAST hits to be far less reliable indicators of lateral transfer than the corresponding protein phylogenies. The generation and analysis of the Danio rerio phylome provided more than twice as many proteins as described previously, supporting the hypothesis of an additional round of genome duplication in the actinopterygian lineage.
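
The idea of extracting phylogenies that match a topological constraint can be illustrated with a minimal Python sketch. It is not the PhyloGenie utility and does not reproduce its query syntax. It assumes that trees are stored as Newick strings, that each leaf label has the hypothetical form "<taxon code>_<sequence id>", and that trees are read as rooted exactly as written. The taxon codes (Ta, Ss, Pf, Mj), the seed names and the function names are all invented for illustration. The constraint tested is whether some clade contains sequences from two given taxon groups and from no other taxon.

```python
import re

DELIMS = {"(", ")", ",", ";"}

def parse_newick(text):
    """Parse a Newick string into nested lists of leaf names.
    Branch lengths and internal node labels are discarded."""
    text = re.sub(r"\s+", "", text)
    tokens = re.findall(r"[(),;]|[^(),;]+", text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            children = [node()]
            while tokens[pos] == ",":
                pos += 1                  # consume ","
                children.append(node())
            pos += 1                      # consume ")"
            # Skip an optional support value / branch length after ")"
            if pos < len(tokens) and tokens[pos] not in DELIMS:
                pos += 1
            return children
        label = tokens[pos]
        pos += 1
        return label.split(":")[0]        # drop the branch length

    return node()

def leaves(node):
    """Return all leaf names below a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]

def taxon_of(leaf):
    # Assumed label convention: "<taxon code>_<sequence id>"
    return leaf.split("_", 1)[0]

def has_exclusive_clade(tree, group_a, group_b):
    """True if some subtree holds leaves from both groups and no other taxon."""
    if isinstance(tree, str):
        return False
    taxa = {taxon_of(leaf) for leaf in leaves(tree)}
    if taxa <= (group_a | group_b) and taxa & group_a and taxa & group_b:
        return True
    return any(has_exclusive_clade(child, group_a, group_b) for child in tree)

# Hypothetical "database" of trees keyed by seed sequence
trees = {
    "seed1": "((Ta_1:0.1,Ss_2:0.2)90:0.05,(Pf_3:0.3,Mj_4:0.1));",
    "seed2": "((Ta_5,Pf_6),(Ss_7,Mj_8));",
}

matches = [
    name for name, newick in trees.items()
    if has_exclusive_clade(parse_newick(newick), {"Ta"}, {"Ss"})
]
print(matches)
```

In this toy data set only seed1 groups the Ta and Ss sequences in a clade of their own. A realistic filter would also need to handle unrooted trees and branch support thresholds, which this sketch ignores.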

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Gene Transfer, Horizontal
  • Genome
  • Molecular Sequence Data
  • Phylogeny*
  • Proteome / classification*
  • Proteome / genetics
  • Sequence Alignment
  • Software*
  • Sulfolobus / classification
  • Sulfolobus / genetics
  • Thermoplasma / classification
  • Thermoplasma / genetics
  • Zebrafish / genetics

Substances

  • Proteome