Comparative and functional characterization of intragenic tandem repeats in 10 Aspergillus genomes

Mol Biol Evol. 2009 Mar;26(3):591-602. doi: 10.1093/molbev/msn277. Epub 2008 Dec 4.

Abstract

Intragenic tandem repeats (ITRs) are consecutive repeats of three or more nucleotides found in coding regions. ITRs are the underlying cause of several human genetic diseases and have been associated with phenotypic variation, including pathogenesis, in several clades of the tree of life. We have examined the evolution and functional role of ITRs in 10 genomes spanning the fungal genus Aspergillus, a clade of relevance to medicine, agriculture, and industry. We identified several hundred ITRs in each of the species examined. ITR content varied extensively between species, with an average of 79% of ITRs unique to a given species. For the fraction of conserved ITR regions, sequence comparisons within species and between close relatives revealed that they were highly variable. ITR-containing proteins were evolutionarily less conserved, compositionally distinct, and overrepresented for domains associated with cell-surface localization and function relative to the rest of the proteome. Furthermore, ITRs were preferentially found in proteins involved in transcription, cellular communication, and cell-type differentiation but were underrepresented in proteins involved in metabolism and energy. Importantly, although ITRs were evolutionarily labile, their functional associations appeared to be remarkably conserved across eukaryotes. Fungal ITRs likely participate in a variety of developmental processes and cell-surface-associated functions, suggesting that their contribution to fungal lifestyle and evolution may be more general than previously assumed.
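
To make the definition of an ITR above concrete, the following is a minimal sketch of how perfect tandem repeats with a unit of three or more nucleotides might be located in a coding sequence. It is not the method used in the study: the function name `find_tandem_repeats`, the unit-length cap of 9, and the minimum of two copies are illustrative assumptions, and the scan ignores reading frame and imperfect repeats.

```python
import re

def find_tandem_repeats(seq, min_unit=3, max_unit=9, min_copies=2):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper for illustration only; the thresholds are
    assumptions, not values taken from the abstract.
    """
    seq = seq.upper()
    # Lazily capture the shortest unit of min_unit..max_unit bases, then
    # require at least (min_copies - 1) further consecutive copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# Toy sequence (made up): four consecutive CAG codons starting at offset 3.
print(find_tandem_repeats("ATTCAGCAGCAGCAGTAA"))
```

Because the scan is frame-agnostic, a reported unit can be a rotation of the in-frame codon when the flanking bases happen to extend the repeat; a real analysis would likely also tolerate mismatches between copies.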

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aspergillus / genetics*
  • Cell Communication
  • Cell Differentiation
  • Conserved Sequence
  • Evolution, Molecular
  • Fungal Proteins / genetics*
  • Fungal Proteins / physiology
  • Genome, Fungal*
  • Metabolism
  • Proteomics
  • Tandem Repeat Sequences*
  • Transcription, Genetic

Substances

  • Fungal Proteins