Simple sequence repeats in proteins and their significance for network evolution

Gene. 2005 Jan 17;345(1):113-8. doi: 10.1016/j.gene.2004.11.023. Epub 2004 Dec 15.

Abstract

Only 5-6% of mammalian genomes are genes; the remainder consists primarily of transposable elements and different types of simple sequence repeats (SSRs) (micro- and minisatellites and cryptic repeats), which tend to accumulate in organisms with larger genomes. SSRs are also found at the level of protein sequences and may or may not be encoded by SSRs at the DNA sequence level. Studies of proteins containing SSRs indicate that they tend to belong to specific functional classes, notably transcription factors and protein kinases. Protein SSRs encoded by pure codon repeats evolve rapidly, while those encoded by mixtures of codons evolve slowly. We outline a conceptual model of how protein SSRs may arise and become fixed in proteins during evolution, and suggest that the emergence of protein SSRs and changes in their length may affect the topology of protein interaction networks.
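
The distinction between protein SSRs encoded by pure codon repeats and those encoded by mixtures of codons can be illustrated with a minimal Python sketch. It is not the authors' method: it considers only single-residue (homopolymeric) runs, and the function names, the minimum run length of 5, and the "codon purity" score are assumptions introduced here for illustration.

```python
import re


def find_protein_ssrs(protein, min_len=5):
    """Return (residue, start, length) for each run of one repeated
    amino acid that is at least min_len long (min_len is an assumed cutoff)."""
    runs = []
    for match in re.finditer(r"(.)\1+", protein):
        length = match.end() - match.start()
        if length >= min_len:
            runs.append((match.group(1), match.start(), length))
    return runs


def codon_purity(cds, start, length):
    """Fraction of codons in the run that equal the run's most common codon.
    1.0 means a pure codon repeat; lower values mean a mixture of codons.
    Assumes cds is in frame and position 0 of the protein maps to cds[0:3]."""
    codons = [cds[3 * i:3 * i + 3] for i in range(start, start + length)]
    most_common = max(set(codons), key=codons.count)
    return codons.count(most_common) / len(codons)


# Toy sequences (hypothetical, not taken from the paper): a six-residue
# glutamine run encoded either by CAG only or by alternating CAG/CAA.
protein = "MQQQQQQA"
pure_cds = "ATG" + "CAG" * 6 + "GCT"
mixed_cds = "ATG" + "CAGCAA" * 3 + "GCT"

for residue, start, length in find_protein_ssrs(protein):
    print(residue, length, codon_purity(pure_cds, start, length))   # Q 6 1.0
    print(residue, length, codon_purity(mixed_cds, start, length))  # Q 6 0.5
```

Both coding sequences give the same protein-level repeat, but only the first is also an SSR at the DNA level, which is the case the abstract describes as evolving rapidly.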

Publication types

  • Review

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Codon / genetics
  • Evolution, Molecular*
  • Humans
  • Microsatellite Repeats / genetics*
  • Models, Genetic
  • Molecular Sequence Data
  • Proteins / genetics*
  • Repetitive Sequences, Amino Acid / genetics*
  • Trinucleotide Repeats / genetics

Substances

  • Codon
  • Proteins