Evolutionarily conserved substrate substructures for automated annotation of enzyme superfamilies

PLoS Comput Biol. 2008 Aug 1;4(8):e1000142. doi: 10.1371/journal.pcbi.1000142.

Abstract

The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, for function prediction, and for guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.
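
The comparison between conserved and reacting substructures can be illustrated with a minimal Python sketch. The abstract does not state how the overlap was quantified, so the metric here (the fraction of conserved atoms that are also reacting) is an assumption for illustration only, not the authors' method. The names `SubstrateAnnotation`, `reacting_fraction_of_conserved`, and `mean_reacting_fraction` are hypothetical, and the atom indices are toy data rather than results from the paper. The sketch assumes the conserved substructure has already been found, for example by a maximum common substructure search, and mapped onto each substrate's atom indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubstrateAnnotation:
    """Hypothetical record for one substrate of a superfamily."""
    name: str
    conserved_atoms: frozenset  # atom indices in the superfamily-wide conserved substructure
    reacting_atoms: frozenset   # atom indices whose bonding changes in the reaction


def reacting_fraction_of_conserved(ann: SubstrateAnnotation) -> float:
    """Fraction of the conserved substructure that is also reacting (assumed metric)."""
    if not ann.conserved_atoms:
        return 0.0  # no conserved substructure, so nothing to overlap with
    overlap = ann.conserved_atoms & ann.reacting_atoms
    return len(overlap) / len(ann.conserved_atoms)


def mean_reacting_fraction(annotations) -> float:
    """Average the per-substrate fraction over a superfamily's substrates."""
    fractions = [reacting_fraction_of_conserved(a) for a in annotations]
    if not fractions:
        raise ValueError("at least one substrate annotation is required")
    return sum(fractions) / len(fractions)


# Toy data (invented for illustration, not taken from the paper).
toy_superfamily = [
    SubstrateAnnotation("substrate_A", frozenset({0, 1, 2, 3}), frozenset({2, 3, 4})),
    SubstrateAnnotation("substrate_B", frozenset({0, 1}), frozenset({0, 1})),
]

for ann in toy_superfamily:
    print(ann.name, reacting_fraction_of_conserved(ann))
print("superfamily mean:", mean_reacting_fraction(toy_superfamily))
```

Under this assumed metric, a value near 1 would mean the conserved substructure is almost entirely the reacting part, and a value near 0 would mean the conserved part lies mostly outside the reaction center. The variation reported across the 42 superfamilies corresponds to such values being spread out rather than clustering into a few groups.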

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence / genetics
  • Animals
  • Catalysis
  • Cluster Analysis
  • Computational Biology / methods*
  • Conserved Sequence / genetics
  • Databases, Protein
  • Enzymes / chemistry*
  • Enzymes / genetics
  • Enzymes / metabolism
  • Evolution, Molecular*
  • Humans
  • Pattern Recognition, Automated / methods*
  • Protein Binding / genetics
  • Protein Interaction Domains and Motifs / genetics
  • Sequence Alignment
  • Structural Homology, Protein
  • Structure-Activity Relationship
  • Substrate Specificity / genetics*

Substances

  • Enzymes