Combining models of protein translation and population genetics to predict protein production rates from codon usage patterns

Mol Biol Evol. 2007 Nov;24(11):2362-72. doi: 10.1093/molbev/msm169. Epub 2007 Aug 16.

Abstract

Genes are often biased in their codon usage. The degree of bias displayed often changes with expression level and intragenic position. Numerous indices, such as the codon adaptation index, have been developed to measure this bias. Although the expression level of a gene and index values are correlated, the heuristic nature of these metrics limits their ability to explain this relationship. As an alternative approach, this study integrates mechanistic models of cellular and population processes in a nested manner to develop a stochastic evolutionary model of a protein's production rate (SEMPPR). SEMPPR assumes that the evolution of codon bias is driven by selection to reduce the cost of nonsense errors and that this selection is counteracted by mutation and drift. Through the application of Bayes' theorem, SEMPPR generates a posterior probability distribution for the protein production rate of a given gene. Conceptually, SEMPPR's predictions are based on the degree of adaptation to reduce the cost of nonsense errors observed in the codon usage pattern of the gene. As an illustration, SEMPPR was parameterized using the Saccharomyces cerevisiae genome and its predictions tested using available empirical data. The results indicate that SEMPPR's predictions are as reliable index based ones. In addition, SEMPPR's output is more easily interpreted and its predictions could be improved through refinements of the models upon which it is built.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Codon / genetics*
  • Codon, Nonsense
  • Genetics, Population / methods*
  • Models, Genetic*
  • Protein Biosynthesis / genetics*
  • Reproducibility of Results

Substances

  • Codon
  • Codon, Nonsense