The Missing Expression Level-Evolutionary Rate Anticorrelation in Viruses Does Not Support Protein Function as a Main Constraint on Sequence Evolution

Genome Biol Evol. 2021 Apr 5;13(4):evab049. doi: 10.1093/gbe/evab049.

Abstract

One of the central goals in molecular evolutionary biology is to determine the sources of variation in the rate of sequence evolution among proteins. Gene expression level is widely accepted as the primary determinant of protein evolutionary rate, because it scales with the extent of selective constraints imposed on a protein, leading to the well-known negative correlation between expression level and protein evolutionary rate (the E-R anticorrelation). Selective constraints have been hypothesized to entail the maintenance of protein function, the avoidance of cytotoxicity caused by protein misfolding or nonspecific protein-protein interactions, or both. However, empirical tests evaluating the relative importance of these hypotheses remain scarce, likely because it is difficult to distinguish the effect of a deleterious mutation on a protein's function from its effect on cytotoxicity. We realized that examining the sequence evolution of viral proteins could overcome this hurdle. This is because purifying selection against mutations in a viral protein that result in cytotoxicity per se is likely relaxed, whereas purifying selection against mutations that impair viral protein function persists. Multiple analyses of SARS-CoV-2 and nine other virus species revealed a complete absence of any E-R anticorrelation. As a control, the E-R anticorrelation does exist in human endogenous retroviruses, where purifying selection against cytotoxicity is present. Taken together, these observations do not support the maintenance of protein function as the main constraint on protein sequence evolution in cellular organisms.
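
For readers unfamiliar with how an E-R anticorrelation is quantified, the following is a minimal sketch, assuming a Spearman rank correlation between per-protein expression level and evolutionary rate. The abstract does not state which statistic the authors used, so the choice of Spearman's rho is an assumption, and the function names and the five data points are hypothetical toy values, not results from the study.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (NOT from the study): one value per protein.
expression_level = [120.0, 45.0, 300.0, 8.0, 60.0]   # e.g. normalized read counts
evolutionary_rate = [0.10, 0.25, 0.05, 0.60, 0.20]   # e.g. dN/dS

rho = spearman_rho(expression_level, evolutionary_rate)
print(f"Spearman rho = {rho:.2f}")
```

In this toy data set the most highly expressed protein evolves most slowly, so rho is strongly negative, which is the pattern the E-R anticorrelation describes. A rho near zero would correspond to the absence of the anticorrelation that the abstract reports for viral proteins.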

Keywords: avoidance of cytotoxicity; gene expression level; maintenance of protein function; protein evolutionary rate; protein homeostasis; the E–R anticorrelation.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Endogenous Retroviruses / genetics*
  • Evolution, Molecular*
  • Humans
  • Middle East Respiratory Syndrome Coronavirus / genetics
  • Mutation
  • SARS-CoV-2 / genetics*
  • Sequence Analysis, RNA
  • Viral Proteins / genetics*

Substances

  • Viral Proteins