Localized structural frustration for evaluating the impact of sequence variants

Nucleic Acids Res. 2016 Dec 1;44(21):10062-10073. doi: 10.1093/nar/gkw927. Epub 2016 Oct 18.

Abstract

Population-scale sequencing is increasingly uncovering large numbers of rare single-nucleotide variants (SNVs) in coding regions of the genome. The rarity of these variants makes it challenging to evaluate their deleteriousness with conventional phenotype-genotype associations. Protein structures provide a way of addressing this challenge. Previous efforts have focused on globally quantifying the impact of SNVs on protein stability. However, local perturbations may severely impact protein functionality without strongly disrupting global stability (e.g. in relation to catalysis or allostery). Here, we describe a workflow in which localized frustration, which quantifies unfavorable local interactions, is employed as a metric to investigate such effects. Applying this workflow to the Protein Data Bank, we find that frustration produces many immediately intuitive results: for instance, disease-related SNVs create stronger changes in localized frustration than non-disease-related variants, and rare SNVs tend to disrupt local interactions to a larger extent than common variants. Less obviously, we observe that somatic SNVs associated with oncogenes and tumor suppressor genes (TSGs) induce very different changes in frustration. In particular, those associated with TSGs change the frustration more in the core than on the surface (by introducing loss-of-function events), whereas those associated with oncogenes manifest the opposite pattern, creating gain-of-function events.
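
The comparison described above, in which the change in localized frustration is contrasted between two classes of variants, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a frustration score has already been computed for the affected residue in both the wild-type and the mutant structure, and that the change is taken as mutant minus wild type; the abstract does not state the sign convention. The function names `delta_frustration`, `mean_abs_delta` and `compare_groups` are hypothetical.

```python
from statistics import mean


def delta_frustration(wild_type, mutant):
    """Change in localized frustration for one variant.

    Assumption: the change is defined as mutant minus wild type;
    the abstract does not specify the convention.
    """
    return mutant - wild_type


def mean_abs_delta(pairs):
    """Mean absolute frustration change over (wild_type, mutant) score pairs."""
    if not pairs:
        raise ValueError("at least one (wild_type, mutant) pair is required")
    return mean(abs(delta_frustration(wt, mut)) for wt, mut in pairs)


def compare_groups(group_a, group_b):
    """Difference in mean absolute change between two variant groups.

    A positive value means group_a perturbs localized frustration more
    strongly, on average, than group_b (e.g. disease-related versus
    non-disease-related SNVs, or rare versus common SNVs).
    """
    return mean_abs_delta(group_a) - mean_abs_delta(group_b)
```

The absolute value is used here so that increases and decreases in frustration both count as perturbations. A real analysis would also need a statistical test on the two distributions rather than a simple difference of means.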

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Databases, Protein
  • Evolution, Molecular
  • Genes, Tumor Suppressor
  • Genetic Variation*
  • Humans
  • Neoplasms / genetics
  • Neoplasms / pathology
  • Oncogenes*
  • Polymorphism, Single Nucleotide
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism
  • Workflow

Substances

  • Proteins