Leveraging protein dynamics to identify cancer mutational hotspots using 3D structures

Proc Natl Acad Sci U S A. 2019 Sep 17;116(38):18962-18970. doi: 10.1073/pnas.1901156116. Epub 2019 Aug 28.

Abstract

Large-scale exome sequencing of tumors has enabled the identification of cancer drivers using recurrence-based approaches. Some of these methods also employ 3D protein structures to identify mutational hotspots in cancer-associated genes. In determining such mutational clusters in structures, existing approaches overlook protein dynamics, despite its essential role in protein function. We present a framework to identify cancer driver genes using a dynamics-based search of mutational hotspot communities. Mutations are mapped to protein structures, which are partitioned into distinct residue communities. These communities are identified in a framework where residue-residue contact edges are weighted by correlated motions (as inferred by dynamics-based models). We then search for signals of positive selection among these residue communities to identify putative driver genes, and we apply our method to the TCGA (The Cancer Genome Atlas) PanCancer Atlas missense mutation catalog. Overall, we predict one or more mutational hotspots within the resolved structures of proteins encoded by 434 genes. These genes were enriched in biological processes associated with tumor progression. Additionally, a comparison between our approach and existing structure-based cancer hotspot detection methods suggests that including protein dynamics significantly increases the sensitivity of driver detection.
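
The pipeline described in the abstract (contact edges weighted by correlated motions, a partition into residue communities, then a per-community test for excess mutations) can be illustrated with a minimal sketch. This is not the authors' implementation, and every name, threshold, and input below is hypothetical. It assumes the residue-residue correlation matrix is already available from some dynamics model, it substitutes connected components over strongly correlated contacts for a proper community-detection algorithm, and it substitutes a one-sided binomial test against a uniform background for the paper's test of positive selection.

```python
from math import comb, dist

def weighted_contacts(coords, corr, cutoff=8.0):
    """Contact edges (i, j) for residues within `cutoff` angstroms,
    weighted by the absolute motion correlation (assumed precomputed)."""
    edges = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                edges[(i, j)] = abs(corr[i][j])
    return edges

def communities(n, edges, min_weight=0.5):
    """Stand-in partition: connected components after dropping weak edges.
    A real analysis would use a dedicated community-detection method."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (i, j), w in edges.items():
        if w >= min_weight:
            parent[find(i)] = find(j)
    groups = {}
    for r in range(n):
        groups.setdefault(find(r), []).append(r)
    return sorted(groups.values())

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def hotspot_pvalues(groups, mutation_counts):
    """For each community, test whether it holds more mutations than
    expected if mutations fell uniformly across residues (an assumption)."""
    total = sum(mutation_counts)
    n_res = len(mutation_counts)
    results = []
    for g in groups:
        k = sum(mutation_counts[r] for r in g)
        results.append((g, k, binomial_tail(k, total, len(g) / n_res)))
    return results

# Toy input (fabricated for illustration only): six residues on a line,
# 5 angstroms apart, moving as two rigid blocks of three residues each.
coords = [(5.0 * i, 0.0, 0.0) for i in range(6)]
corr = [[1.0 if i == j else (0.9 if (i < 3) == (j < 3) else 0.1)
         for j in range(6)] for i in range(6)]
mutation_counts = [4, 3, 2, 1, 0, 0]  # missense mutations per residue

edges = weighted_contacts(coords, corr)
groups = communities(len(coords), edges)
results = hotspot_pvalues(groups, mutation_counts)
for members, count, pval in results:
    print(members, count, round(pval, 4))
```

In this toy case the weakly correlated contact between residues 2 and 3 is dropped, so the two blocks separate into two communities, and the first block carries most of the mutations. In practice the p-values would also need correction for multiple testing across communities and proteins.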

Keywords: PanCancer; TCGA; cancer driver; hotspot communities; protein dynamics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Exome / genetics
  • Genomics / methods*
  • Humans
  • Mutation
  • Neoplasm Proteins / chemistry*
  • Neoplasm Proteins / genetics*
  • Neoplasms / genetics*
  • Protein Conformation
  • Reproducibility of Results
  • Workflow

Substances

  • Neoplasm Proteins