When loss-of-function is loss of function: assessing mutational signatures and impact of loss-of-function genetic variants

Bioinformatics. 2017 Jul 15;33(14):i389-i398. doi: 10.1093/bioinformatics/btx272.

Abstract

Motivation: Loss-of-function genetic variants are frequently associated with severe clinical phenotypes, yet many are present in the genomes of healthy individuals. The available methods to assess the impact of these variants rely primarily upon evolutionary conservation with little to no consideration of the structural and functional implications for the protein. They further do not provide information to the user regarding specific molecular alterations potentially causative of disease.

Results: To address this, we investigate protein features underlying loss-of-function genetic variation and develop a machine learning method, MutPred-LOF, for the discrimination of pathogenic and tolerated variants that can also generate hypotheses on specific molecular events disrupted by the variant. We investigate a large set of human variants derived from the Human Gene Mutation Database, ClinVar and the Exome Aggregation Consortium. Our prediction method shows an area under the Receiver Operating Characteristic curve of 0.85 for all loss-of-function variants and 0.75 for proteins in which both pathogenic and neutral variants have been observed. We applied MutPred-LOF to a set of 1142 de novo variants from neurodevelopmental disorders and found enrichment of pathogenic variants in affected individuals. Overall, our results highlight the potential of computational tools to elucidate causal mechanisms underlying loss of protein function in loss-of-function variants.
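
For readers unfamiliar with the evaluation metric reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from predictor scores. It is not the MutPred-LOF implementation or its evaluation pipeline. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen purely for illustration (1 = pathogenic, 0 = tolerated). The sketch uses the pairwise-ranking interpretation of the AUC: the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen tolerated one.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the paper:
# 1 = pathogenic variant, 0 = tolerated variant.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported values of 0.85 and 0.75. The quadratic pairwise loop is chosen for clarity; a rank-based computation would be preferable for large variant sets.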

Availability and implementation: http://mutpred.mutdb.org.

Contact: predrag@indiana.edu.

MeSH terms

  • Computational Biology / methods
  • Humans
  • Loss of Function Mutation*
  • Machine Learning*
  • Protein Conformation
  • Proteins / genetics*
  • Proteins / metabolism
  • Proteins / physiology
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteins