Using reference databases of genetic variation to evaluate the potential pathogenicity of candidate disease variants

Hum Mutat. 2013 Jun;34(6):836-41. doi: 10.1002/humu.22303. Epub 2013 Mar 26.

Abstract

The potential pathogenicity of genetic variants identified in disease-based resequencing studies is often overlooked when those variants have previously been reported in dbSNP, the 1000 Genomes Project, or the National Heart, Lung and Blood Institute Exome Sequencing Project (ESP). In this work, we estimate that collectively, these databases capture ∼52% of mutations (dbSNP 50.4%; 1000 Genomes 4.8%; and ESP 10.2%) reported as disease causing within phenotype-based locus-specific databases (LSDBs). To investigate whether these mutations may simply represent benign population variants, we evaluated whether the carrier frequencies associated with mutations implicated in amyotrophic lateral sclerosis were higher than what could be accounted for by high-penetrance disease models. In doing so, we have questioned the veracity of 51 mutations, but also demonstrated that each of the three databases included credible disease variants. Our results demonstrate the benefits of using databases such as dbSNP, the 1000 Genomes Project, and the ESP to evaluate the pathogenicity of putative disease variants, and suggest that many disease mutations reported across LSDBs may not actually be pathogenic. However, they also demonstrate that even in the context of rare Mendelian disorders, the potential pathogenicity of variants reported by these databases should not be overlooked without proper evaluation.
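
The carrier-frequency comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not specify. It assumes a simple dominant model in which the population carrier frequency of a variant is bounded by (lifetime disease risk × fraction of cases carrying the variant) / penetrance, and it uses a one-sided binomial tail to ask whether the carriers observed in a reference database exceed that bound. The function names and every numeric value below are hypothetical and chosen only for illustration.

```python
from math import comb


def max_carrier_frequency(lifetime_risk, case_fraction, penetrance):
    """Upper bound on population carrier frequency under a dominant model.

    Bayes' rule: P(carrier) = P(carrier | affected) * P(affected) / P(affected | carrier).
    """
    return lifetime_risk * case_fraction / penetrance


def prob_at_least(k, n, p):
    """Binomial tail P(X >= k) for n sampled individuals with carrier probability p."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


# Hypothetical inputs (assumptions, not values from the paper).
lifetime_risk = 1 / 400     # assumed lifetime risk of the disease
case_fraction = 0.01        # assumed share of cases carrying this variant
penetrance = 0.9            # assumed high-penetrance model
n_individuals = 6500        # assumed number of individuals in the reference database
observed_carriers = 5       # assumed carrier count observed in that database

bound = max_carrier_frequency(lifetime_risk, case_fraction, penetrance)
p_value = prob_at_least(observed_carriers, n_individuals, bound)

print(f"maximum credible carrier frequency: {bound:.2e}")
print(f"P(at least {observed_carriers} carriers | model): {p_value:.2e}")
```

Under these assumed inputs, a very small tail probability would indicate that the variant is more common in the reference database than a high-penetrance model allows, which is the kind of evidence that would call its reported pathogenicity into question. A variant whose observed count is compatible with the bound remains a credible disease variant despite appearing in the database.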

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping
  • Computational Biology* / methods
  • Databases, Genetic*
  • Genetic Association Studies*
  • Genetic Variation*
  • Genomics / methods
  • Humans
  • Models, Genetic
  • Mutation
  • Penetrance