Analysis of DNA sequence variants detected by high-throughput sequencing

Hum Mutat. 2012 Apr;33(4):599-608. doi: 10.1002/humu.22035. Epub 2012 Feb 28.

Abstract

The Undiagnosed Diseases Program at the National Institutes of Health uses high-throughput sequencing (HTS) to diagnose rare and novel diseases. HTS techniques generate large numbers of DNA sequence variants, which must be analyzed and filtered to find candidates for disease causation. Despite the publication of an increasing number of successful exome-based projects, there has been little formal discussion of the analytic steps applied to HTS variant lists. We present the results of our experience with over 30 families for whom HTS was used in an attempt to find clinical diagnoses. For each family, exome sequence data were augmented with high-density SNP-array data. We present a discussion of the theory and practical application of each analytic step and provide example data to illustrate our approach. The article is designed to provide an analytic roadmap for variant analysis, thereby enabling a wide range of researchers and clinical genetics practitioners to perform direct analysis of HTS data for their patients and projects.
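
The abstract does not enumerate the individual filters, so the following is only a minimal sketch of what filtering a family's variant list for disease candidates might look like. Everything in it is an assumption made for illustration, not the authors' method: the `Variant` fields, the 1% population-frequency cutoff, the VCF-style genotype strings, and the simple recessive segregation rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                     # hypothetical gene label
    population_freq: float        # assumed allele frequency in a reference population
    affected_genotypes: tuple     # VCF-style genotypes of affected family members
    unaffected_genotypes: tuple   # VCF-style genotypes of unaffected family members


def is_rare(variant, max_freq=0.01):
    """Keep variants at or below an assumed population-frequency cutoff."""
    return variant.population_freq <= max_freq


def fits_recessive(variant):
    """Assumed model: homozygous-alternate in every affected member,
    and in no unaffected member."""
    return (all(g == "1/1" for g in variant.affected_genotypes)
            and all(g != "1/1" for g in variant.unaffected_genotypes))


def filter_candidates(variants, max_freq=0.01):
    """Apply the frequency filter, then the segregation filter."""
    return [v for v in variants if is_rare(v, max_freq) and fits_recessive(v)]


# Made-up example data for one family (one affected child, two parents).
variants = [
    Variant("GENE_A", 0.20, ("1/1",), ("0/1", "0/1")),     # too common
    Variant("GENE_B", 0.001, ("1/1",), ("0/1", "0/1")),    # rare and segregates
    Variant("GENE_C", 0.0005, ("0/1",), ("0/1", "0/0")),   # rare, does not fit the model
]

candidates = filter_candidates(variants)
print([v.gene for v in candidates])
```

In this toy example only the variant that is both rare and consistent with the assumed inheritance model survives. A real analysis would combine more filters, and the SNP-array data mentioned above could supply further constraints.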

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Exome
  • Family
  • Genetic Diseases, Inborn / diagnosis*
  • Genetic Diseases, Inborn / genetics*
  • Genetic Variation
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Software*