Interrogating the "unsequenceable" genomic trinucleotide repeat disorders by long-read sequencing

Genome Med. 2017 Jul 18;9(1):65. doi: 10.1186/s13073-017-0456-7.

Abstract

Microsatellite expansion, such as trinucleotide repeat expansion (TRE), is known to cause a number of genetic diseases. Sanger sequencing and next-generation short-read sequencing cannot interrogate TRE reliably. We developed a novel algorithm called RepeatHMM to estimate repeat counts from long-read sequencing data. Evaluation on simulated data, real amplicon sequencing data for two repeat expansion disorders, and whole-genome sequencing data generated by PacBio and Oxford Nanopore technologies showed superior performance over competing approaches. We concluded that long-read sequencing coupled with RepeatHMM can estimate repeat counts on microsatellites and can interrogate the "unsequenceable" genomic trinucleotide repeat disorders.
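
To make "estimating a repeat count" concrete, the following is a minimal Python sketch of a naive baseline: it reports the longest run of exact, back-to-back copies of a motif in a single read. It is not RepeatHMM and is not taken from the paper; the function name `longest_tandem_run` and the example reads are hypothetical. Because it requires exact matches, a single base error splits a run, which illustrates why error-prone long reads may call for an error-tolerant model rather than plain pattern matching.

```python
import re


def longest_tandem_run(read: str, motif: str) -> int:
    """Return the largest number of consecutive exact copies of `motif` in `read`.

    Hypothetical baseline for illustration only: it tolerates no
    substitutions, insertions, or deletions inside the repeat tract.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    unit = re.escape(motif.upper())
    # The lookahead lets candidate runs start at every position, so runs in
    # different phases (or overlapping runs) are all considered.
    pattern = re.compile(f"(?=((?:{unit})+))")
    best = 0
    for match in pattern.finditer(read.upper()):
        copies = len(match.group(1)) // len(motif)
        best = max(best, copies)
    return best


# A clean read with five CAG units flanked by non-repeat sequence.
clean_read = "ACGT" + "CAG" * 5 + "TTT"
# The same kind of tract with one substitution (CAG -> CAA) in the middle.
noisy_read = "CAG" * 3 + "CAA" + "CAG" * 4

clean_count = longest_tandem_run(clean_read, "CAG")
noisy_count = longest_tandem_run(noisy_read, "CAG")
```

On the clean read the sketch counts all five units, while on the noisy read the single substitution breaks the eight-unit tract and only the longer four-unit fragment is reported.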

Keywords: Long-read sequencing; Microsatellites; Nanopore; PacBio; RepeatHMM; Trinucleotide repeat disorders; Trinucleotide repeats.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Sequence Analysis, DNA / methods*
  • Trinucleotide Repeats*