Accurate detection of mosaic variants in sequencing data without matched controls

Nat Biotechnol. 2020 Mar;38(3):314-319. doi: 10.1038/s41587-019-0368-8. Epub 2020 Jan 6.

Abstract

Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80-90% of the mosaic single-nucleotide variants and 60-80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.
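
The read-based phasing idea mentioned above can be pictured with a small sketch. It is not MosaicForecast's implementation. It assumes the commonly described intuition that reads spanning both a nearby heterozygous germline SNP and a candidate site show two haplotypes when the candidate is germline, and three when it is mosaic, because the mutation sits on only some copies of one parental haplotype. The function names, the "R"/"A" allele encoding, and the min_reads threshold are all hypothetical choices made for illustration.

```python
from collections import Counter

def count_haplotypes(reads, min_reads=2):
    """Count haplotypes supported by at least `min_reads` reads.

    `reads` is an iterable of (germline_allele, candidate_allele) tuples,
    one per read that spans both sites; "R" = reference, "A" = alternate.
    The min_reads cutoff is an illustrative guard against sequencing errors.
    """
    counts = Counter(reads)
    return {hap: n for hap, n in counts.items() if n >= min_reads}

def classify_by_phasing(reads, min_reads=2):
    """Toy classification based only on the number of supported haplotypes."""
    n_haplotypes = len(count_haplotypes(reads, min_reads))
    if n_haplotypes == 3:
        return "mosaic-like"        # mutation on a subset of one haplotype
    if n_haplotypes == 2:
        return "germline-like"      # candidate fully linked to the germline SNP
    if n_haplotypes >= 4:
        return "artifact-like"      # more haplotypes than a diploid site should show
    return "uninformative"          # too few supported haplotypes to decide

# Hypothetical read observations: (germline SNP allele, candidate allele)
mosaic_reads = [("R", "R")] * 10 + [("A", "R")] * 7 + [("A", "A")] * 3
germline_reads = [("R", "R")] * 10 + [("A", "A")] * 10

print(classify_by_phasing(mosaic_reads))
print(classify_by_phasing(germline_reads))
```

In this toy input the first candidate appears on only some of the reads carrying the germline alternate allele, so three haplotypes are supported; the second is perfectly linked to the germline SNP, so only two are. A real classifier, as the abstract notes, would combine phasing with other read-level features rather than rely on a haplotype count alone.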

Publication types

  • Letter
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brain Chemistry
  • Germ-Line Mutation
  • Humans
  • INDEL Mutation*
  • Machine Learning
  • Mosaicism
  • Polymorphism, Single Nucleotide*
  • Single-Cell Analysis / methods*
  • Software
  • Whole Genome Sequencing / methods*