An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome

BMC Bioinformatics. 2017 Oct 6;18(1):442. doi: 10.1186/s12859-017-1862-y.

Abstract

Background: Small insertions and deletions (indels) have a significant influence on human disease and, in terms of frequency, are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome.

Results: We present FATHMM-indel, an integrative approach to predicting the functional effect (pathogenic or neutral) of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state-of-the-art models for assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk.
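The abstract does not state which metric underlies the benchmark comparison, but the record's MeSH terms list ROC curves. Assuming a ROC-based evaluation, the following minimal sketch shows how two pathogenic-versus-neutral predictors could be compared by area under the ROC curve (AUC). The function name `roc_auc`, the label encoding, and all scores are hypothetical illustrations, not the authors' code or results.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 = pathogenic, 0 = neutral (assumed encoding).
    scores: higher means "more likely pathogenic".
    A tie between a pathogenic and a neutral score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one neutral example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data purely for illustration (NOT results from the paper).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # hypothetical predictor A
toy_scores_b = [0.6, 0.3, 0.7, 0.5, 0.4, 0.2]  # hypothetical predictor B

auc_a = roc_auc(toy_labels, toy_scores_a)
auc_b = roc_auc(toy_labels, toy_scores_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}")
```

The pairwise formulation is quadratic in the number of variants, which is fine for a small illustration; a rank-based implementation would be preferable on a full benchmark set.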

Conclusions: FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

Keywords: Indels; Non-coding genome; Support vector machines; Variant prioritisation.

MeSH terms

  • Computational Biology / methods*
  • DNA, Intergenic / genetics*
  • Genetics, Population
  • Genome, Human*
  • Humans
  • INDEL Mutation / genetics*
  • Phenotype
  • ROC Curve
  • Reproducibility of Results
  • Software

Substances

  • DNA, Intergenic