Leveraging multiple transcriptome assembly methods for improved gene structure annotation

GigaScience. 2018 Aug 1;7(8):giy093. doi: 10.1093/gigascience/giy093.

Abstract

Background: The performance of RNA sequencing (RNA-seq) aligners and assemblers varies greatly across different organisms and experiments, and often the optimal approach is not known beforehand.

Results: Here, we show that the accuracy of transcript reconstruction can be improved by combining multiple methods, and we present a novel algorithm that integrates multiple RNA-seq assemblies into a coherent transcript annotation. Our algorithm removes redundant models and selects the best transcript models according to user-specified metrics, while resolving common artifacts such as erroneous chimeric transcripts.

Conclusions: We have implemented this method in Mikado, an open-source program written in Python 3 and Cython and available on GitHub.
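The selection step described in the Results can be illustrated with a minimal sketch. This is not Mikado's code or API: the `Transcript` class, the `score` and `pick_best` functions, and the metric names used as weights are all hypothetical, and the locus definition (simple coordinate overlap on the same strand) is an assumption made for brevity. The sketch only covers redundancy removal and metric-based selection, not the splitting of chimeric transcripts.

```python
from dataclasses import dataclass, field


@dataclass
class Transcript:
    """Hypothetical transcript model pooled from several assemblers."""
    tid: str
    chrom: str
    strand: str
    exons: tuple                     # sorted (start, end) pairs, 1-based inclusive
    metrics: dict = field(default_factory=dict)

    @property
    def start(self):
        return self.exons[0][0]

    @property
    def end(self):
        return self.exons[-1][1]


def score(transcript, weights):
    """Weighted sum of user-specified metrics; missing metrics count as 0."""
    return sum(w * transcript.metrics.get(name, 0.0) for name, w in weights.items())


def pick_best(transcripts, weights):
    """Return one representative transcript per locus (illustrative only)."""
    # 1. Remove redundancy: keep the best-scoring copy of each exon structure.
    unique = {}
    for t in transcripts:
        key = (t.chrom, t.strand, t.exons)
        if key not in unique or score(t, weights) > score(unique[key], weights):
            unique[key] = t

    # 2. Group the remaining models into loci by overlap on the same strand.
    ordered = sorted(unique.values(), key=lambda t: (t.chrom, t.strand, t.start, t.end))
    loci = []
    for t in ordered:
        last = loci[-1] if loci else None
        if (last and last[0].chrom == t.chrom and last[0].strand == t.strand
                and t.start <= max(x.end for x in last)):
            last.append(t)
        else:
            loci.append([t])

    # 3. Select the highest-scoring model in each locus.
    return [max(locus, key=lambda t: score(t, weights)) for locus in loci]
```

Calling `pick_best(models, {"cdna_length": 1.0, "exon_num": 50.0})` on a pooled list of models would return one transcript per locus; changing the weights changes which model wins, which is the sense in which the selection is driven by user-specified metrics.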

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Gene Expression Profiling / methods*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Molecular Sequence Annotation / methods*
  • Plants / genetics
  • Sequence Analysis, RNA / methods*
  • Software

Associated data

  • figshare/10.6084/m9.figshare.5551585.v1