MGS-Fast: Metagenomic shotgun data fast annotation using microbial gene catalogs

Gigascience. 2019 Apr 1;8(4):giz020. doi: 10.1093/gigascience/giz020.

Abstract

Background: Current methods used for annotating metagenomics shotgun sequencing (MGS) data rely on a computationally intensive and low-stringency approach of mapping each read to a generic database of proteins or reference microbial genomes.

Results: We developed MGS-Fast, an analysis approach for shotgun whole-genome metagenomic data that uses Bowtie2 DNA-DNA alignment of reads to the integrated catalog of reference genes, a database of well-annotated genes compiled from human microbiome data. This method is rapid and provides high-stringency matches (>90% DNA sequence identity) of the metagenomics reads to genes with annotated functions. We demonstrate the method on synthetic reads, data from a study of liver disease, and Human Microbiome Project shotgun data to detect differentially abundant Kyoto Encyclopedia of Genes and Genomes gene functions in these experiments. This rapid annotation method is freely available as a Galaxy workflow within a Docker image.
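
The high-stringency filtering and function counting described in the Results can be illustrated with a minimal sketch. It is not the published Galaxy workflow. It assumes SAM-formatted alignment records (the format Bowtie2 emits) that carry the standard NM edit-distance tag, approximates identity as (aligned columns - NM) / aligned columns, and uses a hypothetical gene-to-KEGG-ortholog lookup table. The function names, the gene names, and the sample records are illustrative assumptions, not taken from the paper.

```python
import re
from collections import Counter

# CIGAR operations as defined in the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar, edit_distance):
    """Approximate identity from a CIGAR string and the NM edit distance.

    Alignment columns are matches/mismatches (M, =, X), insertions (I)
    and deletions (D). NM counts mismatches plus inserted and deleted
    bases, so (columns - NM) / columns estimates the identical fraction.
    """
    columns = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MID=X")
    if columns == 0:
        return 0.0
    return 100.0 * (columns - edit_distance) / columns


def count_ko_hits(sam_lines, gene_to_ko, min_identity=90.0):
    """Count reads per KEGG ortholog, keeping alignments above min_identity.

    gene_to_ko is a hypothetical lookup (catalog gene ID -> KO ID); a real
    gene catalog would supply its own annotation table.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, gene, cigar = int(fields[1]), fields[2], fields[5]
        if flag & 4 or gene == "*" or cigar == "*":  # unmapped read
            continue
        nm = next((int(f[5:]) for f in fields[11:] if f.startswith("NM:i:")), None)
        if nm is None:  # no edit distance available, cannot score identity
            continue
        if percent_identity(cigar, nm) > min_identity:
            ko = gene_to_ko.get(gene)
            if ko is not None:
                counts[ko] += 1
    return counts


# Illustrative records only: gene names and KO assignments are placeholders.
example_sam = [
    "@SQ\tSN:gene_A\tLN:1000",
    "r1\t0\tgene_A\t1\t42\t100M\t*\t0\t0\t*\t*\tNM:i:2",    # 98% identity, kept
    "r2\t0\tgene_A\t5\t42\t100M\t*\t0\t0\t*\t*\tNM:i:15",   # 85% identity, dropped
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",                      # unmapped, dropped
    "r4\t16\tgene_B\t9\t42\t50M\t*\t0\t0\t*\t*\tNM:i:0",    # 100% identity, kept
]
example_gene_to_ko = {"gene_A": "K00001", "gene_B": "K00002"}
example_counts = count_ko_hits(example_sam, example_gene_to_ko)
print(dict(example_counts))
```

Per-sample counts produced this way could then be compared between groups to look for differentially abundant gene functions. The strict greater-than comparison mirrors the ">90%" threshold stated above, so an alignment at exactly 90% identity is excluded.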

Conclusions: MGS-Fast can confidently transfer functional annotations from gene databases to metagenomic reads, with speed and accuracy.

Keywords: Docker; Galaxy; annotation; cloud computing; metagenomics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cloud Computing
  • Computational Biology / methods*
  • Humans
  • Metagenome
  • Metagenomics / methods*
  • Microbiology
  • Microbiota
  • Molecular Sequence Annotation
  • Reproducibility of Results
  • Software*
  • Workflow