Sebnif: an integrated bioinformatics pipeline for the identification of novel large intergenic noncoding RNAs (lincRNAs) - application in human skeletal muscle cells

PLoS One. 2014 Jan 6;9(1):e84500. doi: 10.1371/journal.pone.0084500. eCollection 2014.

Abstract

Ab initio assembly of transcriptome sequencing data has been widely used to identify large intergenic non-coding RNAs (lincRNAs), a novel class of gene regulators involved in many biological processes. To differentiate real lincRNA transcripts from thousands of assembly artifacts, a series of filtering steps, such as filters on transcript length, expression level and coding potential, needs to be applied. However, an easy-to-use, publicly available bioinformatics pipeline that integrates these filters is not yet available. Hence, we implemented sebnif, an integrative bioinformatics pipeline to facilitate the discovery of bona fide novel lincRNAs that are suitable for further functional characterization. Specifically, sebnif is the only pipeline that implements an algorithm for identifying high-quality single-exonic lincRNAs, which have often been omitted in many studies. To demonstrate the usage of sebnif, we applied it to a real biological RNA-seq dataset from Human Skeletal Muscle Cells (HSkMC) and built a novel lincRNA catalog containing 917 highly reliable lincRNAs. Sebnif is available at http://sunlab.lihs.cuhk.edu.hk/sebnif/.
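
The filtering idea described in the abstract (transcript length, expression level, coding potential) can be illustrated with a minimal Python sketch. This is not sebnif's code or algorithm. The `Transcript` record, the function names, the threshold values, and the choice to require higher expression from single-exon transcripts are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transcript:
    """Hypothetical record for one assembled transcript (not sebnif's data model)."""
    transcript_id: str
    length: int          # transcript length in nucleotides
    exon_count: int      # number of exons in the assembled model
    fpkm: float          # expression estimate
    coding_score: float  # hypothetical coding-potential score; higher = more coding-like


# Illustrative thresholds only; the abstract does not state sebnif's actual values.
MIN_LENGTH = 200          # assumed minimum length for a long noncoding RNA
MIN_FPKM_MULTI = 1.0      # assumed expression cutoff for multi-exon transcripts
MIN_FPKM_SINGLE = 5.0     # assumed stricter cutoff for single-exon transcripts
MAX_CODING_SCORE = 0.0    # assumed: scores above this are treated as coding


def passes_filters(t: Transcript) -> bool:
    """Return True if the transcript survives all three illustrative filters."""
    if t.length < MIN_LENGTH:
        return False
    # Single-exon models are more likely to be assembly artifacts, so this
    # sketch asks for stronger expression evidence before keeping them.
    min_fpkm = MIN_FPKM_SINGLE if t.exon_count == 1 else MIN_FPKM_MULTI
    if t.fpkm < min_fpkm:
        return False
    return t.coding_score <= MAX_CODING_SCORE


def filter_candidates(transcripts: List[Transcript]) -> List[Transcript]:
    """Keep only transcripts that pass the length, expression and coding filters."""
    return [t for t in transcripts if passes_filters(t)]
```

In practice each filter would be driven by real annotation and quantification output rather than hand-set fields, and the cutoffs would need to be tuned to the dataset.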

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Genomics
  • Humans
  • Molecular Sequence Annotation
  • Muscle Fibers, Skeletal / metabolism*
  • Promoter Regions, Genetic
  • RNA, Untranslated / genetics*
  • Reproducibility of Results
  • Software*
  • Transcriptional Activation
  • Web Browser

Substances

  • RNA, Untranslated

Grants and funding

The work described in this paper was substantially supported by the General Research Funds (GRF) to HW and HS from the Research Grants Council (RGC) of the Hong Kong Special Administrative Region, China (Project Code: CUHK 476310 to HW and CUHK 473211 to HS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.