A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing

Nucleic Acids Res. 2017 May 19;45(9):5061-5073. doi: 10.1093/nar/gkx267.

Abstract

Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis, and the lack of high-quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint on accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.

MeSH terms

  • Alternative Splicing*
  • Arabidopsis / genetics*
  • Genes, Plant*
  • Genetic Variation
  • Proteomics
  • RNA, Untranslated
  • Reference Values
  • Reproducibility of Results
  • Sequence Analysis, RNA
  • Transcription, Genetic
  • Transcriptome*

Substances

  • RNA, Untranslated