Comparison of splice sites reveals that long noncoding RNAs are evolutionarily well conserved

RNA. 2015 May;21(5):801-12. doi: 10.1261/rna.046342.114. Epub 2015 Mar 23.

Abstract

Large-scale RNA sequencing has revealed a large number of long mRNA-like transcripts (lncRNAs) that do not code for proteins. The evolutionary history of these lncRNAs has been notoriously hard to study systematically because their low level of sequence conservation precludes comprehensive homology-based surveys and makes them nearly impossible to align. An increasing number of special cases, however, have been shown to be at least as old as the vertebrate lineage. Here we use the conservation of splice sites to trace the evolution of lncRNAs. We show that >85% of the human GENCODE lncRNAs were already present at the divergence of placental mammals, and many hundreds of these RNAs date back even further. Nevertheless, we observe a fast turnover of intron/exon structures. We conclude that lncRNA genes are evolutionarily ancient components of vertebrate genomes that show an unexpected and unprecedented evolutionary plasticity. We offer a public web service (http://splicemap.bioinf.uni-leipzig.de) that allows users to retrieve sets of orthologous splice sites and to produce overview maps of evolutionarily conserved splice sites for visualization and further analysis. An electronic supplement containing the ncRNA data sets used in this study is available at http://www.bioinf.uni-leipzig.de/publications/supplements/12-001.

Keywords: conservation; evolution; evolutionary plasticity; lncRNA; long noncoding RNAs; multiple sequence alignments; splice sites.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromosome Mapping
  • Conserved Sequence*
  • Evolution, Molecular*
  • Humans
  • Mammals / genetics
  • Phylogeny
  • Primates / genetics
  • RNA Splice Sites / genetics*
  • RNA Splicing
  • RNA, Long Noncoding / genetics*
  • RNA, Long Noncoding / metabolism
  • RNA, Messenger / genetics
  • Sequence Analysis, RNA

Substances

  • RNA Splice Sites
  • RNA, Long Noncoding
  • RNA, Messenger