Background: Long noncoding RNAs (lncRNAs) have roles in gene regulation, epigenetics, and molecular scaffolding and it is hypothesized that they underlie some mammalian evolutionary adaptations. However, for many mammalian species, the absence of a genome assembly precludes the comprehensive identification of lncRNAs. The genome of the American beaver (Castor canadensis) has recently been sequenced, setting the stage for the systematic identification of beaver lncRNAs and the characterization of their expression in various tissues. The objective of this study was to discover and profile polyadenylated lncRNAs in the beaver using high-throughput short-read sequencing of RNA from sixteen beaver tissues and to annotate the resulting lncRNAs based on their potential for orthology with known lncRNAs in other species.
Results: Using de novo transcriptome assembly, we found 9528 potential lncRNA contigs and 187 high-confidence lncRNA contigs. Of the high-confidence lncRNA contigs, 147 have no known orthologs (and thus are putative novel lncRNAs) and 40 have mammalian orthologs. The novel lncRNAs mapped to the Oregon State University (OSU) reference beaver genome with greater than 90% sequence identity. While the novel lncRNAs were on average shorter than their annotated counterparts, they were similar to the annotated lncRNAs in terms of the relationships between contig length and minimum free energy (MFE) and between coverage and contig length. We identified beaver orthologs of known lncRNAs such as XIST, MEG3, TINCR, and NIPBL-DT. We profiled the expression of the 187 high-confidence lncRNAs across 16 beaver tissues (whole blood, brain, lung, liver, heart, stomach, intestine, skeletal muscle, kidney, spleen, ovary, placenta, castor gland, tail, toe-webbing, and tongue) and identified both tissue-specific and ubiquitous lncRNAs.
Conclusions: To our knowledge this is the first report of systematic identification of lncRNAs and their expression atlas in beaver. LncRNAs-both novel and those with known orthologs-are expressed in each of the beaver tissues that we analyzed. For some beaver lncRNAs with known orthologs, the tissue-specific expression patterns were phylogenetically conserved. The lncRNA sequence data files and raw sequence files are available via the web supplement and the NCBI Sequence Read Archive, respectively.
Keywords: Beaver; Castor canadensis; Expression atlas; Long noncoding RNA; Transcriptome; lncRNA.