Genome-wide analyses of retrogenes derived from the human box H/ACA snoRNAs

Nucleic Acids Res. 2007;35(2):559-71. doi: 10.1093/nar/gkl1086. Epub 2006 Dec 14.

Abstract

Box H/ACA snoRNAs are an abundant class of non-protein-coding RNAs that play important roles in the post-transcriptional modification of rRNAs and snRNAs. Here we report the characterization of 202 sequences in the human genome that are derived from box H/ACA snoRNAs. Most of them are retrogenes formed using the L1 integration machinery. About 96% of the box H/ACA RNA-related sequences are found in corresponding locations on the chimpanzee and human chromosomes, whereas the mouse shares approximately 50% of these human sequences, suggesting that some of the H/ACA RNA-related sequences in primates arose after the rodent/primate divergence. Of the H/ACA RNA-related sequences, 49% are found in intronic regions of protein-coding genes, and 64 can be folded into the typical secondary structure of the box H/ACA snoRNA family; 30 of them were recognized as functional homologs of their corresponding, previously reported box H/ACA snoRNAs. Of the 64 sequences with the typical secondary structure of the box H/ACA RNA family, 11 were found in EST databases, and 5 of these were shown to be expressed in more than one human tissue. Notably, U107f is nested in an intron of the protein-coding gene for nudix-type motif 13 but is expressed from the opposite strand, and EST database searches revealed that it can be expressed in liver and spleen, and even in melanotic melanoma.
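
The percentages above can be converted into approximate sequence counts with a short calculation. The sketch below is illustrative only: it assumes that each percentage refers to the full set of 202 sequences, which the abstract does not state explicitly, and the names `TOTAL_SEQUENCES`, `REPORTED_PERCENTAGES` and `approx_count` are hypothetical helpers rather than anything taken from the paper. Because the abstract gives rounded ("about", "approximately") percentages, the resulting counts are estimates, not reported values.

```python
# Total number of box H/ACA snoRNA-derived sequences stated in the abstract.
TOTAL_SEQUENCES = 202

# Percentages stated in the abstract.
# ASSUMPTION: each one is taken relative to all 202 sequences.
REPORTED_PERCENTAGES = {
    "conserved_position_in_chimpanzee": 96,
    "shared_with_mouse": 50,
    "intronic_in_protein_coding_genes": 49,
}


def approx_count(percent, total=TOTAL_SEQUENCES):
    """Return the nearest whole number of sequences for a given percentage."""
    return round(total * percent / 100)


# Estimated counts implied by the rounded percentages.
estimated_counts = {
    label: approx_count(percent)
    for label, percent in REPORTED_PERCENTAGES.items()
}

if __name__ == "__main__":
    for label, count in estimated_counts.items():
        print(f"{label}: about {count} of {TOTAL_SEQUENCES}")
```

Under this assumption, roughly half of the 202 human sequences would have no counterpart at the corresponding mouse location, which is the observation behind the inference that some of them arose after the rodent/primate divergence.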

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Sequence
  • Computational Biology
  • Genome, Human*
  • Genomics
  • Humans
  • Long Interspersed Nucleotide Elements*
  • Mice
  • Molecular Sequence Data
  • Phylogeny
  • RNA, Small Nucleolar / chemistry
  • RNA, Small Nucleolar / genetics*
  • RNA, Small Nucleolar / metabolism
  • Vertebrates / genetics

Substances

  • RNA, Small Nucleolar