The frequency and position of Alu repeats in cDNAs, as determined by database searching

Genomics. 1995 Jun 10;27(3):544-8. doi: 10.1006/geno.1995.1090.

Abstract

The Alu repeat sequence is estimated to account for 5% of human genomic DNA. The precise relationship of Alu sequences to human fully spliced cDNA has yet to be determined, although many new protocols for cloning cDNAs either depend on the presence of Alus or, more usually, rely on their absence in a population of messages. Previous estimates of the percentage of fully spliced human transcripts that contain Alu repeats have relied on hybridization procedures. Here we have gone directly to the DNA sequence by extracting over 1600 entries from GenBank that are described as human complete cDNAs, and we have assessed the frequency with which the Alu repeat sequence occurs in these sequences. We find that 5% of fully spliced human cDNAs contain Alu sequences. In addition, we have quantified the appearance of Alus in the different cDNA regions: 5' untranslated region (UTR), coding region, and 3' UTR. The vast majority of Alus are found in the 3' UTR, but 14% lie in the 5' UTR, and very rarely an Alu sequence is present within, or partially within, the coding region of the transcript.
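
The kind of tally described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how Alu matches were detected or how region boundaries were handled. The record type `CdnaRecord`, the helpers `classify_hit` and `summarize`, and the four toy records are all hypothetical, and the toy coordinates are made up for demonstration rather than taken from GenBank or from the paper's results. The sketch assumes each cDNA already carries its coding-region coordinates and a list of Alu match intervals.

```python
from collections import Counter
from typing import List, NamedTuple, Tuple


class CdnaRecord(NamedTuple):
    """Hypothetical container for one complete cDNA entry."""
    accession: str                    # made-up identifier, not a real GenBank accession
    cds_start: int                    # coding region start, 0-based inclusive
    cds_end: int                      # coding region end, 0-based exclusive
    alu_hits: List[Tuple[int, int]]   # Alu match intervals, half-open [start, end)


def classify_hit(rec: CdnaRecord, hit: Tuple[int, int]) -> str:
    """Assign an Alu match to the 5' UTR, the 3' UTR, or the coding region."""
    start, end = hit
    if end <= rec.cds_start:
        return "5' UTR"               # match ends before the coding region begins
    if start >= rec.cds_end:
        return "3' UTR"               # match begins after the coding region ends
    return "coding (within or partially within)"


def summarize(records: List[CdnaRecord]) -> Tuple[float, Counter]:
    """Return the fraction of cDNAs with at least one Alu match, plus per-region counts."""
    with_alu = sum(1 for r in records if r.alu_hits)
    fraction = with_alu / len(records) if records else 0.0
    regions = Counter(classify_hit(r, h) for r in records for h in r.alu_hits)
    return fraction, regions


# Toy input, invented purely to exercise the functions above.
toy_records = [
    CdnaRecord("TOY1", 100, 1300, [(1500, 1780)]),   # match downstream of the CDS
    CdnaRecord("TOY2", 300, 1200, [(10, 250)]),      # match upstream of the CDS
    CdnaRecord("TOY3", 50, 1500, []),                # no Alu match
    CdnaRecord("TOY4", 200, 1400, [(1350, 1600)]),   # match overlaps the CDS end
]

fraction, regions = summarize(toy_records)
print(f"cDNAs with an Alu match: {fraction:.0%}")
for region, count in regions.items():
    print(f"{region}: {count}")
```

Treating any overlap with the coding region as "coding" mirrors the abstract's phrase "within, or partially within, the coding region"; a real analysis would also need a sequence-similarity search step and explicit match thresholds, which are outside this sketch.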

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Complementary / genetics*
  • Databases, Factual
  • Genome, Human
  • Humans
  • Protein Biosynthesis
  • Repetitive Sequences, Nucleic Acid*

Substances

  • DNA, Complementary