Gene trap vectors developed for genome-wide mutagenesis can be used to study factors governing the expression of exons inserted throughout the genome. For example, entrapment vectors consisting of a partial 3'-terminal exon [i.e., a neomycin resistance gene (Neo) and a poly(A) site, but no 3' splice site] were typically expressed following insertion into introns, from cellular transcripts that spliced to cryptic 3' splice sites present either within the entrapment vector or in the adjacent intron. A vector containing the wild-type Neo sequence (U3NeoSV1) preferentially disrupted genes that spliced in-frame to a cryptic 3' splice site in the Neo coding sequence and expressed functional neomycin phosphotransferase fusion proteins. Removal of the cryptic Neo 3' splice site, as in U3NeoSV2, did not reduce the proportion of clones with inserts in introns; rather, the fusion transcripts utilized cryptic 3' splice sites present in the adjacent intron or generated by virus integration. However, gene entrapment with U3NeoSV2 was considerably more random than with U3NeoSV1, consistent with the widespread occurrence of potential 3' splice site sequences in the introns of cellular genes. These results clarify the mechanisms of gene entrapment by U3 gene trap vectors and illustrate features of exon definition required for 3' processing and polyadenylation of cellular transcripts.