The nucleotide sequence of the entire chicken chromosomal ovalbumin gene has been determined. The gene is 7564 nucleotides in length and codes for a mature messenger RNA of 1872 nucleotides. Comparison of the sequence at the 5'-terminal region of the gene with that reported by others has revealed multiple polymorphic nucleotides in the structural, intervening, and flanking DNA sequences. Some of the polymorphic sites occur at positions very close to splice junctions or the eucaryotic promoter sequence, yet apparently have little or no effect on the expression of this gene. The heptanucleotide promoter sequence TATATAT present in the 5'-flanking region of the ovalbumin gene does not occur within the confines of the gene. Nevertheless, multiple Hogness box sequences similar to those found in other eucaryotic genes were delineated within the boundaries of the gene. These internal Hogness box sequences are not used for transcription initiation. Similarly, the hexanucleotide sequence AATAAA, common to the 3'-untranslated region of all eucaryotic messenger RNAs, occurs seven additional times within the ovalbumin gene. These sites are not used for transcription termination or polyadenylation. Thus, although these sequences may play important roles in the initiation or termination of gene transcripts, as well as in polyadenylation of the transcripts, the specificity for such biological functions cannot reside within these sequences alone. Furthermore, sequences complementary to the highly conserved rat U1 small nuclear RNA have been found throughout the gene. Many of these regions of complementarity occur in the structural sequences. If the small nuclear RNA does play a role in splicing, the specificity must also be provided by other, as yet undefined, components.
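
The motif counts reported above (for example, the additional internal occurrences of AATAAA) amount to scanning a sequence for every position at which a short motif matches. The following is a minimal sketch of such a scan, not the procedure used in this work. The function name `find_motif` and the toy sequence are hypothetical and invented for illustration; the toy sequence is not taken from the ovalbumin gene, and only the given strand is searched.

```python
from typing import List


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every match, including overlapping ones."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base (not by the motif length) so overlapping matches are kept.
        start = sequence.find(motif, start + 1)
    return positions


# Toy sequence made up for illustration; NOT the ovalbumin gene sequence.
toy = "GGTATATATCCAATAAAGGAATAAATT"

tata_sites = find_motif(toy, "TATATAT")   # promoter-like heptanucleotide
polya_sites = find_motif(toy, "AATAAA")   # polyadenylation-signal hexanucleotide
print(len(tata_sites), len(polya_sites))
```

A scan of this kind reports only where a motif is present. As the abstract argues, the presence of a motif does not by itself indicate that the site is used.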