Human chromosome 5q11.2-q13.3 and its orthologous region on mouse chromosome 13 contain candidate genes for spinal muscular atrophy (SMA), an inherited human neurodegenerative disorder, and for Lgn1, an inherited mouse susceptibility to infection with Legionella pneumophila. These homologous genomic regions also have unusual repetitive organizations that create practical difficulties in mapping and raise interesting questions about the evolutionary origin of the repeats. To analyze this region in detail and to identify additional candidate genes for these diseases, we determined the sequence of 179 kb of the mouse Lgn1/SMA interval. We analyzed this sequence using BLAST searches and various exon prediction programs to identify potential genes. Because these methods can generate false-positive exon predictions, we aligned the mouse sequence with the available orthologous human sequence; the alignments allowed us to discriminate rapidly among the potential coding regions by indicating which were well conserved and therefore more likely to represent actual coding sequence. As a result of our analysis, we accurately mapped two additional genes in the SMA interval that can be tested for involvement in the pathogenesis of SMA. Although no new Lgn1 candidates emerged, we identified new genetic markers that exclude Smn as an Lgn1 candidate. In addition to providing important resources for studying SMA and Lgn1, our data provide further evidence of the value of sequencing the mouse genome as a means to help annotate the human genomic sequence, and vice versa.
Copyright 1999 Academic Press.