Motivation: Splice variation plays important roles in evolution and cancer. Different splice variants of a gene may be characteristic of particular cellular processes, subcellular locations or organs. Although several genomic projects have identified splice variants, there have been no large-scale computational studies of the relationship between the number of splice variants and biological function. The Gene Ontology (GO) and tools for leveraging GO, such as GoMiner, now make such a study feasible.
Results: We partitioned genes into two groups: those with at most b splice variants (≤b) and those with more than b (>b), for b = 1, ..., 10. We then used GoMiner to determine whether any GO categories were enriched in genes with particular numbers of splice variants. Since there was no a priori 'appropriate' partition boundary, we studied those 'robust' categories whose enrichment did not depend on the selection of a particular partition boundary. Furthermore, because the distribution of splice variant number was a snapshot taken at a particular point in time, we confirmed that those observations were stable across successive builds of GenBank. A small number of categories were found for genes in the lower partitions, and a larger number for genes in the higher partitions. Those categories were largely associated with cell death and signal transduction. Apoptotic genes tended to have a large repertoire of splice variants, and genes with splice variants exhibited a distinctive 'apoptotic island' in clustered image maps (CIMs).
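The partition-and-enrichment procedure described above can be illustrated with a minimal sketch. It does not reproduce GoMiner or its statistics. As assumptions made for illustration, it uses a one-sided hypergeometric test, an uncorrected significance cutoff `alpha`, and a definition of 'robust' as "enriched at every boundary in a chosen range". All function names, gene identifiers, counts and category memberships are hypothetical toy values, not data from the study.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the category."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enriched_categories(variant_counts, annotations, b, side, alpha=0.05):
    """Categories over-represented among genes with <= b ('low') or > b ('high') variants."""
    genes = set(variant_counts)
    if side == "low":
        selected = {g for g in genes if variant_counts[g] <= b}
    else:
        selected = {g for g in genes if variant_counts[g] > b}
    N, n = len(genes), len(selected)
    hits = set()
    for category, members in annotations.items():
        members = members & genes          # ignore genes without a variant count
        k = len(members & selected)
        if k and hypergeom_upper_tail(k, len(members), n, N) < alpha:
            hits.add(category)
    return hits


def robust_categories(variant_counts, annotations, boundaries, side, alpha=0.05):
    """Assumed robustness criterion: enriched at every partition boundary given."""
    per_boundary = [
        enriched_categories(variant_counts, annotations, b, side, alpha)
        for b in boundaries
    ]
    return set.intersection(*per_boundary) if per_boundary else set()


# Hypothetical toy data: splice variant counts per gene and category membership.
counts = {
    "g1": 1, "g2": 1, "g3": 1, "g4": 1, "g5": 2, "g6": 2,
    "g7": 3, "g8": 3, "g9": 6, "g10": 7, "g11": 8, "g12": 9,
}
annotations = {
    "apoptosis": {"g9", "g10", "g11", "g12"},
    "metabolism": {"g1", "g2", "g5", "g9"},
}

print(robust_categories(counts, annotations, range(2, 6), "high"))
print(robust_categories(counts, annotations, range(1, 6), "high"))
```

In this toy example the hypothetical "apoptosis" category is enriched in the higher partition for b = 2 to 5 but not for b = 1, so whether it counts as robust depends on the range of boundaries examined. A real analysis would also need a multiple-comparison correction, which the sketch omits.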
Availability: Supplementary tables and figures are available at http://discover.nci.nih.gov/OG/supplementaryMaterials.html. The Safari browser appears to perform better than Firefox for these particular items.