Ontogenomic study of the relationship between number of gene splice variants and GO categorization

Bioinformatics. 2010 Aug 15;26(16):1945-9. doi: 10.1093/bioinformatics/btq335. Epub 2010 Jul 8.

Abstract

Motivation: Splice variation plays important roles in evolution and cancer. Different splice variants of a gene may be characteristic of particular cellular processes, subcellular locations or organs. Although several genomic projects have identified splice variants, there have been no large-scale computational studies of the relationship between number of splice variants and biological function. The Gene Ontology (GO) and tools for leveraging GO, such as GoMiner, now make such a study feasible.

Results: We partitioned genes into two groups: those with numbers of splice variants ≤b and >b (b = 1, ..., 10). Then we used GoMiner to determine whether any GO categories are enriched in genes with particular numbers of splice variants. Since there was no a priori 'appropriate' partition boundary, we studied those 'robust' categories whose enrichment did not depend on the selection of a particular partition boundary. Furthermore, because the distribution of splice variant number was a snapshot taken at a particular point in time, we confirmed that those observations were stable across successive builds of GenBank. A small number of categories were found for genes in the lower partitions. A larger number of categories were found for genes in the higher partitions. Those categories were largely associated with cell death and signal transduction. Apoptotic genes tended to have a large repertoire of splice variants, and genes with splice variants exhibited a distinctive 'apoptotic island' in clustered image maps (CIMs).
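
The partition-and-enrich procedure described above can be illustrated with a minimal Python sketch. This is not GoMiner and not the authors' pipeline: it assumes a plain one-sided hypergeometric test with no multiple-testing correction as a stand-in for the enrichment step, and it assumes that 'robust' means enriched at every tested boundary b. The function names, the alpha threshold and the toy gene and category names are all hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric(N genes, K in category, n drawn)."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def partition(variant_counts, b):
    """Split genes into those with <= b splice variants and those with > b."""
    low = {gene for gene, count in variant_counts.items() if count <= b}
    high = set(variant_counts) - low
    return low, high


def enriched_categories(group, all_genes, annotations, alpha=0.05):
    """Categories over-represented in `group` (assumed stand-in test, no FDR)."""
    hits = set()
    for category, members in annotations.items():
        members = members & all_genes
        k = len(members & group)
        p = hypergeom_upper_tail(k, len(members), len(group), len(all_genes))
        if p < alpha:
            hits.add(category)
    return hits


def robust_categories(variant_counts, annotations, boundaries=range(1, 11), alpha=0.05):
    """Categories enriched at every boundary (assumed definition of 'robust')."""
    all_genes = set(variant_counts)
    low_robust, high_robust = None, None
    for b in boundaries:
        low, high = partition(variant_counts, b)
        low_hits = enriched_categories(low, all_genes, annotations, alpha)
        high_hits = enriched_categories(high, all_genes, annotations, alpha)
        low_robust = low_hits if low_robust is None else low_robust & low_hits
        high_robust = high_hits if high_robust is None else high_robust & high_hits
    return low_robust or set(), high_robust or set()


# Toy data (invented for illustration): 30 genes with 1, 5 or 12 variants.
toy_counts = {f"g{i}": 1 for i in range(10)}
toy_counts.update({f"g{i}": 5 for i in range(10, 20)})
toy_counts.update({f"g{i}": 12 for i in range(20, 30)})

toy_annotations = {
    "apoptosis": {f"g{i}" for i in range(20, 28)},  # only many-variant genes
    "mid": {f"g{i}" for i in range(10, 18)},        # only 5-variant genes
}

low_robust, high_robust = robust_categories(toy_counts, toy_annotations)
print("robust in lower partitions:", sorted(low_robust))
print("robust in higher partitions:", sorted(high_robust))
```

In the toy data, the hypothetical "mid" category moves from the higher partition to the lower one as b passes 5, so it is enriched only for some boundaries and is excluded by the intersection. That is the kind of boundary dependence the robustness criterion is meant to filter out.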

Availability: Supplementary tables and figures are available at URL http://discover.nci.nih.gov/OG/supplementaryMaterials.html. The Safari browser appears to perform better than Firefox for these particular items.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Alternative Splicing*
  • Cluster Analysis
  • Databases, Nucleic Acid
  • Genes
  • Genetic Variation
  • Genome
  • Genomics / methods*
  • Humans
  • Signal Transduction / genetics
  • Software