A robust estimation of exon expression to identify alternative spliced genes applied to human tissues and cancer samples

BMC Genomics. 2014 Oct 8;15(1):879. doi: 10.1186/1471-2164-15-879.

Abstract

Background: Accurate analysis of whole-gene expression and individual-exon expression is essential to characterize different transcript isoforms and identify alternative splicing events in human genes. Exon-specific expression microarray platforms are among the omic technologies most widely used in studies on human samples.

Results: Because few validated comparative analyses exist for identifying specific splicing events from data derived from these types of platforms, we developed an algorithm (called ESLiM) to detect significant changes in exon use and applied it to a reference dataset of 270 human genes that show alternative expression in different tissues. We compared the results with those of three other methodological approaches and provide the R source code so that the method can be applied elsewhere. The genes positively detected by these analyses also provide a verified subset of human genes that present tissue-regulated isoforms. Furthermore, we performed a validation analysis on human patient samples, comparing two different subtypes of acute myeloid leukemia (AML), and experimentally validated the splicing in several selected genes that showed exons with highly significant signal change.

Conclusions: The comparative analyses with other methods, using a fair set of human genes that show alternative splicing, together with the validation on clinical samples, demonstrate that the proposed novel algorithm is a reliable tool for detecting differential splicing in exon-level expression data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alternative Splicing*
  • Databases, Genetic
  • Exons
  • Gene Expression Profiling / methods*
  • Humans
  • Leukemia, Myeloid, Acute / genetics*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Organ Specificity
  • Protein Isoforms / genetics*
  • Reproducibility of Results

Substances

  • Protein Isoforms