On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression

Bioinformatics. 2003 Jul 22;19(11):1333-40. doi: 10.1093/bioinformatics/btg167.

Abstract

Motivation: Recently a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarray (SAM) method and the mixture model method (MMM), have been proposed to detect differential gene expression for replicated microarray experiments conducted under two conditions. All the methods depend on constructing a test statistic Z and a so-called null statistic z. The null statistic z is used to provide some reference distribution for Z such that statistical inference can be accomplished. A common way of constructing z is to apply Z to randomly permuted data. Here we point our that the distribution of z may not approximate the null distribution of Z well, leading to possibly too conservative inference. This observation may apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly.

Results: Using simulated data and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM and MMM. Some interesting findings on operating characteristics of EB, SAM and MMM are also reported. Finally, by combining the idea of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Validation Study

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Neoplastic / genetics*
  • Genetic Variation*
  • Humans
  • Leukemia / genetics*
  • Models, Genetic*
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Statistics, Nonparametric