Comparison of microarray designs for class comparison and class discovery

Bioinformatics. 2002 Nov;18(11):1438-45. doi: 10.1093/bioinformatics/18.11.1438.

Abstract

Motivation: Two-color microarray experiments in which an aliquot derived from a common RNA sample is placed on each array are called reference designs. Traditionally, microarray experiments have used reference designs, but designs without a reference have recently been proposed as alternatives.

Results: We develop a statistical model that distinguishes the different levels of variation typically present in cancer data, including biological variation among RNA samples, experimental error and variation attributable to phenotype. Within the context of this model, we examine the reference design and two designs that do not use a reference, the balanced block design and the loop design, focusing particularly on efficiency of estimates and the performance of cluster analysis. We calculate the relative efficiency of the designs both when a fixed number of arrays is available and when a fixed number of samples is available. Monte Carlo simulation is used to compare the designs when the objective is class discovery based on cluster analysis of the samples. The number of discrepancies between the estimated clusters and the true clusters was significantly smaller for the reference design than for the loop design. The efficiency of the reference design relative to the loop and block designs depends on the relation between inter- and intra-sample variance. These results suggest that if cluster analysis is a major goal of the experiment, then a reference design is preferable. If identification of differentially expressed genes is the main concern, then design selection may involve a consideration of several factors.
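The dependence of relative efficiency on inter- and intra-sample variance can be illustrated with a minimal sketch. This is not the model from the paper: it assumes a deliberately simplified single-gene setting with two classes, independent normal errors, no dye or array effects, one biological variance term (here called tau2) and one per-channel experimental error variance (here called sigma2). The function names, parameter names and numeric values are all hypothetical, and the loop design is not covered.

```python
import random
import statistics


def var_reference(n_arrays, tau2, sigma2):
    """Variance of the estimated class difference, reference design.

    Assumption: each array holds one sample plus the common reference, so a
    log-ratio carries tau2 (biology) + 2 * sigma2 (two channel errors).
    Half of the arrays are assigned to each class.
    """
    per_class = n_arrays / 2
    return 2 * (tau2 + 2 * sigma2) / per_class


def var_block(n_arrays, tau2, sigma2):
    """Variance of the estimated class difference, balanced block design.

    Assumption: each array pairs one class-A sample with one class-B sample,
    so a within-array difference carries 2 * tau2 + 2 * sigma2.
    """
    return 2 * (tau2 + sigma2) / n_arrays


def simulate(design, n_arrays, tau2, sigma2, reps=5000, seed=0):
    """Monte Carlo check of the formulas above (true class difference is 0)."""
    rng = random.Random(seed)
    sd_b, sd_e = tau2 ** 0.5, sigma2 ** 0.5
    estimates = []
    for _ in range(reps):
        if design == "reference":
            # Sample channel (biology + error) minus reference channel error.
            y = [rng.gauss(0, sd_b) + rng.gauss(0, sd_e) - rng.gauss(0, sd_e)
                 for _ in range(n_arrays)]
            half = n_arrays // 2  # n_arrays is assumed to be even
            est = statistics.fmean(y[:half]) - statistics.fmean(y[half:])
        elif design == "block":
            # Class-A channel minus class-B channel on the same array.
            d = [(rng.gauss(0, sd_b) + rng.gauss(0, sd_e))
                 - (rng.gauss(0, sd_b) + rng.gauss(0, sd_e))
                 for _ in range(n_arrays)]
            est = statistics.fmean(d)
        else:
            raise ValueError(f"unknown design: {design}")
        estimates.append(est)
    return statistics.variance(estimates)


if __name__ == "__main__":
    tau2, sigma2 = 1.0, 0.5  # hypothetical variance components
    # Fixed number of arrays: both designs use 8 arrays.
    print(var_reference(8, tau2, sigma2) / var_block(8, tau2, sigma2))
    # Fixed number of samples: 8 samples fill 8 reference arrays or 4 block arrays.
    print(var_reference(8, tau2, sigma2) / var_block(4, tau2, sigma2))
    print(simulate("reference", 8, tau2, sigma2), simulate("block", 8, tau2, sigma2))
```

Under these assumptions the variance ratio (reference over block) is 2(tau2 + 2 sigma2)/(tau2 + sigma2) for a fixed number of arrays and (tau2 + 2 sigma2)/(tau2 + sigma2) for a fixed number of samples, which is about 2.67 and 1.33 for the hypothetical values above. The ratio shrinks as biological variance grows relative to experimental error, which is one simple way to see why the comparison depends on the relation between the two variance components.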

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computer Simulation
  • Equipment Failure Analysis / methods
  • Gene Expression Profiling / methods*
  • Gene Expression Profiling / standards
  • Gene Expression Regulation / genetics
  • Models, Genetic*
  • Models, Statistical
  • Monte Carlo Method
  • Oligonucleotide Array Sequence Analysis / instrumentation
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / standards
  • Quality Control
  • RNA / classification*
  • RNA / genetics*
  • Reference Standards
  • Reproducibility of Results
  • Sample Size
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards

Substances

  • RNA