Prediction of Polycomb target genes in mouse embryonic stem cells

Genomics. 2010 Jul;96(1):17-26. doi: 10.1016/j.ygeno.2010.03.012. Epub 2010 Mar 29.

Abstract

Polycomb group (PcG) proteins are important epigenetic regulators, yet the mechanism that targets them to specific genes in mammals is still poorly understood. We have developed a computational approach to predict PcG target genes genome-wide in mouse embryonic stem cells. We use transcription factor (TF) binding and motif information as predictors and apply the Bayesian Additive Regression Trees (BART) model for classification. Our model has good prediction accuracy. Its performance can be explained mainly by five TF features (Zf5, Tcfcp2l1, Ctcf, E2f1, Myc). Our analysis of H3K27me3 and gene expression data suggests that genomic sequence is highly correlated with the overall PcG target plasticity. We have also compared the PcG target sequence signatures between mouse and Drosophila and found that they are strikingly different. Our predictions may be useful in de novo searches for Polycomb response elements (PREs) in mammals.
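
As a rough illustration of the classification idea in the abstract, the sketch below fits a sum of one-split trees to binary TF features. It is not the authors' method: BART is a Bayesian sum-of-trees model fit by MCMC, whereas this stand-in uses simple greedy boosting of decision stumps. The feature names come from the abstract, but the data, the labeling rule (`toy_label`), and all function names are hypothetical and carry no biological meaning.

```python
from itertools import product

# Feature names are taken from the abstract; everything else is synthetic.
FEATURES = ["Zf5", "Tcfcp2l1", "Ctcf", "E2f1", "Myc"]


def toy_label(x):
    # Hypothetical rule, NOT from the paper: call a gene a "target" when
    # at least two of the first three features are present.
    return 1 if x[0] + x[1] + x[2] >= 2 else 0


# Toy data set: every combination of the five binary features, once each.
X = [list(bits) for bits in product([0, 1], repeat=len(FEATURES))]
y = [toy_label(x) for x in X]


def fit_stump(X, residuals, j):
    """Fit a one-split tree on binary feature j; return (leaf0, leaf1, gain)."""
    r0 = [r for x, r in zip(X, residuals) if x[j] == 0]
    r1 = [r for x, r in zip(X, residuals) if x[j] == 1]
    leaf0 = sum(r0) / len(r0)
    leaf1 = sum(r1) / len(r1)
    # Reduction in squared error from predicting the leaf means instead of 0.
    gain = len(r0) * leaf0 ** 2 + len(r1) * leaf1 ** 2
    return leaf0, leaf1, gain


def fit_sum_of_stumps(X, y, rounds=30, lr=0.5):
    """Greedy boosting: each round adds the stump that best fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    importance = {name: 0.0 for name in FEATURES}
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        candidates = [fit_stump(X, residuals, j) + (j,) for j in range(len(FEATURES))]
        leaf0, leaf1, gain, j = max(candidates, key=lambda c: c[2])
        stumps.append((j, lr * leaf0, lr * leaf1))
        importance[FEATURES[j]] += gain
        pred = [p + (lr * leaf1 if x[j] else lr * leaf0) for x, p in zip(X, pred)]
    return base, stumps, importance


def predict(model, x):
    """Classify one feature vector by thresholding the summed tree outputs."""
    base, stumps, _ = model
    score = base + sum(l1 if x[j] else l0 for j, l0, l1 in stumps)
    return 1 if score > 0.5 else 0


model = fit_sum_of_stumps(X, y)
accuracy = sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(y)
importance = model[2]
print("training accuracy:", accuracy)
print("features used by the model:", [f for f, g in importance.items() if g > 0])
```

The accumulated gain per feature gives a crude view of which predictors the model relies on, loosely analogous to asking which TF features explain a model's performance. A real analysis would use measured TF binding and motif scores, held-out evaluation, and an actual BART implementation.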

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bayes Theorem
  • Binding Sites / genetics
  • Chromatin Immunoprecipitation / methods
  • Computational Biology / methods*
  • Consensus Sequence / genetics
  • DNA Methylation
  • Databases, Genetic
  • Drosophila Proteins
  • Drosophila melanogaster / genetics
  • E2F1 Transcription Factor / metabolism
  • Embryonic Stem Cells / metabolism*
  • Gene Expression
  • Genome
  • Histones / metabolism*
  • Lysine / metabolism
  • Mice
  • Microarray Analysis / methods
  • Polycomb Repressive Complex 1
  • Promoter Regions, Genetic / genetics
  • Proto-Oncogene Proteins c-myc / metabolism
  • Repressor Proteins / metabolism
  • Sequence Analysis, DNA / methods
  • Transcription Factors / metabolism

Substances

  • Drosophila Proteins
  • E2F1 Transcription Factor
  • E2f1 protein, mouse
  • Histones
  • Myc protein, mouse
  • Pc protein, Drosophila
  • Proto-Oncogene Proteins c-myc
  • Repressor Proteins
  • Transcription Factors
  • Polycomb Repressive Complex 1
  • Lysine