A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package

Nucleic Acids Res. 2019 Dec 2;47(21):e139. doi: 10.1093/nar/gkz800.

Abstract

Recognition of composite elements consisting of two transcription factor binding sites underlies studies of tissue-, stage- and condition-specific transcription. Genome-wide transcription factor binding data generated with ChIP-seq facilitate the identification of composite elements, but existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors or omit composite elements with overlapping motifs. Here we present a universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements in four mutual orientations of motifs, separated by a spacer or overlapping, even if recognition of the motifs within a composite element requires different stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that, for over 60% of the analyzed datasets and transcription factors, the predicted co-occurrence of motifs implied an experimentally proven protein-protein interaction of the respective transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that the abundance of predicted composite elements with overlapping motifs was more than double that of those with a spacer, and that they showed a 1.5-fold increase in asymmetrical pairs of motifs, with one more conserved 'leading' motif and another 'guided' one.
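
To make the notions of mutual orientation and of spacer versus overlap concrete, the following minimal Python sketch classifies one pair of motif hits. It is an illustration only and not MCOT's implementation: the `Hit` record, the `classify_pair` function, the half-open coordinate convention, and the orientation labels ("direct_…_first", "inverted", "everted") are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One motif occurrence in a peak (hypothetical record, not MCOT's format)."""
    motif: str   # "anchor" or "partner"; both "anchor" for a homotypic pair
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def classify_pair(a: Hit, b: Hit) -> tuple[str, str, int]:
    """Return (orientation, arrangement, size in bp) for two motif hits."""
    left, right = sorted((a, b), key=lambda h: (h.start, h.end))

    # Non-negative gap: motifs are separated by a spacer of that length.
    # Negative gap: motifs share at least one position.
    gap = right.start - left.end
    if gap >= 0:
        arrangement, size = "spacer", gap
    else:
        arrangement, size = "overlap", min(left.end, right.end) - right.start

    if left.strand == right.strand:
        # Read along the shared strand: on "-" the right-hand hit comes first.
        first = left if left.strand == "+" else right
        orientation = f"direct_{first.motif}_first"
    elif left.strand == "+":
        orientation = "inverted"  # hits point toward each other
    else:
        orientation = "everted"   # hits point away from each other
    return orientation, arrangement, size
```

For a heterotypic pair this yields four orientation classes (two direct orders, inverted, everted), each of which can occur with a spacer or with an overlap. A real tool would additionally test each class for overrepresentation against a background, which this sketch does not attempt.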

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Binding Sites
  • Chromatin Immunoprecipitation Sequencing / methods*
  • Computational Biology / methods*
  • Datasets as Topic
  • Humans
  • Mice
  • Nucleotide Motifs / genetics
  • Regulatory Elements, Transcriptional / genetics*
  • Sequence Analysis, DNA / methods*
  • Transcription Factors / genetics*

Substances

  • Transcription Factors