Mixture models and wavelet transforms reveal high confidence RNA-protein interaction sites in MOV10 PAR-CLIP data

Nucleic Acids Res. 2012 Nov 1;40(20):e160. doi: 10.1093/nar/gks697. Epub 2012 Jul 28.

Abstract

The Photo-Activatable Ribonucleoside-enhanced CrossLinking and ImmunoPrecipitation (PAR-CLIP) method was recently developed for global identification of RNAs interacting with proteins. The strength of this versatile method results from induction of specific T to C transitions at sites of interaction. However, current analytical tools do not distinguish between non-experimentally and experimentally induced transitions. Furthermore, geometric properties at potential binding sites are not taken into account. To surmount these shortcomings, we developed a two-step algorithm consisting of a non-parametric two-component mixture model and a wavelet-based peak calling procedure. Our algorithm can reduce the number of false positives up to 24% thereby identifying high confidence interaction sites. We successfully employed this approach in conjunction with a modified PAR-CLIP protocol to study the functional role of nuclear Moloney leukemia virus 10, a putative RNA helicase interacting with Argonaute2 and Polycomb. Our method, available as the R package wavClusteR, is generally applicable to any substitution-based inference problem in genomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Binding Sites
  • HEK293 Cells
  • Humans
  • Immunoprecipitation / methods
  • Models, Statistical*
  • RNA / chemistry
  • RNA / metabolism*
  • RNA Helicases / metabolism*
  • RNA-Binding Proteins / metabolism*
  • Sequence Analysis, RNA
  • Wavelet Analysis*

Substances

  • RNA-Binding Proteins
  • RNA
  • Mov10 protein, human
  • RNA Helicases

Associated data

  • GEO/GSE37524