Classification of RNA structure change by 'gazing' at experimental data

Bioinformatics. 2017 Jun 1;33(11):1647-1655. doi: 10.1093/bioinformatics/btx041.

Abstract

Motivation: Mutations (or Single Nucleotide Variants) in folded RiboNucleic Acid structures that cause local or global conformational change are riboSNitches. Predicting riboSNitches is challenging, as it requires making two separate, albeit related, structure predictions. The data most often used to experimentally validate riboSNitch predictions is Selective 2'-Hydroxyl Acylation analyzed by Primer Extension, or SHAPE. Experimentally establishing a riboSNitch requires the quantitative comparison of two SHAPE traces: wild-type (WT) and mutant. Historically, SHAPE data was collected on electropherograms and change in structure was evaluated by 'gel gazing.' SHAPE data is now routinely collected with next-generation sequencing and/or capillary sequencers. We aim to establish a classifier capable of simulating human 'gazing' by identifying features of the SHAPE profile that human experts agree 'look' like a riboSNitch.

Results: We find strong quantitative agreement between experts when RNA scientists 'gaze' at SHAPE data and identify riboSNitches. We identify dynamic time warping and seven other features predictive of the human consensus. The classSNitch classifier reported here accurately reproduces human consensus for 167 mutant/WT comparisons with an Area Under the Curve (AUC) above 0.8. When we analyze 2019 mutant traces for 17 different RNAs, we find that features of the WT SHAPE reactivity allow us to improve thermodynamic structure predictions of riboSNitches. This is significant, as accurate RNA structural analysis and prediction is likely to become an important aspect of precision medicine.
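
As an illustration of the dynamic time warping feature mentioned above, the following is a minimal sketch of the textbook dynamic time warping distance between two reactivity traces. The abstract does not say how classSNitch computes this feature, so the local cost (absolute difference), the lack of normalization, the function names, and the toy reactivity values are all assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric traces.

    Assumption: the local cost is the absolute difference between values;
    the actual classSNitch feature may be defined differently.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def pointwise_distance(a, b):
    """Sum of absolute position-by-position differences (no warping)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy reactivity values, not data from the paper.
wt = [0.1, 0.2, 1.5, 0.3, 0.1, 0.0]
mutant_shifted = [0.1, 0.1, 0.2, 1.5, 0.3, 0.1]  # same peak, offset by one position
mutant_changed = [0.1, 1.2, 0.2, 1.4, 0.9, 0.0]  # additional reactive positions

print("shifted:", dtw_distance(wt, mutant_shifted), pointwise_distance(wt, mutant_shifted))
print("changed:", dtw_distance(wt, mutant_changed), pointwise_distance(wt, mutant_changed))
```

In this toy case the shifted trace scores a much smaller warped distance than its position-by-position distance, while the trace with extra peaks remains far from WT under warping. This is the general reason a warping-based measure can separate a small positional offset from a change in the shape of the profile.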

Availability and implementation: The classSNitch R package is freely available at http://classsnitch.r-forge.r-project.org.

Contact: alain@email.unc.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Classification
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Mutation
  • Nucleic Acid Conformation
  • Polymorphism, Single Nucleotide*
  • RNA / chemistry*
  • RNA / genetics*
  • RNA / metabolism
  • Sequence Analysis, RNA / methods*
  • Software*
  • Thermodynamics

Substances

  • RNA