SHAPE directed RNA folding.

Ronny Lorenz¹, Dominik Luntzer², Ivo L Hofacker¹, Peter F Stadler¹, Michael T Wolfinger¹.

Abstract

SUMMARY: Chemical mapping experiments allow for nucleotide resolution assessment of RNA structure. We demonstrate that different strategies of integrating probing data with thermodynamics-based RNA secondary structure prediction algorithms can be implemented by means of soft constraints. This amounts to incorporating suitable pseudo-energies into the standard energy model for RNA secondary structures. As a showcase application for this new feature of the ViennaRNA Package we compare three distinct, previously published strategies to utilize SHAPE reactivities for structure prediction. The new tool is benchmarked on a set of RNAs with known reference structure.
AVAILABILITY AND IMPLEMENTATION: The capability for SHAPE directed RNA folding is part of the upcoming release of the ViennaRNA Package 2.2, for which a preliminary release is already freely available at http://www.tbi.univie.ac.at/RNA.
CONTACT: michael.wolfinger@univie.ac.at
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Substances: RNA, Ribosomal

Year: 2015 PMID: 26353838 PMCID: PMC4681990 DOI: 10.1093/bioinformatics/btv523

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Beyond its role as information carrier from genome to proteome, RNA is a key player in genome regulation and contributes to a wide variety of cellular tasks. The spatial structure of RNA plays an important role in this context because it critically influences the interaction of RNAs with proteins and with nucleic acids. Knowledge of RNA structure is therefore crucial for understanding various biological processes.
Chemical and enzymatic probing methods provide information on flexibility and accessibility at nucleotide resolution. They are based on the observation that RNA can be selectively modified by small organic molecules, metal ions or RNase enzymes, resulting either in an adduct between the RNA and the small compound or in RNA cleavage. Subsequent primer extension by reverse transcriptase (RT) enzymes typically terminates at the modified sites. The resulting cDNA fragments thus report directly on the RNA structure by identifying, depending on the particular reagent, paired or unpaired sequence positions. For a recent overview of such (high-throughput) probing methods we refer to Mortimer et al. (2014).
As chemical probing is becoming a frequently used technology for determining RNA structure experimentally, there is increasing demand for efficient and accurate computational methods that incorporate probing data into secondary structure prediction tools. Efficient dynamic programming algorithms, as implemented in the ViennaRNA Package (Lorenz et al., 2011), typically yield excellent prediction results for short sequences, but accuracy decreases to between 40 and 70% for long RNA sequences. This discrepancy is mainly caused by imperfect thermodynamic parameters and by inherent limitations of the secondary structure model, which neglects tertiary interactions, pseudoknots, ligand binding and kinetic traps.
To close the gap in available computational tools, we have extended the ViennaRNA Package by a flexible framework that incorporates all soft constraints compatible with the RNA folding grammar; here we use it to handle position-wise data as they arise from chemical probing experiments.

2 Methods

In contrast to hard constraints (Mathews et al., 2004), which restrict the folding space on the level of the generating function, soft constraints leave the structure ensemble intact. Instead, they guide the folding process by adding position- or motif-specific pseudo-energy contributions to the free energy contributions of certain loop motifs. This amounts to a distortion of the equilibrium ensemble of structures in favour of those that are consistent with experimental data. Mismatching motifs are penalized by positive contributions, while structure patterns where prediction and experiment agree receive a ‘bonus’ in the form of a negative pseudo-energy. Bonus energies are an old idea in RNA folding algorithms (see Supplementary Material).
Current methods for guided secondary structure prediction by means of soft constraints mainly focus on the incorporation of SHAPE reactivity data. For that purpose, three algorithms are available that aim to transform normalized SHAPE reactivity data into meaningful pseudo-energy terms. The first method published uses a simple linear ansatz to derive pseudo-energies for individual nucleotides that take part in a stacked helix conformation (Deigan et al., 2008). All remaining structural conformations are left unmodified in this model. A more consistent model that applies pseudo-energy guided free energy modifications in all loop types was introduced by Zarringhalam et al. (2012). Here, the authors first convert the provided SHAPE reactivity data for each nucleotide into a probability of being unpaired. Subsequently, the resulting probabilities are used to derive two nucleotide-wise pseudo-energy weights, one for contexts where the nucleotide is considered unpaired, and the other for situations where it is involved in a base pair. A third, distinct approach to incorporating SHAPE reactivity data into secondary structure prediction was suggested by Washietl et al. (2012). Here, the authors phrase the choice of the bonus energies as an optimization problem that aims to find a perturbation vector of pseudo-energies that minimizes the discrepancy between the observed and predicted probabilities of particular nucleotides being unpaired. At the same time, the perturbation should be as small as possible. The tradeoff between the two goals is naturally defined by the relative uncertainties inherent in the SHAPE measurements and the energy model, respectively. A detailed description of the three conversion methods is given in the Supplementary Material.
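
As an illustration of the first two conversion strategies, the following minimal Python sketch turns a normalized reactivity into pseudo-energies. It is not the ViennaRNA implementation. The log-linear form m·ln(reactivity + 1) + b is the one commonly attributed to Deigan et al. (2008); the default values of m, b and beta, the linear reactivity-to-probability map and all function names are assumptions chosen for illustration only.

```python
import math


def deigan_pseudo_energy(reactivity, m=1.8, b=-0.6):
    """Pseudo-energy (kcal/mol) for a nucleotide in a stacked base pair.

    Assumed form: m * ln(reactivity + 1) + b. The defaults for m and b
    are illustrative assumptions, not values taken from the text.
    Missing data (None or a negative reactivity) contributes nothing.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def unpaired_probability(reactivity, saturation=1.0):
    """Hypothetical map from reactivity to P(unpaired), clipped to [0, 1].

    The published method uses its own conversion; this linear map is
    only a stand-in. Missing data yields 0.5, i.e. no preference.
    """
    if reactivity is None or reactivity < 0:
        return 0.5
    return min(reactivity / saturation, 1.0)


def paired_unpaired_pseudo_energies(q_unpaired, beta=0.8):
    """Return (e_unpaired, e_paired) in kcal/mol for one nucleotide.

    Each state is penalized by beta * |x - q|, with x = 1 for the
    unpaired state and x = 0 for the paired state, so the state that
    disagrees with the probing data receives the larger penalty.
    """
    e_unpaired = beta * (1.0 - q_unpaired)
    e_paired = beta * q_unpaired
    return e_unpaired, e_paired


# Example: a highly reactive and an unreactive nucleotide
for r in (2.0, 0.0):
    q = unpaired_probability(r)
    print(r, deigan_pseudo_energy(r), paired_unpaired_pseudo_energies(q))
```

In this sketch a low reactivity yields a negative stacking term (a bonus for helices), whereas a high reactivity yields a positive one (a penalty), which mirrors the bonus and penalty logic described above.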

2.1 Implementation

All three methods outlined earlier have been implemented in the ViennaRNA Package and are available via the API of the ViennaRNA Library and the command line interface of RNAfold. The required changes to the folding recursions and technical details of handling both hard and soft constraints in ViennaRNA will be described elsewhere in full detail. The key feature for our purposes is the consistent incorporation of a user-defined, position-dependent energy contribution for each nucleotide that remains unpaired. The novel standalone tool RNApvmin dynamically estimates a vector of pseudo-energies that minimizes both model adjustments and discrepancies between observed and predicted pairing probabilities. The resulting perturbation vector can then be used to guide structure prediction with RNAfold. By accepting either SHAPE reactivity data, probabilities of being unpaired, or bonus energies directly, RNAfold makes it possible to incorporate alternative ways of computing bonus energies, e.g. along the lines of Eddy (2014), or to apply the approach to other types of probing data. The novel soft constraint feature introduces a variety of parameters which need to be chosen carefully. We refer to the Supplementary Material for a detailed summary of their default values. Guided structure prediction has also been included in the ViennaRNA Websuite (Gruber et al., 2008), available at http://rna.tbi.univie.ac.at.
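
The tradeoff that such a perturbation vector has to balance can be written as a single objective. The form below is a sketch following the formulation of Washietl et al. (2012) as we understand it; the symbols are our notation and are not defined in the text above: ε_i is the pseudo-energy added when nucleotide i is unpaired, p_i(ε) is the predicted probability of i being unpaired under the perturbed energy model, q_i is the corresponding probability derived from the probing data, and τ² and σ² express the assumed uncertainties of the energy model and of the measurements, respectively.

```latex
F(\vec{\epsilon}) \;=\;
  \sum_{i} \frac{\epsilon_i^{2}}{\tau^{2}}
  \;+\;
  \sum_{i} \frac{\bigl(p_i(\vec{\epsilon}) - q_i\bigr)^{2}}{\sigma^{2}}
  \;\longrightarrow\; \min
```

The first sum keeps the perturbation small, the second keeps predicted and observed unpaired probabilities close; the ratio of τ² to σ² decides which of the two goals dominates.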

3 Results

We applied the methods to a benchmark set with known reference structures (Hajdin et al., 2013). This test set contains 24 triples, each consisting of a sequence, its corresponding SHAPE data, and a reference structure either derived from X-ray crystallography experiments or predicted by comparative sequence analysis. The use of soft constraints derived from SHAPE data leads to improved prediction results for many RNAs. This is clearly visible in the predictions for our benchmark data set (see Fig. 1, and Supplementary Material). However, for some of the RNAs within our benchmark data the additional pseudo-energy terms impair prediction results. This may be due to two factors. First, experimental data always come with a certain inaccuracy. Second, the underlying energy model excludes pseudoknotted structures, which are present in approximately half of the benchmarked RNAs. Additionally, pseudoknot interactions are reflected in the SHAPE data itself.

Fig. 1.

Secondary structure prediction of Escherichia coli 5S rRNA from our benchmark data set. (A) Structure reference, (B) prediction by RNAfold with default parameters and (C) prediction by RNAfold with guiding pseudo-energies obtained from SHAPE reactivity data using RNApvmin. Structure plots created using the forna Web server (Kerpedjiev et al., 2015). White nucleotides correspond to missing SHAPE reactivity data.

Incorporation of probing data affects not only the minimum free energy structure but also the entire ensemble of structures. Consequently, the predicted pairing probabilities are shifted towards the observed reactivity pattern. However, the effect is less distinct in the model of Washietl et al. (2012) (see Supplementary Fig. S11). While the method of Deigan et al. (2008) has the best average performance on our data, none of the three approaches consistently outperforms the others. In addition to the benchmark data, we use an artificially designed theophylline sensing riboswitch to compare the three SHAPE conversion methods with a prediction that directly includes the ligand binding free energy of the aptamer (see Supplementary Material S5).
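
The effect of pseudo-energies on the whole ensemble, rather than on a single optimal structure, can be illustrated with a toy calculation. The sketch below is not part of the ViennaRNA Package: it enumerates a hand-picked set of structures instead of using the folding recursions, and the two structures, their energies and the pseudo-energy values are made up for illustration.

```python
import math

RT = 0.61632  # kcal/mol at 37 degrees C (gas constant times 310.15 K)


def unpaired_fraction(structures, energies, pseudo=None):
    """Boltzmann-weighted probability that each position is unpaired.

    structures: dot-bracket strings of equal length
    energies:   free energies in kcal/mol, one per structure
    pseudo:     optional per-position pseudo-energies (kcal/mol) that are
                added for every unpaired nucleotide ('.') of a structure
    """
    n = len(structures[0])
    pseudo = pseudo or [0.0] * n
    weights = []
    for s, e in zip(structures, energies):
        e_total = e + sum(pseudo[i] for i, c in enumerate(s) if c == ".")
        weights.append(math.exp(-e_total / RT))
    z = sum(weights)  # partition function of the toy ensemble
    return [
        sum(w for s, w in zip(structures, weights) if s[i] == ".") / z
        for i in range(n)
    ]


# Two hypothetical, equally stable alternative helices
structures = ["((....))..", "..((....))"]
energies = [-1.0, -1.0]

# Hypothetical probing signal: positions 0 and 1 look paired, so being
# unpaired there is penalized by +1 kcal/mol each
pseudo = [1.0, 1.0] + [0.0] * 8

print(unpaired_fraction(structures, energies))
print(unpaired_fraction(structures, energies, pseudo))
```

Without pseudo-energies both helices are equally populated; with the penalty, the probability of positions 0 and 1 being unpaired drops, because the structure that leaves them unpaired loses Boltzmann weight.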

References

1. Katherine E Deigan; Tian W Li; David H Mathews; Kevin M Weeks. Accurate SHAPE-directed RNA structure determination. Proc Natl Acad Sci U S A, 2008-12-24.

2. Christine E Hajdin; Stanislav Bellaousov; Wayne Huggins; Christopher W Leonard; David H Mathews; Kevin M Weeks. Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots. Proc Natl Acad Sci U S A, 2013-03-15.

3. Stefanie A Mortimer; Mary Anne Kidwell; Jennifer A Doudna. Insights into RNA structure and function from genome-wide studies (Review). Nat Rev Genet, 2014-05-13.

4. David H Mathews; Matthew D Disney; Jessica L Childs; Susan J Schroeder; Michael Zuker; Douglas H Turner. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc Natl Acad Sci U S A, 2004-05-03.

5. Sean R Eddy. Computational analysis of conserved RNA secondary structure in transcriptomes and genomes (Review). Annu Rev Biophys, 2014.

6. Ronny Lorenz; Stephan H Bernhart; Christian Höner Zu Siederdissen; Hakim Tafer; Christoph Flamm; Peter F Stadler; Ivo L Hofacker. ViennaRNA Package 2.0. Algorithms Mol Biol, 2011-11-24.

7. Stefan Washietl; Ivo L Hofacker; Peter F Stadler; Manolis Kellis. RNA folding with soft constraints: reconciliation of probing data and thermodynamic secondary structure prediction. Nucleic Acids Res, 2012-01-28.

8. Andreas R Gruber; Ronny Lorenz; Stephan H Bernhart; Richard Neuböck; Ivo L Hofacker. The Vienna RNA websuite. Nucleic Acids Res, 2008-04-19.

9. Peter Kerpedjiev; Stefan Hammer; Ivo L Hofacker. Forna (force-directed RNA): Simple and effective online RNA secondary structure diagrams. Bioinformatics, 2015-06-22.

10. Kourosh Zarringhalam; Michelle M Meyer; Ivan Dotu; Jeffrey H Chuang; Peter Clote. Integrating chemical footprinting data into RNA secondary structure prediction. PLoS One, 2012-10-16.
