
INFO-RNA--a server for fast inverse RNA folding satisfying sequence constraints.

Abstract

INFO-RNA is a new web server for designing RNA sequences that fold into a user-given secondary structure. Furthermore, constraints on the sequence can be specified, e.g. one can restrict sequence positions to a fixed nucleotide or to a set of nucleotides. Moreover, the user can allow violations of the constraints at some positions, which can be advantageous in complicated cases. The INFO-RNA web server allows biologists to design RNA sequences in an automatic manner. It is clearly and intuitively arranged and easy to use. The procedure is fast, as most applications are completed within seconds, and it performs better and faster than other existing tools. The INFO-RNA web server is freely available at http://www.bioinf.uni-freiburg.de/Software/INFO-RNA/

Substances: RNA

Year: 2007 PMID: 17452349 PMCID: PMC1933236 DOI: 10.1093/nar/gkm218

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The function of RNA molecules often depends on both the primary sequence and the secondary structure. RNAs are involved in translation (tRNA, rRNA), splicing (snRNA), processing of other RNAs (snoRNA, RNase P) and regulatory processes (miRNA, siRNA) (1). Furthermore, parts of mRNAs can adopt structures that regulate their own translation (SECIS (2,3), IRE (4)). Since prediction and experimental determination of 3D RNA structures remain difficult, much work focuses on problems associated with the secondary structure, which is the set of base pairs. The problem of predicting the secondary structure of an RNA is called the ‘RNA folding problem’. Existing computational approaches are based on a thermodynamic model that gives a free energy value for each secondary structure (5). The structure with the lowest free energy [called the ‘minimum free energy (mfe) structure’] is expected to be the most stable one. Here, we consider the ‘inverse RNA folding problem satisfying sequence constraints’, which is the design of RNA sequences that fold into a desired structure and fulfill some given constraints on the primary sequence. These constraints can restrict certain positions to fixed nucleotides or to a fixed set of nucleotides. The INFO-RNA web server is applicable to the design of RNA elements that include conserved nucleotides, which are essential for binding of proteins.
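
The two problems described above can be written compactly. The notation below is an assumption introduced only for illustration and does not appear in the original text: x is a sequence of length n, E(x, S) is the free energy of x folded into structure S, T is the target structure, and C_i is the set of nucleotides allowed at position i.

```latex
% RNA folding: the mfe structure of a given sequence x
S_{\mathrm{mfe}}(x) \;=\; \operatorname*{arg\,min}_{S} \; E(x, S)

% Inverse RNA folding with sequence constraints: find a sequence x with
x_i \in C_i \subseteq \{\mathrm{A}, \mathrm{C}, \mathrm{G}, \mathrm{U}\}
\quad (1 \le i \le n)
\qquad \text{such that} \qquad
S_{\mathrm{mfe}}(x) = T
```

A position fixed to a single nucleotide corresponds to a set C_i with one element, and an unconstrained position to the full alphabet.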

METHODS AND USAGE

The INFO-RNA server uses a new algorithm for the INverse FOlding of RNA that involves two steps. The first step is a new design method for good initial sequences. It is followed by an improved stochastic local search. Both steps are described briefly below and in more detail in (6).

The initializing step

The input of the algorithm consists of the target structure. During the first step of INFO-RNA, a dynamic programming approach designs an RNA sequence that adopts the lowest energy a sequence can have when folding into the target structure. However, this sequence is not guaranteed to fold into the target structure since this sequence can have another mfe structure. Therefore, the resulting sequence is processed further in a second step.

The local search step

To improve the quality of the sequence generated in the first step, local sequence mutations are made iteratively. In INFO-RNA, this is done by a ‘stochastic local search’ (SLS) that minimizes the structure distance between the mfe structure of the designed sequence and the target structure. Here, sequence neighbors are tested either in a random order or in an order that depends on the energy difference between the current sequence and the neighbor sequence when folding into the target structure. The higher the difference is, the earlier the mutation is examined. Optionally, the probability of folding into the wanted structure can be optimized as well.
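
The two examination orders described above can be illustrated with a minimal Python sketch. It is not the INFO-RNA implementation: the function names, the single-point mutation model and the use of the absolute energy difference are assumptions made for illustration, and `toy_energy` is a hypothetical stand-in for the free energy of a sequence folded into the target structure.

```python
import random
from typing import Callable, Iterable, List, Optional

NUCLEOTIDES = "ACGU"


def ordered_neighbors(
    seq: str,
    positions: Iterable[int],
    energy_in_target: Callable[[str], float],
    random_order: bool = False,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return single-point mutants of seq in the order they would be examined.

    positions are 0-based indices that may be mutated. energy_in_target
    evaluates a sequence on the target structure (supplied by the caller).
    """
    current = energy_in_target(seq)
    scored = []
    for i in positions:
        for nt in NUCLEOTIDES:
            if nt == seq[i]:
                continue
            mutant = seq[:i] + nt + seq[i + 1:]
            # Assumption: rank by the absolute energy difference.
            diff = abs(current - energy_in_target(mutant))
            scored.append((diff, mutant))
    if random_order:
        (rng or random).shuffle(scored)
    else:
        # Larger difference first; ties keep their generation order.
        scored.sort(key=lambda item: item[0], reverse=True)
    return [mutant for _, mutant in scored]


def toy_energy(seq: str) -> float:
    """Hypothetical energy stand-in: G and C contribute most, A nothing."""
    weights = {"G": -2.0, "C": -2.0, "U": -1.0, "A": 0.0}
    return sum(weights[nt] for nt in seq)


if __name__ == "__main__":
    print(ordered_neighbors("AAAA", [0], toy_energy))
```

In a real search, each neighbor in this list would be folded and kept only if the distance between its mfe structure and the target structure decreases.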

Novel extensions of the algorithm

In an extension to (6), the INFO-RNA web server can handle a set of user-given constraints on the primary sequence. These constraints have to be fulfilled during both steps of the algorithm. That is, after finishing the initializing step, we get a sequence that adopts the target structure with the lowest energy that is possible if the constraints are fulfilled. During the local search step, only mutations that coincide with the constraints are valid. If the constraints on the sequence are not strictly fixed, the user can specify some positions where violations of the constraints are allowed. Furthermore, the user can restrict the maximal number of constraints that are violated in the final sequence (Vmax). This might be useful if one allows violations of two different constraints but wants at most one of these violations in the designed RNA sequence. Finally, the INFO-RNA server outputs the best-found RNA sequence satisfying the sequence constraints with at most Vmax violations.
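
The acceptance rule described above (violations only at user-selected positions, and at most Vmax of them) can be sketched as follows. This is an illustrative check, not the server's code: the function names and the 0-based position convention are assumptions, while the IUPAC table is the standard nucleotide ambiguity code written for RNA.

```python
from typing import List, Set

# Standard IUPAC nucleotide codes, with U in place of T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def violated_positions(seq: str, constraints: str) -> List[int]:
    """Return the 0-based positions where seq breaks its IUPAC constraint."""
    if len(seq) != len(constraints):
        raise ValueError("sequence and constraint string differ in length")
    return [
        i
        for i, (nt, code) in enumerate(zip(seq, constraints))
        if nt not in IUPAC[code]
    ]


def is_acceptable(
    seq: str, constraints: str, may_violate: Set[int], v_max: int
) -> bool:
    """True if every violation is at an allowed position and the total
    number of violations does not exceed v_max."""
    violated = violated_positions(seq, constraints)
    return all(i in may_violate for i in violated) and len(violated) <= v_max


if __name__ == "__main__":
    # Two positions may be violated, but at most one violation is accepted.
    print(is_acceptable("GAAGU", "NNYGN", {2, 3}, 1))  # one violation
    print(is_acceptable("GAAAU", "NNYGN", {2, 3}, 1))  # two violations
```

The example mirrors the case mentioned in the text: violations are allowed at two positions, but a sequence that violates both is rejected because v_max is 1.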

Usage

The INFO-RNA web server is clearly and intuitively arranged. In order to obtain an RNA sequence folding into a target structure and satisfying some sequence constraints, both (structure and sequence constraints) have to be given. The structure has to be given in bracket notation. Here, a base pair between bases i and j is represented by a ‘(’ at the i-th position and a ‘)’ at position j. Unpaired bases are represented by dots. The sequence constraints have to be entered in IUPAC symbols, where e.g. restricting a position to Y means that a C or a U is allowed there. In addition, the user can choose some positions where the constraints are allowed to be violated during the local search. Besides, the maximal number of positions where the constraints are allowed to be violated in the final sequence can be specified. Furthermore, the user can fix some parameters used during the stochastic local search, e.g. the search strategy (either only minimizing the structure distance or additionally maximizing the folding probability) as well as the search order of the sequence positions. Finally, the user can choose whether the results are shown on the web page or sent via email. For all options, comprehensive help and detailed examples are given. Figure 1 shows the output of a typical computation. First, the input data are summarized. Below, the designed sequence is shown including information about its mfe structure, its free energy and its folding probability. Additionally, the user can download the results in FASTA, CT and RNAML format.
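
The bracket notation used for the structure input can be decoded with a simple stack. The sketch below is only an illustration of the notation; the function name is hypothetical and the code is not taken from the server. Positions are 1-based, matching the description of bases i and j above.

```python
from typing import List, Tuple


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return the base pairs (i, j), 1-based with i < j, encoded by a
    bracket-notation string made of '(', ')' and '.'."""
    open_positions: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for j, symbol in enumerate(structure, start=1):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {j}")
            # The most recent unmatched '(' pairs with this ')'.
            pairs.append((open_positions.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    # A small hairpin: a stem of two base pairs closing a loop of three bases.
    print(parse_dot_bracket("((...))"))
```

For the hairpin in the usage example, position 1 pairs with position 7 and position 2 with position 6, while positions 3 to 5 are unpaired.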

Figure 1.

INFO-RNA web server output. The figure shows the output of a typical computation (design of an IRE with fixed bases in the interior and hairpin loop and a maximum of two constraint violations at three possible sequence positions).

RESULTS AND APPLICATION

The INFO-RNA web server allows biologists to design RNA sequences that fold into a given structure in an automatic manner. The procedure is fast, as most applications are completed within seconds. As shown in (6), INFO-RNA (not considering sequence constraints) performs better and faster than other existing tools. Artificial as well as biological test sets were analyzed. The biological test sets are divided into computationally predicted structures for known RNA sequences and structures from the biological literature. INFO-RNA turned out to be the algorithm with the highest success rates as well as the lowest computation times for all test sets. Additional stability tests showed that the designed sequences are more stable than the biological ones.

The novel extension of INFO-RNA including sequence constraints allows the design of cis-acting mRNA elements such as the ‘iron responsive element’ (IRE) and the ‘polyadenylation inhibition element’ (PIE). Both elements have conserved sequence positions in loops. The IRE is essential for the expression of proteins that are involved in the iron metabolism (7). It consists of a stem-loop structure, and the first five nucleotides in the hairpin loop as well as the bulged nucleotides were found to be essential for binding of iron-regulatory proteins. The PIE contains two binding sites for U1A proteins (8). It consists of a stem structure with two asymmetric internal loops that serve as U1A-binding sites (Figure 2). Using the INFO-RNA web server, we designed artificial IREs and PIEs having a much higher folding probability compared to natural elements. While designed sequences for the IRE having a single C bulge fold into the target structure with an average probability of 88%, natural sequences do so only with an average probability of 15%. Regarding IREs having an interior loop with left size 3 and right size 1, the results are similar. Furthermore, the average probability of the designed PIE sequences folding into the target structure is more than 20 times higher than the probability of the natural PIE sequences (Supplementary Figure 1). Besides, all IREs designed by the INFO-RNA web server adopt the wanted structure as their mfe structure, whereas only a small fraction of the natural ones do (Supplementary Figure 2).

Figure 2.

Structures and conserved sequence positions of a PIE. The figure shows the consensus structure and conserved sequence positions of a PIE that contains two asymmetrical internal loops as binding sites for U1A proteins (U1A-PIE). Conserved sequence positions are highlighted in gray.

Furthermore, we demonstrated the usability of the INFO-RNA web server by designing artificial microRNA (miRNA) precursors that are as stable as possible. To this end, artificial miRNA sequences published in (9) were used. Applying the INFO-RNA web server, we designed precursors of these artificial miRNAs as well as of the natural miRNA. All of the designed sequences have a free energy that is at least twice as low as the free energy of the natural precursor sequences. On average, their probability of folding into the target miRNA precursor structure is five times as high as the folding probability of the natural precursor sequences. For more details see Supplementary Table 1. Other potential application areas are the design of ribozymes and riboswitches (10), which may be used in research and medicine, and the design of non-coding RNAs, which are involved in a large variety of processes, e.g. gene regulation, chromosome replication and RNA modification (11).

DISCUSSION

We have shown that the INFO-RNA web server is a very fast and successful tool for designing RNA sequences that fold into a given structure and fulfill some sequence constraints. The core of the algorithm was introduced in (6). There, we already showed that INFO-RNA (not considering sequence constraints) performs better and faster than other existing tools. Here, we have demonstrated that the INFO-RNA web server, which can handle additional constraints on the primary sequence, also performs well and quickly. Most of the sequences designed by the INFO-RNA web server are highly stable and have very low free energy. This might result from the high GC content that most of the sequences show, since G–C base pairs are energetically most favorable. It is not clear whether such highly stable structures are always advantageous or how the high GC content may influence the kinetics of the folding process. To reduce the GC content, the user can constrain some positions to A and/or U. In the future, it is desirable to extend the algorithm to allow the user to specify the GC content.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

11 in total

1. The NMR structure of the 38 kDa U1A protein - PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein.

Authors: L Varani; S I Gunderson; I W Mattaj; L E Kay; D Neuhaus; G Varani
Journal: Nat Struct Biol Date: 2000-04

2. An expanding universe of noncoding RNAs.

Authors: Gisela Storz
Journal: Science Date: 2002-05-17 Impact factor: 47.728

Review 3. RNomics: identification and function of small, non-messenger RNAs.

Authors: Alexander Hüttenhofer; Jürgen Brosius; Jean Pierre Bachellerie
Journal: Curr Opin Chem Biol Date: 2002-12 Impact factor: 8.822

4. Gene regulation: switched on to RNA.

Authors: Jonathan Knight
Journal: Nature Date: 2003-09-18 Impact factor: 49.962

5. INFO-RNA--a fast approach to inverse RNA folding.

Authors: Anke Busch; Rolf Backofen
Journal: Bioinformatics Date: 2006-05-18 Impact factor: 6.937

6. Highly specific gene silencing by artificial microRNAs in Arabidopsis.

Authors: Rebecca Schwab; Stephan Ossowski; Markus Riester; Norman Warthmann; Detlef Weigel
Journal: Plant Cell Date: 2006-03-10 Impact factor: 11.277

7. Prediction of RNA secondary structure by energy minimization.

Authors: M Zuker
Journal: Methods Mol Biol Date: 1994

8. Structure and dynamics of the iron responsive element RNA: implications for binding of the RNA by iron regulatory binding proteins.

Authors: K J Addess; J P Basilion; R D Klausner; T A Rouault; A Pardi
Journal: J Mol Biol Date: 1997-11-21 Impact factor: 5.469

9. The nature of the minimal 'selenocysteine insertion sequence' (SECIS) in Escherichia coli.

Authors: Z Liu; M Reches; I Groisman; H Engelberg-Kulka
Journal: Nucleic Acids Res Date: 1998-02-15 Impact factor: 16.971

Review 10. Molecular control of vertebrate iron metabolism: mRNA-based regulatory circuits operated by iron, nitric oxide, and oxidative stress.

Authors: M W Hentze; L C Kühn
Journal: Proc Natl Acad Sci U S A Date: 1996-08-06 Impact factor: 11.205
