Random generation of RNA secondary structures according to native distributions.

Markus E Nebel, Anika Scheid, Frank Weinberg.

Abstract

BACKGROUND: Random biological sequences are a topic of great interest in genome analysis since, according to a powerful paradigm, they represent the background noise from which the actual biological information must differentiate. Accordingly, the generation of random sequences has been investigated for a long time. Similarly, random objects of a more complicated structure, such as RNA molecules or proteins, are of interest.
RESULTS: In this article, we present a new general framework for deriving algorithms for the non-uniform random generation of combinatorial objects according to the encoding and probability distribution implied by a stochastic context-free grammar. Briefly, the framework extends the well-known recursive method for (uniform) random generation and uses the popular framework of admissible specifications of combinatorial classes, introducing weighted combinatorial classes to allow for non-uniform generation by means of unranking. This framework is used to derive an algorithm for the generation of RNA secondary structures of a given fixed size. We address the random generation of these structures according to a realistic distribution obtained from real-life data by using a very detailed context-free grammar (one that models the class of RNA secondary structures by distinguishing between all known motifs in RNA structure). Compared to well-known sampling approaches used in several structure prediction tools (such as SFold), ours has two major advantages. Firstly, after a preprocessing step in time O(n^2) for the computation of all weighted class sizes needed, our approach computes a set of m random secondary structures of a given structure size n in worst-case time O(m·n·log(n)), while other algorithms typically have a runtime in O(m·n^2). Secondly, our approach works with integer arithmetic only, which is faster and saves us from all the discomforting details of using floating point arithmetic with logarithmized probabilities.
CONCLUSION: A number of experimental results show that our random generation method produces realistic output, at least with respect to the appearance of the different structural motifs. The algorithm is available as a webservice at http://wwwagak.cs.uni-kl.de/NonUniRandGen and can be used for generating random secondary structures of any specified RNA type. A link to download an implementation of our method (in Wolfram Mathematica) can be found there, too.
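
The idea of weighted class sizes plus unranking with integer arithmetic can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy grammar S → ε | '.' S | '(' S ')' S over dot-bracket strings instead of the detailed motif grammar of the article, the integer weights W_UNPAIRED and W_PAIR are hypothetical, and the split position is found by a plain sequential scan, so the sketch does not reach the O(n·log(n)) bound per structure stated in the abstract.

```python
import random

# Hypothetical integer weights (not taken from the article).
W_UNPAIRED = 3  # weight contributed by each unpaired base '.'
W_PAIR = 2      # weight contributed by each base pair '(' ... ')'


def weighted_sizes(n_max):
    """Return size[0..n_max], where size[n] is the total weight of all
    structures of length n. Preprocessing takes O(n_max^2) integer operations."""
    size = [1] + [0] * n_max  # the empty structure has weight 1
    for n in range(1, n_max + 1):
        total = W_UNPAIRED * size[n - 1]          # rule S -> '.' S
        for k in range(n - 1):                    # rule S -> '(' S ')' S
            total += W_PAIR * size[k] * size[n - 2 - k]
        size[n] = total
    return size


def unrank(n, rank, size):
    """Map an integer rank in [0, size[n]) to a dot-bracket string of length n.
    Each structure receives as many ranks as its weight."""
    if n == 0:
        return ""
    block = W_UNPAIRED * size[n - 1]
    if rank < block:
        return "." + unrank(n - 1, rank // W_UNPAIRED, size)
    rank -= block
    for k in range(n - 1):  # k = length enclosed by the first base pair
        rest = n - 2 - k
        block = W_PAIR * size[k] * size[rest]
        if rank < block:
            inner, outer = divmod(rank // W_PAIR, size[rest])
            return "(" + unrank(k, inner, size) + ")" + unrank(rest, outer, size)
        rank -= block
    raise ValueError("rank out of range")


def sample(n, m, seed=None):
    """Draw m structures of length n with probability proportional to weight."""
    rng = random.Random(seed)
    size = weighted_sizes(n)  # computed once, reused for all m samples
    return [unrank(n, rng.randrange(size[n]), size) for _ in range(m)]
```

Calling `sample(30, 5)` returns five dot-bracket strings of length 30. Because a uniformly drawn rank is decomposed with integer division only, no floating point probabilities are involved. The toy grammar also admits empty hairpin loops such as "()", which a realistic grammar would exclude.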

Year: 2011; PMID: 21992500; PMCID: PMC3354341; DOI: 10.1186/1748-7188-6-24

Source DB: PubMed; Journal: Algorithms Mol Biol; ISSN: 1748-7188; Impact factor: 1.405
