Random generation of RNA secondary structures according to native distributions.

Markus E Nebel, Anika Scheid, Frank Weinberg.

Abstract

BACKGROUND: Random biological sequences are a topic of great interest in genome analysis since, according to a powerful paradigm, they represent the background noise from which the actual biological information must differentiate. Accordingly, the generation of random sequences has been investigated for a long time. Similarly, random objects of a more complicated structure, such as RNA molecules or proteins, are of interest.
RESULTS: In this article, we present a new general framework for deriving algorithms for the non-uniform random generation of combinatorial objects according to the encoding and probability distribution implied by a stochastic context-free grammar. Briefly, the framework extends the well-known recursive method for (uniform) random generation and uses the popular framework of admissible specifications of combinatorial classes, introducing weighted combinatorial classes to allow for non-uniform generation by means of unranking. This framework is used to derive an algorithm for the generation of RNA secondary structures of a given fixed size. We address the random generation of these structures according to a realistic distribution obtained from real-life data by using a very detailed context-free grammar (one that models the class of RNA secondary structures by distinguishing between all known motifs in RNA structure). Compared to well-known sampling approaches used in several structure prediction tools (such as SFold), ours has two major advantages. First, after a preprocessing step in time O(n^2) for the computation of all weighted class sizes needed, a set of m random secondary structures of a given structure size n can be computed in worst-case time O(m·n·log(n)), while other algorithms typically have a runtime in O(m·n^2). Second, our approach works with integer arithmetic only, which is faster and avoids the discomforting details of floating point arithmetic with logarithmized probabilities.
CONCLUSION: A number of experimental results show that our random generation method produces realistic output, at least with respect to the appearance of the different structural motifs. The algorithm is available as a webservice at http://wwwagak.cs.uni-kl.de/NonUniRandGen and can be used for generating random secondary structures of any specified RNA type. A link to download an implementation of our method (in Wolfram Mathematica) can also be found there.
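
The idea of weighted class sizes and unranking described under RESULTS can be illustrated with a minimal Python sketch. It is not the authors' algorithm or grammar: it assumes a toy unambiguous grammar S -> eps | '.' S | '(' S ')' S over dot-bracket strings, two made-up integer weights (W_UNPAIRED, W_PAIR), and hypothetical function names. It also searches the split position sequentially, so it does not reach the O(n·log(n)) bound per structure stated in the abstract. What it does show is the O(n^2) precomputation of weighted sizes and a generation step that uses integer arithmetic only.

```python
import random

# Hypothetical integer rule weights; the paper derives its weights from real data.
W_UNPAIRED = 2  # weight of the rule S -> '.' S
W_PAIR = 3      # weight of the rule S -> '(' S ')' S


def weighted_sizes(n_max):
    """Weighted class sizes W[0..n_max] for the toy grammar, in O(n_max^2) time."""
    W = [1] + [0] * n_max  # W[0] = 1: the empty structure
    for n in range(1, n_max + 1):
        total = W_UNPAIRED * W[n - 1]
        for k in range(n - 1):  # k = length enclosed by the first pair, 0..n-2
            total += W_PAIR * W[k] * W[n - 2 - k]
        W[n] = total
    return W


def unrank(n, r, W):
    """Map an integer rank r in [0, W[n]) to a structure of length n.

    A structure with weight w is hit by exactly w different ranks, so a
    uniformly drawn rank yields the weighted (non-uniform) distribution.
    Recursive, so only suitable for moderate n in this sketch.
    """
    if n == 0:
        return ""
    block = W_UNPAIRED * W[n - 1]
    if r < block:
        return "." + unrank(n - 1, r % W[n - 1], W)
    r -= block
    for k in range(n - 1):  # sequential search over split positions
        m = n - 2 - k
        block = W_PAIR * W[k] * W[m]
        if r < block:
            r %= W[k] * W[m]  # drop the rule weight, keep a uniform sub-rank
            return "(" + unrank(k, r // W[m], W) + ")" + unrank(m, r % W[m], W)
        r -= block
    raise ValueError("rank out of range")


def sample(n, count, W, rng=random):
    """Draw `count` structures of length n from the weighted distribution."""
    return [unrank(n, rng.randrange(W[n]), W) for _ in range(count)]
```

Calling `W = weighted_sizes(30)` once and then `sample(30, 5, W)` returns five dot-bracket strings of length 30. Because Python integers have arbitrary precision, the weighted sizes never overflow and no logarithmized probabilities are needed.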

Year: 2011; PMID: 21992500; PMCID: PMC3354341; DOI: 10.1186/1748-7188-6-24

Source DB: PubMed; Journal: Algorithms Mol Biol; ISSN: 1748-7188; Impact factor: 1.405

