Jens Reeder [1], Robert Giegerich. [1] Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany. jreeder@TechFak.Uni-Bielefeld.DE
Abstract
MOTIVATION: The well-known Sankoff algorithm for simultaneous RNA sequence alignment and folding is currently considered an ideal, but computationally over-expensive method. Available tools implement this algorithm under various pragmatic restrictions. They are still expensive to use, and it is difficult to judge whether the moderate quality of results is due to the underlying model or to its imperfect implementation. RESULTS: We propose to redefine the consensus structure prediction problem in a way that does not imply a multiple sequence alignment step. For a family of RNA sequences, our method explicitly and independently enumerates the near-optimal abstract shape space of each sequence, and predicts as the consensus an abstract shape common to all sequences. For each sequence, it delivers the thermodynamically best structure that has this common shape. Since the shape space is much smaller than the structure space, and identification of common shapes can be done in linear time (in the number of shapes considered), the method is essentially linear in the number of sequences. Our evaluation shows that the new method compares favorably with available alternatives. AVAILABILITY: The new method has been implemented in the program RNAcast and is available on the Bielefeld Bioinformatics Server. CONTACT: jreeder@TechFak.Uni-Bielefeld.DE, robert@TechFak.Uni-Bielefeld.DE SUPPLEMENTARY INFORMATION: Available at http://bibiserv.techfak.uni-bielefeld.de/rnacast/supplementary.html
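The consensus step described in the abstract can be illustrated with a minimal Python sketch. It is not the RNAcast implementation. It assumes that the near-optimal shapes of each sequence have already been enumerated by some other tool and are supplied as a table mapping each shape string to the free energy and dot-bracket string of the best structure with that shape. The function name `consensus_shapes`, the input format, the toy data, and the ranking of common shapes by summed free energy are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: for one sequence, a map from abstract shape string
# to (free energy in kcal/mol, best dot-bracket structure having that shape).
ShapeTable = Dict[str, Tuple[float, str]]


def consensus_shapes(tables: List[ShapeTable]) -> List[Tuple[str, float, List[str]]]:
    """Return the shapes common to all sequences, lowest summed energy first.

    Each result is (shape, summed energy, best structure per sequence).
    Ranking by summed energy is an assumption of this sketch.
    """
    if not tables:
        return []
    # Hash-based intersection: one pass over the shapes of each sequence.
    common = set(tables[0])
    for table in tables[1:]:
        common.intersection_update(table)
    ranked = []
    for shape in common:
        total = sum(table[shape][0] for table in tables)
        structures = [table[shape][1] for table in tables]
        ranked.append((shape, total, structures))
    # Sort by summed energy; break ties by shape string for determinism.
    ranked.sort(key=lambda item: (item[1], item[0]))
    return ranked


# Toy data, invented for illustration only (not real folding results).
seq_a: ShapeTable = {
    "[]": (-5.0, "((...))"),
    "[][]": (-4.2, "(...)(...)"),
}
seq_b: ShapeTable = {
    "[]": (-3.0, "(((...)))"),
    "[[][]]": (-6.1, "((...)(...))"),
    "[][]": (-4.5, "(...).(...)"),
}

if __name__ == "__main__":
    for shape, total, structures in consensus_shapes([seq_a, seq_b]):
        print(shape, round(total, 2), structures)
```

Because the intersection works on hashed shape strings, its cost grows with the number of shapes considered per sequence, which matches the linear-time argument in the abstract. A shape missing from any one sequence is dropped, so the result is empty when the sequences share no near-optimal shape.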
Authors: Ebbe S Andersen; Allan Lind-Thomsen; Bjarne Knudsen; Susie E Kristensen; Jakob H Havgaard; Elfar Torarinsson; Niels Larsen; Christian Zwieb; Peter Sestoft; Jørgen Kjems; Jan Gorodkin Journal: RNA Date: 2007-09-05 Impact factor: 4.942
Authors: Yelena V Lerman; Scott D Kennedy; Neelaabh Shankar; Marc Parisien; Francois Major; Douglas H Turner Journal: RNA Date: 2011-07-21 Impact factor: 4.942
Authors: Roman R Stocsits; Harald Letsch; Jana Hertel; Bernhard Misof; Peter F Stadler Journal: Nucleic Acids Res Date: 2009-09-01 Impact factor: 16.971