Robert K Bradley (1), Lior Pachter, Ian Holmes.
(1) Biophysics Graduate Group, Department of Mathematics and Department of Bioengineering, University of California, Berkeley, CA 94720, USA.
Abstract
MOTIVATION: Whole-genome screens suggest that eukaryotic genomes are dense with non-coding RNAs (ncRNAs). We introduce a novel approach to RNA multiple alignment which couples a generative probabilistic model of sequence and structure with an efficient sequence annealing approach for exploring the space of multiple alignments. This leads to a new software program, Stemloc-AMA, that is both accurate and specific in the alignment of multiple related RNA sequences.
RESULTS: When tested on the benchmark datasets BRalibase II and BRalibase 2.1, Stemloc-AMA has comparable sensitivity to and better specificity than the best competing methods. We use a large-scale random sequence experiment to show that while most alignment programs maximize sensitivity at the expense of specificity, even to the point of giving complete alignments of non-homologous sequences, Stemloc-AMA aligns only sequences with detectable homology and leaves unrelated sequences largely unaligned. Such accurate and specific alignments are crucial for comparative-genomics analysis, from inferring phylogeny to estimating substitution rates across different lineages.
AVAILABILITY: Stemloc-AMA is available from http://biowiki.org/StemLocAMA as part of the dart software package for sequence analysis.
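The trade-off between sensitivity and specificity described in the abstract can be made concrete with a small sketch. It assumes one common convention for scoring alignments, which the abstract itself does not spell out: sensitivity is the fraction of residue pairs aligned in a reference alignment that the predicted alignment recovers, and specificity is the fraction of predicted aligned pairs that also appear in the reference (a positive predictive value). The function names `aligned_pairs` and `sensitivity_and_ppv` are hypothetical and are not part of Stemloc-AMA or dart.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    `alignment` maps sequence name -> gapped row; all rows must have equal length.
    Each residue is identified as (sequence name, ungapped position).
    """
    names = sorted(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("all alignment rows must have the same length")
    n_cols = lengths.pop() if lengths else 0

    next_pos = {name: 0 for name in names}  # ungapped position counter per row
    pairs = set()
    for col in range(n_cols):
        present = []
        for name in names:
            if alignment[name][col] != gap:
                present.append((name, next_pos[name]))
                next_pos[name] += 1
        # Names are visited in sorted order, so each pair has a canonical form.
        pairs.update(combinations(present, 2))
    return pairs


def sensitivity_and_ppv(predicted, reference):
    """Score a predicted alignment against a reference alignment.

    Assumed definitions (not taken from the source):
      sensitivity = correct pairs / pairs in reference
      specificity (PPV) = correct pairs / pairs in prediction
    An empty denominator is scored as 1.0 by convention.
    """
    pred = aligned_pairs(predicted)
    ref = aligned_pairs(reference)
    correct = len(pred & ref)
    sensitivity = correct / len(ref) if ref else 1.0
    ppv = correct / len(pred) if pred else 1.0
    return sensitivity, ppv


if __name__ == "__main__":
    reference = {"a": "AC-GU", "b": "ACUGU"}
    shifted = {"a": "ACG-U", "b": "ACUGU"}          # one residue pair misplaced
    unaligned = {"a": "ACGU-----", "b": "----ACUGU"}  # no residues share a column
    print(sensitivity_and_ppv(shifted, reference))
    print(sensitivity_and_ppv(unaligned, reference))
```

Under these assumed definitions, an aligner that leaves sequences unaligned makes no false-positive pairings, whereas one that always emits a complete alignment risks low specificity on non-homologous input.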
Authors: J Graham Ruby; Alexander Stark; Wendy K Johnston; Manolis Kellis; David P Bartel; Eric C Lai. Journal: Genome Res. Date: 2007-11-07. Impact factor: 9.043.
Authors: Sebastian Will; Kristin Reiche; Ivo L Hofacker; Peter F Stadler; Rolf Backofen. Journal: PLoS Comput Biol. Date: 2007-02-22. Impact factor: 4.475.