
Biclustering as a method for RNA local multiple sequence alignment.

Shu Wang, Robin R Gutell, Daniel P Miranker.

Abstract

MOTIVATIONS: Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. At the same time, the grouping of the sequences can affect the alignment, which is precisely the kind of dual situation biclustering is intended to address.
RESULTS: We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension of that suite to larger problem sizes. In addition, alignments of two large datasets of current biological interest, T box sequences and Group IC1 Introns, were evaluated. The results were compared with alignments computed by the ClustalW, MAFFT, MUSCLE and PROBCONS alignment programs using Sum of Pairs Score (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or more sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions.
AVAILABILITY: BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/
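
The abstract names the Sum of Pairs Score (SPS) as an evaluation measure but does not define it. The sketch below is one common formulation, assumed here rather than taken from the paper: the fraction of residue pairs aligned in a reference alignment that are also aligned in the test alignment. The function names `aligned_pairs` and `sum_of_pairs_score` are hypothetical, and the scoring used for BlockMSA may differ in detail.

```python
from itertools import combinations

GAP = "-"


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (sequence index, ungapped position), so the
    same residue is recognised in alignments that place gaps differently.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    counters = [0] * len(alignment)  # next ungapped position per sequence
    pairs = set()
    width = len(alignment[0]) if alignment else 0
    for col in range(width):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != GAP:
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)


# Toy example (made-up sequences, not from the paper's datasets).
reference = ["ACGU", "AC-U"]
test = ["ACGU", "ACU-"]
score = sum_of_pairs_score(test, reference)
```

In the toy example the reference aligns three residue pairs and the test alignment reproduces two of them, so `score` is 2/3. A test alignment identical to the reference scores 1.0.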


Year:  2007        PMID: 17921494      PMCID: PMC2228335          DOI: 10.1093/bioinformatics/btm485

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


