An Eulerian path approach to global multiple alignment for DNA sequences.

Yu Zhang, Michael S Waterman.

Abstract

With the rapid increase in the dataset of genome sequences, the multiple sequence alignment problem is increasingly important and frequently involves the alignment of a large number of sequences. Many heuristic algorithms have been proposed to improve the speed of computation and the quality of alignment. We introduce a novel approach that is fundamentally different from all currently available methods. Our motivation comes from the Eulerian method for fragment assembly in DNA sequencing that transforms all DNA fragments into a de Bruijn graph and then reduces sequence assembly to an Eulerian path problem. The paper focuses on global multiple alignment of DNA sequences, where entire sequences are aligned into one configuration. Our main result is an algorithm with almost linear computational speed with respect to the total size (number of letters) of sequences to be aligned. Five hundred simulated sequences (averaging 500 bases per sequence and as low as 70% pairwise identity) have been aligned within three minutes on a personal computer, and the quality of alignment is satisfactory. As a result, accurate and simultaneous alignment of thousands of long sequences within a reasonable amount of time becomes possible. Data from an Arabidopsis sequencing project is used to demonstrate the performance.
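
As a rough illustration of the de Bruijn graph construction mentioned in the abstract, the sketch below turns a set of DNA sequences into a graph whose nodes are (k-1)-mers and whose edges are k-mers, counting how often each edge occurs. The function names (`de_bruijn_edges`, `shared_edges`), the choice k = 3, and the toy sequences are assumptions made for this example. It is not the authors' alignment algorithm, only the graph-building idea it draws on.

```python
from collections import defaultdict


def de_bruijn_edges(sequences, k):
    """Count every k-mer as an edge from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    edges = defaultdict(int)
    for seq in sequences:
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


def shared_edges(edges, min_count):
    """Keep only edges seen at least min_count times (hypothetical filter for illustration)."""
    return {edge: count for edge, count in edges.items() if count >= min_count}


# Toy input: three short sequences, two of them identical.
seqs = ["ACGTAC", "ACGTTC", "ACGTAC"]
graph = de_bruijn_edges(seqs, k=3)
common = shared_edges(graph, min_count=len(seqs))
```

In this toy input, the edges supported by all three sequences correspond to the shared prefix `ACGT`, while the edges with lower counts mark where the sequences diverge. The work is a single pass over every letter, which is consistent in spirit with the near-linear running time the abstract reports, though the abstract does not spell out the actual procedure.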

Year:  2003        PMID: 14980012     DOI: 10.1089/106652703322756096

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  5 in total

1.  Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

Authors:  Petr Pancoska; Zdenek Moravek; Ute M Moll
Journal:  Nucleic Acids Res       Date:  2004-08-27       Impact factor: 16.971

2.  De novo repeat classification and fragment assembly.

Authors:  Pavel A Pevzner; Haixu Tang; Glenn Tesler
Journal:  Genome Res       Date:  2004-09       Impact factor: 9.043

3.  A novel method for multiple alignment of sequences with repeated and shuffled elements.

Authors:  Benjamin Raphael; Degui Zhi; Haixu Tang; Pavel Pevzner
Journal:  Genome Res       Date:  2004-11       Impact factor: 9.043

4.  An Eulerian path approach to local multiple alignment for DNA sequences.

Authors:  Yu Zhang; Michael S Waterman
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-24       Impact factor: 11.205

5.  Plant Reactome: a knowledgebase and resource for comparative pathway analysis.

Authors:  Sushma Naithani; Parul Gupta; Justin Preece; Peter D'Eustachio; Justin L Elser; Priyanka Garg; Daemon A Dikeman; Jason Kiff; Justin Cook; Andrew Olson; Sharon Wei; Marcela K Tello-Ruiz; Antonio Fabregat Mundo; Alfonso Munoz-Pomer; Suhaib Mohammed; Tiejun Cheng; Evan Bolton; Irene Papatheodorou; Lincoln Stein; Doreen Ware; Pankaj Jaiswal
Journal:  Nucleic Acids Res       Date:  2020-01-08       Impact factor: 16.971
