
Fragment assembly with double-barreled data.

P A Pevzner, H Tang.

Abstract

For the last twenty years, fragment assembly has been dominated by the "overlap - layout - consensus" algorithms that are used in all currently available assembly tools. However, the limits of these algorithms are being tested in the era of genomic sequencing, and it is not clear whether they are the best choice for large-scale assemblies. Although the "overlap - layout - consensus" approach proved to be useful in assembling clones, it faces difficulties in genomic assemblies: the existing algorithms make assembly errors even in bacterial genomes. We abandoned the "overlap - layout - consensus" approach in favour of a new Eulerian Superpath approach that outperforms the existing algorithms for genomic fragment assembly (Pevzner et al. 2001, In Proceedings of the Fifth Annual International Conference on Computational Molecular Biology (RECOMB-01), 256-26). In this paper we describe our new EULER-DB algorithm that, similarly to the Celera assembler, takes advantage of clone-end sequencing by using the double-barreled data. However, in contrast to the Celera assembler, EULER-DB does not mask repeats but instead uses them as a powerful tool for contig ordering. We also describe a new approach to the Copy Number Problem: "How many times is a given repeat present in the genome?" For long nearly-perfect repeats this question is notoriously difficult, and some copies of such repeats may be "lost" in genomic assemblies. We describe our EULER-CN algorithm for the Copy Number Problem, which proved to be successful in difficult sequencing projects.

Year:  2001        PMID: 11473013     DOI: 10.1093/bioinformatics/17.suppl_1.s225

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  33 in total

1.  Hierarchical scaffolding with Bambus.

Authors:  Mihai Pop; Daniel S Kosack; Steven L Salzberg
Journal:  Genome Res       Date:  2004-01       Impact factor: 9.043

2.  De novo repeat classification and fragment assembly.

Authors:  Pavel A Pevzner; Haixu Tang; Glenn Tesler
Journal:  Genome Res       Date:  2004-09       Impact factor: 9.043

3.  New Generations: Sequencing Machines and Their Computational Challenges.

Authors:  David C Schwartz; Michael S Waterman
Journal:  J Comput Sci Technol       Date:  2010-01-01       Impact factor: 1.571

4.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

5.  Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

Authors:  Paul Medvedev; Son Pham; Mark Chaisson; Glenn Tesler; Pavel Pevzner
Journal:  J Comput Biol       Date:  2011-10-14       Impact factor: 1.479

6.  Optimization of de novo transcriptome assembly from next-generation sequencing data.

Authors:  Yann Surget-Groba; Juan I Montoya-Burgos
Journal:  Genome Res       Date:  2010-08-06       Impact factor: 9.043

7.  Reconstructing cancer genomes from paired-end sequencing data.

Authors:  Layla Oesper; Anna Ritz; Sarah J Aerni; Ryan Drebin; Benjamin J Raphael
Journal:  BMC Bioinformatics       Date:  2012-04-19       Impact factor: 3.169

8.  Short read fragment assembly of bacterial genomes.

Authors:  Mark J Chaisson; Pavel A Pevzner
Journal:  Genome Res       Date:  2007-12-14       Impact factor: 9.043

9.  Genome assembly reborn: recent computational challenges.

Authors:  Mihai Pop
Journal:  Brief Bioinform       Date:  2009-05-29       Impact factor: 11.622

10.  De novo fragment assembly with short mate-paired reads: Does the read length matter?

Authors:  Mark J Chaisson; Dumitru Brinza; Pavel A Pevzner
Journal:  Genome Res       Date:  2008-12-03       Impact factor: 9.043
