
Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

Paul Medvedev, Son Pham, Mark Chaisson, Glenn Tesler, Pavel Pevzner.

Abstract

The recent proliferation of next generation sequencing with short reads has enabled many new experimental opportunities but, at the same time, has raised formidable computational challenges in genome assembly. One of the key advances that has led to an improvement in contig lengths has been mate pairs, which facilitate the assembly of repeating regions. Mate pairs have been algorithmically incorporated into most next generation assemblers as various heuristic post-processing steps to correct the assembly graph or to link contigs into scaffolds. Such methods have allowed the identification of longer contigs than would be possible with single reads; however, they can still fail to resolve complex repeats. Thus, improved methods for incorporating mate pairs will have a strong effect on contig length in the future. Here, we introduce the paired de Bruijn graph, a generalization of the de Bruijn graph that incorporates mate pair information into the graph structure itself instead of analyzing mate pairs at a post-processing step. This graph has the potential to be used in place of the de Bruijn graph in any de Bruijn graph based assembler, maintaining all other assembly steps such as error-correction and repeat resolution. Through assembly results on simulated perfect data, we argue that this can effectively improve the contig sizes in assembly.
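The abstract does not spell out the construction, so the following is only a minimal sketch of the idea under stated assumptions. It assumes error-free data, a circular genome, and mate pairs modelled as pairs of k-mers whose start positions are exactly `d` apart. The function names (`kmer_at`, `de_bruijn`, `paired_de_bruijn`, `branching_nodes`) and the toy genome are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def kmer_at(text, i, k):
    """Return the k-mer starting at position i of a circular string."""
    n = len(text)
    return "".join(text[(i + j) % n] for j in range(k))


def de_bruijn(text, k):
    """Ordinary de Bruijn graph: each k-mer is an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for i in range(len(text)):
        kmer = kmer_at(text, i, k)
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def paired_de_bruijn(text, k, d):
    """Paired variant (assumption: exact start-to-start distance d).
    Each edge is a pair of k-mers (a, b); it runs from the pair of
    prefixes to the pair of suffixes."""
    graph = defaultdict(list)
    for i in range(len(text)):
        a = kmer_at(text, i, k)
        b = kmer_at(text, i + d, k)
        graph[(a[:-1], b[:-1])].append((a[1:], b[1:]))
    return graph


def branching_nodes(graph):
    """Count nodes with more than one distinct successor."""
    return sum(1 for targets in graph.values() if len(set(targets)) > 1)


genome = "TACGAGACGC"  # toy circular genome; "ACG" occurs twice
print(branching_nodes(de_bruijn(genome, 3)))            # 2
print(branching_nodes(paired_de_bruijn(genome, 3, 3)))  # 0
```

In this toy case the repeated "ACG" creates two branching nodes in the ordinary graph, while pairing each k-mer with its mate tells the two copies apart and leaves a single unbranched cycle. Real mate pairs have variable insert sizes and sequencing errors, which this sketch ignores.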


Year:  2011        PMID: 21999285      PMCID: PMC3216098          DOI: 10.1089/cmb.2011.0151

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  20 in total

1.  Fragment assembly with double-barreled data.

Authors:  P A Pevzner; H Tang
Journal:  Bioinformatics       Date:  2001       Impact factor: 6.937

2.  Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

Authors:  Paul Medvedev; Son Pham; Mark Chaisson; Glenn Tesler; Pavel Pevzner
Journal:  J Comput Biol       Date:  2011-10-14       Impact factor: 1.479

3.  The fragment assembly string graph.

Authors:  Eugene W Myers
Journal:  Bioinformatics       Date:  2005-09-01       Impact factor: 6.937

4.  Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species.

Authors: 
Journal:  J Hered       Date:  2009-11-05       Impact factor: 2.645

5.  Human whole-genome shotgun sequencing.

Authors:  J L Weber; E W Myers
Journal:  Genome Res       Date:  1997-05       Impact factor: 9.043

6.  A new algorithm for DNA sequence assembly.

Authors:  R M Idury; M S Waterman
Journal:  J Comput Biol       Date:  1995       Impact factor: 1.479

7.  Toward simplifying and accurately formulating fragment assembly.

Authors:  E W Myers
Journal:  J Comput Biol       Date:  1995       Impact factor: 1.479

8.  ARACHNE: a whole-genome shotgun assembler.

Authors:  Serafim Batzoglou; David B Jaffe; Ken Stanley; Jonathan Butler; Sante Gnerre; Evan Mauceli; Bonnie Berger; Jill P Mesirov; Eric S Lander
Journal:  Genome Res       Date:  2002-01       Impact factor: 9.043

9.  ALLPATHS: de novo assembly of whole-genome shotgun microreads.

Authors:  Jonathan Butler; Iain MacCallum; Michael Kleber; Ilya A Shlyakhter; Matthew K Belmonte; Eric S Lander; Chad Nusbaum; David B Jaffe
Journal:  Genome Res       Date:  2008-03-13       Impact factor: 9.043

10.  Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays.

Authors:  Radoje Drmanac; Andrew B Sparks; Matthew J Callow; Aaron L Halpern; Norman L Burns; Bahram G Kermani; Paolo Carnevali; Igor Nazarenko; Geoffrey B Nilsen; George Yeung; Fredrik Dahl; Andres Fernandez; Bryan Staker; Krishna P Pant; Jonathan Baccash; Adam P Borcherding; Anushka Brownley; Ryan Cedeno; Linsu Chen; Dan Chernikoff; Alex Cheung; Razvan Chirita; Benjamin Curson; Jessica C Ebert; Coleen R Hacker; Robert Hartlage; Brian Hauser; Steve Huang; Yuan Jiang; Vitali Karpinchyk; Mark Koenig; Calvin Kong; Tom Landers; Catherine Le; Jia Liu; Celeste E McBride; Matt Morenzoni; Robert E Morey; Karl Mutch; Helena Perazich; Kimberly Perry; Brock A Peters; Joe Peterson; Charit L Pethiyagoda; Kaliprasad Pothuraju; Claudia Richter; Abraham M Rosenbaum; Shaunak Roy; Jay Shafto; Uladzislau Sharanhovich; Karen W Shannon; Conrad G Sheppy; Michel Sun; Joseph V Thakuria; Anne Tran; Dylan Vu; Alexander Wait Zaranek; Xiaodi Wu; Snezana Drmanac; Arnold R Oliphant; William C Banyai; Bruce Martin; Dennis G Ballinger; George M Church; Clifford A Reid
Journal:  Science       Date:  2009-11-05       Impact factor: 47.728

  20 in total

1.  SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing.

Authors:  Anton Bankevich; Sergey Nurk; Dmitry Antipov; Alexey A Gurevich; Mikhail Dvorkin; Alexander S Kulikov; Valery M Lesin; Sergey I Nikolenko; Son Pham; Andrey D Prjibelski; Alexey V Pyshkin; Alexander V Sirotkin; Nikolay Vyahhi; Glenn Tesler; Max A Alekseyev; Pavel A Pevzner
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

2.  Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

Authors:  Paul Medvedev; Son Pham; Mark Chaisson; Glenn Tesler; Pavel Pevzner
Journal:  J Comput Biol       Date:  2011-10-14       Impact factor: 1.479

3.  Pathset graphs: a novel approach for comprehensive utilization of paired reads in genome assembly.

Authors:  Son K Pham; Dmitry Antipov; Alexander Sirotkin; Glenn Tesler; Pavel A Pevzner; Max A Alekseyev
Journal:  J Comput Biol       Date:  2012-07-17       Impact factor: 1.479

4.  GAML: genome assembly by maximum likelihood.

Authors:  Vladimír Boža; Broňa Brejová; Tomáš Vinař
Journal:  Algorithms Mol Biol       Date:  2015-06-03       Impact factor: 1.405

5.  2-kupl: mapping-free variant detection from DNA-seq data of matched samples.

Authors:  Yunfeng Wang; Haoliang Xue; Christine Pourcel; Yang Du; Daniel Gautheret
Journal:  BMC Bioinformatics       Date:  2021-06-05       Impact factor: 3.169

6.  Empirical evaluation of methods for de novo genome assembly.

Authors:  Firaol Dida; Gangman Yi
Journal:  PeerJ Comput Sci       Date:  2021-07-09

7.  Telescoper: de novo assembly of highly repetitive regions.

Authors:  Ma'ayan Bresler; Sara Sheehan; Andrew H Chan; Yun S Song
Journal:  Bioinformatics       Date:  2012-09-15       Impact factor: 6.937

8.  Genome reassembly with high-throughput sequencing data.

Authors:  Nathaniel Parrish; Benjamin Sudakov; Eleazar Eskin
Journal:  BMC Genomics       Date:  2013-01-21       Impact factor: 3.969

9.  Next-generation sequence assembly: four stages of data processing and computational challenges. (Review)

Authors:  Sara El-Metwally; Taher Hamza; Magdi Zakaria; Mohamed Helmy
Journal:  PLoS Comput Biol       Date:  2013-12-12       Impact factor: 4.475

10.  Why assembling plant genome sequences is so challenging.

Authors:  Manuel Gonzalo Claros; Rocío Bautista; Darío Guerrero-Fernández; Hicham Benzerki; Pedro Seoane; Noé Fernández-Pozo
Journal:  Biology (Basel)       Date:  2012-09-18
