Dmitry Antipov (1), Anton Korobeynikov (2), Jeffrey S McLean (3), Pavel A Pevzner (4)
1. Center for Algorithmic Biotechnology, Institute for Translational Biomedicine.
2. Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, Department of Statistical Modelling, St. Petersburg State University, St. Petersburg, Russia.
3. Department of Periodontics, University of Washington, Seattle, WA 98195, USA.
4. Center for Algorithmic Biotechnology, Institute for Translational Biomedicine, Department of Computer Science and Engineering, University of California, San Diego, USA.
Abstract
MOTIVATION: Recent advances in single-molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long but inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. Inexpensive short-read technologies, on the other hand, produce accurate but fragmented assemblies. A hybrid approach that combines low-coverage long reads with short reads therefore has the potential to generate high-quality assemblies at reduced cost.
RESULTS: We describe the hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies even in projects with relatively low coverage by long reads, thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads.
AVAILABILITY AND IMPLEMENTATION: hybridSPAdes is implemented in C++ as part of the SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades.
CONTACT: d.antipov@spbu.ru
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.