Ali Bashir1, Aaron Klammer1, William P Robins2, Chen-Shan Chin1, Dale Webster1, Ellen Paxinos1, David Hsu1, Meredith Ashby1, Susana Wang1, Paul Peluso1, Robert Sebra1, Jon Sorenson1, James Bullard1, Jackie Yen1, Marie Valdovino1, Emilia Mollova1, Khai Luong1, Steven Lin1, Brianna LaMay1, Amruta Joshi1, Lori Rowe3, Michael Frace3, Cheryl L Tarr3, Maryann Turnsek3, Brigid M Davis4,2,5,6, Andrew Kasarskis1, John J Mekalanos2, Matthew K Waldor4,2,5,6, Eric E Schadt1,7.
Abstract
Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.
Year: 2012 PMID: 22750883 PMCID: PMC3731737 DOI: 10.1038/nbt.2288
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908