Binbin Lai¹, Ruogu Ding, Yang Li, Liping Duan, Huaiqiu Zhu
¹ State Key Lab for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Center for Theoretical Biology, Center for Protein Science, Peking University, Beijing, China.
Abstract
MOTIVATION: A high-quality assembly of reads generated from shotgun sequencing is an important step in metagenome projects. Although traditional assemblers have been employed in the initial analysis of metagenomes, they cannot overcome the challenges posed by the features of metagenomic data.
RESULT: We present a de novo assembly approach and its implementation, named MAP (metagenomic assembly program). MAP is based on an improved overlap/layout/consensus (OLC) strategy that incorporates several special algorithms and uses mate pair information, which makes it better suited to the shotgun DNA reads (recommended length >200 bp) currently widely used in metagenome projects. Extensive tests on simulated data show that MAP can be superior to both Celera and Phrap for typical longer reads from Sanger sequencing, and that it has an evident advantage over Celera, Newbler and the newest, Genovo, for typical shorter reads from 454 sequencing.
AVAILABILITY AND IMPLEMENTATION: The source code of MAP is distributed as open source under the GNU GPL license. The MAP program and all simulated datasets are freely available at http://bioinfo.ctb.pku.edu.cn/MAP/
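For readers unfamiliar with the overlap/layout/consensus idea mentioned above, the following is a minimal, generic sketch of the overlap and layout stages only. It is not MAP's algorithm: the function names `overlap_len` and `greedy_assemble`, the `min_len` parameter, and the toy reads are all assumptions made for illustration, and the sketch ignores sequencing errors, reverse complements, the consensus step, and mate pair information.

```python
def overlap_len(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    """
    max_k = min(len(a), len(b))
    for k in range(max_k, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest exact overlap.

    Hypothetical toy layout step: stops when no pair overlaps by `min_len`.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_len(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    toy_reads = ["ATGCGT", "CGTACC", "ACCTTG"]  # made-up reads
    print(greedy_assemble(toy_reads))
```

A real metagenomic assembler has to tolerate inexact overlaps and avoid merging reads from different genomes, which is where the specialized algorithms and mate pair information described in the abstract would come in.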