MOTIVATION: Metagenomic sequencing allows reconstruction of microbial genomes directly from environmental samples. Omega (overlap-graph metagenome assembler) was developed for assembling and scaffolding Illumina sequencing data of microbial communities.

RESULTS: Omega found overlaps between reads using a prefix/suffix hash table. The overlap graph of reads was simplified by removing transitive edges and trimming short branches. Unitigs were generated based on minimum cost flow analysis of the overlap graph and then merged to contigs and scaffolds using mate-pair information. In comparison with three de Bruijn graph assemblers (SOAPdenovo, IDBA-UD and MetaVelvet), Omega provided comparable overall performance on a HiSeq 100-bp dataset and superior performance on a MiSeq 300-bp dataset. In comparison with Celera on the MiSeq dataset, Omega provided more continuous assemblies overall using a fraction of the computing time of existing overlap-layout-consensus assemblers. This indicates Omega can more efficiently assemble longer Illumina reads, and at deeper coverage, for metagenomic datasets.

AVAILABILITY AND IMPLEMENTATION: Implemented in C++ with source code and binaries freely available at http://omega.omicsbio.org.

Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
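The first two steps named in RESULTS (overlap detection with a prefix hash table, then transitive edge removal) can be illustrated with a small Python sketch. This is not Omega's C++ implementation: the function names `find_overlaps` and `remove_transitive_edges`, the `min_overlap` parameter, and the restriction to exact, same-strand overlaps (no reverse complements, no sequencing errors, contained reads skipped) are all assumptions made for illustration.

```python
from collections import defaultdict


def find_overlaps(reads, min_overlap):
    """Map (i, j) -> offset where the suffix of reads[i] starting at `offset`
    exactly matches a prefix of reads[j] over at least `min_overlap` bases.

    Hypothetical helper: exact, same-strand overlaps only.
    """
    # Hash table keyed by the first `min_overlap` bases of every read.
    prefix_table = defaultdict(list)
    for j, read in enumerate(reads):
        if len(read) >= min_overlap:
            prefix_table[read[:min_overlap]].append(j)

    overlaps = {}
    for i, a in enumerate(reads):
        # Offsets start at 1, so identical start positions are never reported.
        for offset in range(1, len(a) - min_overlap + 1):
            seed = a[offset:offset + min_overlap]
            tail = a[offset:]
            for j in prefix_table.get(seed, ()):
                b = reads[j]
                # Require b to extend past the end of a (skip contained reads).
                if j != i and len(b) > len(tail) and b.startswith(tail):
                    # Smallest offset is seen first, i.e. the longest overlap.
                    overlaps.setdefault((i, j), offset)
    return overlaps


def remove_transitive_edges(overlaps):
    """Drop edge a->c when a->b and b->c exist and their offsets add up to
    the offset of a->c, so that a->c carries no extra information."""
    successors = defaultdict(dict)
    for (i, j), offset in overlaps.items():
        successors[i][j] = offset

    reduced = dict(overlaps)
    for a, a_succ in successors.items():
        for b, off_ab in a_succ.items():
            for c, off_bc in successors.get(b, {}).items():
                if a_succ.get(c) == off_ab + off_bc:
                    reduced.pop((a, c), None)
    return reduced
```

For three 8-bp reads tiled at a 2-bp stride over a toy sequence (`"ATGGCGTA"`, `"GGCGTACC"`, `"CGTACCTA"`) with `min_overlap=4`, `find_overlaps` reports edges 0→1, 0→2 and 1→2, and `remove_transitive_edges` drops 0→2 because it is implied by the other two. Branch trimming, minimum cost flow analysis and mate-pair scaffolding are not covered by this sketch.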