DIME: a novel framework for de novo metagenomic sequence assembly.

Xuan Guo, Ning Yu, Xiaojun Ding, Jianxin Wang, Yi Pan.

Abstract

Recently developed next-generation sequencing platforms not only decrease the cost of metagenomic data analysis but also greatly enlarge metagenomic sequence datasets. A common bottleneck of available assemblers is that the trade-off between noise in the resulting contigs and the gain in sequence length needed for better annotation has not received enough attention in large-scale sequencing projects, especially for datasets with low coverage and a large number of nonoverlapping contigs. To address this limitation and improve both accuracy and efficiency, we develop a novel metagenomic sequence assembly framework, DIME, based on a DIvide, conquer, and MErge strategy. In addition, we give two MapReduce implementations of DIME, DIME-cap3 and DIME-genovo, on the Apache Hadoop platform. For a systematic performance comparison, we tested DIME and five other popular short-read assembly programs (Cap3, Genovo, MetaVelvet, SOAPdenovo, and SPAdes) on four synthetic and three real metagenomic sequence datasets ranging in size from fifty thousand to a couple million reads. The experimental results demonstrate that our method not only partitions the sequence reads with extremely high accuracy but also reconstructs more bases, generates higher-quality assembled consensus sequences, and yields higher assembly scores, including corrected N50 and BLAST-score-per-base, than the other tools, with a nearly theoretical speed-up. The results indicate that DIME offers a substantial improvement in assembly across a range of sequence abundances and is thus robust to decreasing coverage.
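For readers unfamiliar with the N50 score mentioned above, the following is a minimal sketch of the standard N50 computation: the contig length at which contigs of that length or longer contain at least half of all assembled bases. The function name `n50` and the plain list-of-lengths input are illustrative assumptions, not part of DIME. The "corrected" variant reported in the abstract presumably applies the same computation after contigs have been split at detected assembly errors; that step is not shown here.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    Illustrative helper, not taken from DIME: N50 is the length L such
    that contigs of length >= L together hold at least half of the
    total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the
        # total defines the N50.
        if running >= half_total:
            return length
    return 0
```

For example, `n50([2, 3, 4, 5, 6, 7, 8, 9, 10])` returns `8`: the total is 54 bases, and the three longest contigs (10 + 9 + 8 = 27) already cover half of it.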

Keywords:  algorithms; cloud computing; de novo assembly; metagenome; sequences

Year:  2015        PMID: 25684202      PMCID: PMC4326031          DOI: 10.1089/cmb.2014.0251

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


Review 1.  Music of metagenomics-a review of its applications, analysis pipeline, and associated tools.

Authors:  Bilal Wajid; Faria Anwar; Imran Wajid; Haseeb Nisar; Sharoze Meraj; Ali Zafar; Mustafa Kamal Al-Shawaqfeh; Ali Riza Ekti; Asia Khatoon; Jan S Suchodolski
Journal:  Funct Integr Genomics       Date:  2021-10-18       Impact factor: 3.410

2.  Assessing the Impact of Assemblers on Virus Detection in a De Novo Metagenomic Analysis Pipeline.

Authors:  Daniel J White; Jing Wang; Richard J Hall
Journal:  J Comput Biol       Date:  2017-04-17       Impact factor: 1.479

3.  A deep learning method for lincRNA detection using auto-encoder algorithm.

Authors:  Ning Yu; Zeng Yu; Yi Pan
Journal:  BMC Bioinformatics       Date:  2017-12-06       Impact factor: 3.169

4.  SKESA: strategic k-mer extension for scrupulous assemblies.

Authors:  Alexandre Souvorov; Richa Agarwala; David J Lipman
Journal:  Genome Biol       Date:  2018-10-04       Impact factor: 13.583

Review 5.  An Integrated Multi-Disciplinary Perspective for Addressing Challenges of the Human Gut Microbiome.

Authors:  Rohan M Shah; Elizabeth J McKenzie; Magda T Rosin; Snehal R Jadhav; Shakuntla V Gondalia; Douglas Rosendale; David J Beale
Journal:  Metabolites       Date:  2020-03-06

6.  Deconvolute individual genomes from metagenome sequences through short read clustering.

Authors:  Kexue Li; Yakang Lu; Li Deng; Lili Wang; Lizhen Shi; Zhong Wang
Journal:  PeerJ       Date:  2020-04-08       Impact factor: 2.984
