
Scaling metagenome sequence assembly with probabilistic de Bruijn graphs.

Jason Pell, Arend Hintze, Rosangela Canino-Koning, Adina Howe, James M Tiedje, C Titus Brown.

Abstract

Deep sequencing has enabled the investigation of a wide range of environmental microbial ecosystems, but the high memory requirements for de novo assembly of short-read shotgun sequencing data from these complex populations are an increasingly large practical barrier. Here we introduce a memory-efficient graph representation with which we can analyze the k-mer connectivity of metagenomic samples. The graph representation is based on a probabilistic data structure, a Bloom filter, that allows us to efficiently store assembly graphs in as little as 4 bits per k-mer, albeit inexactly. We show that this data structure accurately represents DNA assembly graphs in low memory. We apply this data structure to the problem of partitioning assembly graphs into components as a prelude to assembly, and show that this reduces the overall memory requirements for de novo assembly of metagenomes. On one soil metagenome assembly, this approach achieves a nearly 40-fold decrease in the maximum memory requirements for assembly. This probabilistic graph representation is a significant theoretical advance in storing assembly graphs and also yields immediate leverage on metagenomic assembly.
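The idea in the abstract can be illustrated with a minimal Python sketch: k-mers are inserted into a Bloom filter, and graph edges are never stored but are discovered by querying the filter for the eight possible neighbouring k-mers. This is not the authors' implementation. The names `BloomFilter`, `kmers`, `neighbors`, `component` and `false_positive_rate` are hypothetical, the salted BLAKE2b hashing is an assumption chosen for convenience, and reverse-complement strands are ignored for brevity.

```python
import hashlib
import math
from collections import deque


class BloomFilter:
    """Fixed-size bit array queried through several hash functions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Assumed hashing scheme: salted BLAKE2b, one salt per hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("ascii"), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return True for an item never added (false positive),
        # but never returns False for an item that was added.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def neighbors(bf, kmer):
    """Query the filter for the 8 k-mers that overlap `kmer` by k-1 bases."""
    found = []
    for base in "ACGT":
        for candidate in (kmer[1:] + base, base + kmer[:-1]):
            if candidate != kmer and candidate in bf:
                found.append(candidate)
    return found


def component(bf, seed):
    """Breadth-first search over the implicit graph, starting at `seed`."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(bf, current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def false_positive_rate(bits_per_item, num_hashes):
    """Standard Bloom filter approximation: (1 - e^(-h/b))^h."""
    return (1.0 - math.exp(-num_hashes / bits_per_item)) ** num_hashes


K = 5  # toy k-mer size, far smaller than anything used in practice
reads = ["ACGTACGGTCA", "TTTGGGCCCAA"]  # made-up reads for illustration
bf = BloomFilter(num_bits=1 << 16, num_hashes=4)
for read in reads:
    for kmer in kmers(read, K):
        bf.add(kmer)

first_component = component(bf, reads[0][:K])
```

Under the standard approximation, `false_positive_rate(4, 3)` evaluates to roughly 0.15, which gives a sense of what "inexactly" means at 4 bits per k-mer: a filter that small reports some k-mers that were never inserted, so the traversal may follow spurious edges, but it never misses a real one.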

Year:  2012        PMID: 22847406      PMCID: PMC3421212          DOI: 10.1073/pnas.1121464109

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


References: 29 in total (first 10 listed)

1.  Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw.

Authors:  Rachel Mackelprang; Mark P Waldrop; Kristen M DeAngelis; Maude M David; Krystle L Chavarria; Steven J Blazewicz; Edward M Rubin; Janet K Jansson
Journal:  Nature       Date:  2011-11-06       Impact factor: 49.962

2.  GAGE: A critical evaluation of genome assemblies and assembly algorithms.

Authors:  Steven L Salzberg; Adam M Phillippy; Aleksey Zimin; Daniela Puiu; Tanja Magoc; Sergey Koren; Todd J Treangen; Michael C Schatz; Arthur L Delcher; Michael Roberts; Guillaume Marçais; Mihai Pop; James A Yorke
Journal:  Genome Res       Date:  2012-01-06       Impact factor: 9.043

3.  High-quality draft assemblies of mammalian genomes from massively parallel sequence data.

Authors:  Sante Gnerre; Iain Maccallum; Dariusz Przybylski; Filipe J Ribeiro; Joshua N Burton; Bruce J Walker; Ted Sharpe; Giles Hall; Terrance P Shea; Sean Sykes; Aaron M Berlin; Daniel Aird; Maura Costello; Riza Daza; Louise Williams; Robert Nicol; Andreas Gnirke; Chad Nusbaum; Eric S Lander; David B Jaffe
Journal:  Proc Natl Acad Sci U S A       Date:  2010-12-27       Impact factor: 11.205

4.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors:  Daniel R Zerbino; Ewan Birney
Journal:  Genome Res       Date:  2008-03-18       Impact factor: 9.043

5.  Succinct data structures for assembling large genomes.

Authors:  Thomas C Conway; Andrew J Bromage
Journal:  Bioinformatics       Date:  2011-01-17       Impact factor: 6.937

6.  MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads.

Authors:  Toshiaki Namiki; Tsuyoshi Hachiya; Hideaki Tanaka; Yasubumi Sakakibara
Journal:  Nucleic Acids Res       Date:  2012-07-19       Impact factor: 16.971

7.  DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI.

Authors:  Yongchao Liu; Bertil Schmidt; Douglas L Maskell
Journal:  BMC Bioinformatics       Date:  2011-03-29       Impact factor: 3.169

8.  The Earth Microbiome Project: Meeting report of the "1st EMP meeting on sample selection and acquisition" at Argonne National Laboratory October 6th 2010.

Authors:  Jack A Gilbert; Folker Meyer; Janet Jansson; Jeff Gordon; Norman Pace; James Tiedje; Ruth Ley; Noah Fierer; Dawn Field; Nikos Kyrpides; Frank-Oliver Glöckner; Hans-Peter Klenk; K Eric Wommack; Elizabeth Glass; Kathryn Docherty; Rachel Gallery; Rick Stevens; Rob Knight
Journal:  Stand Genomic Sci       Date:  2010-12-25

9.  Quake: quality-aware detection and correction of sequencing errors.

Authors:  David R Kelley; Michael C Schatz; Steven L Salzberg
Journal:  Genome Biol       Date:  2010-11-29       Impact factor: 13.583

10.  Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Authors:  Manfred G Grabherr; Brian J Haas; Moran Yassour; Joshua Z Levin; Dawn A Thompson; Ido Amit; Xian Adiconis; Lin Fan; Raktima Raychowdhury; Qiandong Zeng; Zehua Chen; Evan Mauceli; Nir Hacohen; Andreas Gnirke; Nicholas Rhind; Federica di Palma; Bruce W Birren; Chad Nusbaum; Kerstin Lindblad-Toh; Nir Friedman; Aviv Regev
Journal:  Nat Biotechnol       Date:  2011-05-15       Impact factor: 54.908

Cited by: 86 in total (first 10 listed)

1.  Strain recovery from metagenomes.

Authors:  C Titus Brown
Journal:  Nat Biotechnol       Date:  2015-10       Impact factor: 54.908

2.  METHODS TO ENSURE THE REPRODUCIBILITY OF BIOMEDICAL RESEARCH.

Authors:  Konrad J Karczewski; Nicholas P Tatonetti; Arjun K Manrai; Chirag J Patel; C Titus Brown; John P A Ioannidis
Journal:  Pac Symp Biocomput       Date:  2017

Review 3.  Sequence assembly demystified.

Authors:  Niranjan Nagarajan; Mihai Pop
Journal:  Nat Rev Genet       Date:  2013-01-29       Impact factor: 53.242

4.  Tackling soil diversity with the assembly of large, complex metagenomes.

Authors:  Adina Chuang Howe; Janet K Jansson; Stephanie A Malfatti; Susannah G Tringe; James M Tiedje; C Titus Brown
Journal:  Proc Natl Acad Sci U S A       Date:  2014-03-14       Impact factor: 11.205

5.  Fast lossless compression via cascading Bloom filters.

Authors:  Roye Rozov; Ron Shamir; Eran Halperin
Journal:  BMC Bioinformatics       Date:  2014-09-10       Impact factor: 3.169

6.  Improving Bloom Filter Performance on Sequence Data Using k-mer Bloom Filters.

Authors:  David Pellow; Darya Filippova; Carl Kingsford
Journal:  J Comput Biol       Date:  2016-11-09       Impact factor: 1.479

7.  DIME: a novel framework for de novo metagenomic sequence assembly.

Authors:  Xuan Guo; Ning Yu; Xiaojun Ding; Jianxin Wang; Yi Pan
Journal:  J Comput Biol       Date:  2015-02       Impact factor: 1.479

Review 8.  Shotgun metagenomics, from sampling to analysis.

Authors:  Christopher Quince; Alan W Walker; Jared T Simpson; Nicholas J Loman; Nicola Segata
Journal:  Nat Biotechnol       Date:  2017-09-12       Impact factor: 54.908

9.  Fast search of thousands of short-read sequencing experiments.

Authors:  Brad Solomon; Carl Kingsford
Journal:  Nat Biotechnol       Date:  2016-02-08       Impact factor: 54.908

10.  Detection of low-abundance bacterial strains in metagenomic datasets by eigengenome partitioning.

Authors:  Brian Cleary; Ilana Lauren Brito; Katherine Huang; Dirk Gevers; Terrance Shea; Sarah Young; Eric J Alm
Journal:  Nat Biotechnol       Date:  2015-09-14       Impact factor: 54.908
