A de Bruijn graph approach to the quantification of closely-related genomes in a microbial community.

Mingjie Wang, Yuzhen Ye, Haixu Tang.

Abstract

The wide application of next-generation sequencing (NGS) technologies in metagenomics has raised many computational challenges. One of the essential problems in metagenomics is to estimate the taxonomic composition of a microbial community. This can be approached by mapping shotgun reads acquired from the community to previously characterized microbial genomes, then profiling the abundance of these species based on the number of mapped reads. This procedure, however, is not as trivial as it appears at first glance. A shotgun metagenomic dataset often contains DNA sequences from many closely-related microbial species (e.g., within the same genus) or strains (e.g., within the same species). It is therefore often difficult to determine which species or strain a specific read was sampled from when the read maps to a common region shared by multiple genomes at high similarity. Furthermore, high genomic variation is observed among individual genomes within the same species, and this variation is difficult to differentiate from inter-species variation during read mapping. To address these issues, a commonly used approach is to quantify taxonomic distribution only at the genus level, based on the reads mapped to all species belonging to the same genus; alternatively, reads are mapped to a set of representative genomes, each selected to represent a different genus. Here, we introduce a novel approach to the quantity estimation of closely-related species within the same genus by mapping the reads to their genomes represented by a de Bruijn graph, in which the common genomic regions among them are collapsed. Using simulated and real metagenomic datasets, we show that the de Bruijn graph approach has several advantages over existing methods: (1) it avoids redundant mapping of shotgun reads to multiple copies of the common regions in different genomes, and (2) it leads to more accurate quantification of closely-related species (and even of strains within the same species).
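The idea of collapsing common regions can be illustrated with a small sketch. This is not the authors' implementation, which the abstract does not specify. It assumes a simplified stand-in for a de Bruijn graph: an index that stores each distinct k-mer once and records which genomes contain it. The function names (build_collapsed_index, estimate_abundance), the proportional splitting of shared reads, and the value of k are all hypothetical choices made for illustration. Reverse complements, sequencing errors, and genome-length normalization are ignored.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every substring of length k in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_collapsed_index(genomes, k):
    """Map each distinct k-mer to the set of genome names containing it.

    A k-mer shared by several genomes is stored once, which mirrors how
    common regions collapse into shared nodes of a de Bruijn graph.
    """
    index = defaultdict(set)
    for name, seq in genomes.items():
        for km in kmers(seq, k):
            index[km].add(name)
    return index


def estimate_abundance(reads, index, genomes, k):
    """Toy estimator (an assumption, not the published method).

    Reads compatible with exactly one genome are counted directly; reads
    compatible with several genomes are split in proportion to those
    unique counts, or evenly when no unique counts are available.
    """
    unique = {name: 0 for name in genomes}
    shared = []
    for read in reads:
        candidates = None
        for km in kmers(read, k):
            owners = index.get(km)
            if owners is None:
                continue  # k-mer absent from every reference genome
            candidates = set(owners) if candidates is None else candidates & owners
        if not candidates:
            continue  # read is unmapped or internally inconsistent
        if len(candidates) == 1:
            unique[next(iter(candidates))] += 1
        else:
            shared.append(candidates)

    counts = {name: float(c) for name, c in unique.items()}
    for candidates in shared:
        total = sum(unique[g] for g in candidates)
        for g in candidates:
            counts[g] += unique[g] / total if total else 1.0 / len(candidates)

    grand = sum(counts.values())
    return {g: (c / grand if grand else 0.0) for g, c in counts.items()}
```

Because a shared k-mer appears in the index only once, a read from a common region is looked up a single time and is flagged as ambiguous, rather than being reported as a separate hit in each genome. How ambiguous reads should then be apportioned is a modeling choice; the proportional split above is only one simple option.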

Year:  2012        PMID: 22697249      PMCID: PMC3375647          DOI: 10.1089/cmb.2012.0058

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


