A novel abundance-based algorithm for binning metagenomic sequences using l-tuples.

Yu-Wei Wu, Yuzhen Ye.

Abstract

Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. Among the computational tools recently developed for metagenomic sequence analysis, binning tools attempt to classify the sequences in a metagenomic dataset into different bins (i.e., species), based on various DNA composition patterns (e.g., the tetramer frequencies) of various genomes. Composition-based binning methods, however, cannot be used to classify very short fragments, because of the substantial variation of DNA composition patterns within a single genome. We developed a novel approach (AbundanceBin) for metagenomic binning that utilizes the different abundances of species living in the same environment. AbundanceBin applies the Lander-Waterman model to metagenomics and is based on the l-tuple content of the reads. AbundanceBin achieved accurate, unsupervised clustering of metagenomic sequences into different bins, such that the reads classified in a bin belong to species of identical or very similar abundances in the sample. In addition, AbundanceBin gave accurate estimations of species abundances, as well as their genome sizes, two important parameters for characterizing a microbial community. We also show that AbundanceBin performed well when the reads were very short (e.g., 75 bp) or contained sequencing errors. By combining AbundanceBin and a composition-based method (MetaCluster), we can achieve even higher binning accuracy. Supplementary Material is available at www.liebertonline.com/cmb.
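The abundance-based idea in the abstract can be illustrated with a small sketch. It is not the AbundanceBin implementation. It assumes, in the spirit of the Lander-Waterman model, that the number of times an l-tuple occurs in the reads is roughly Poisson-distributed with a mean tied to the abundance of the species it comes from, so that l-tuple counts from a mixed sample can be modelled as a mixture of Poissons and fitted by expectation-maximization. The function names, the fixed number of bins, the starting values, and the rule for assigning a read (summing per-tuple log-probabilities) are all illustrative assumptions, and the sketch ignores the fact that only l-tuples seen at least once are observed.

```python
import math
from collections import Counter


def count_ltuples(reads, l):
    """Count every length-l substring (l-tuple) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - l + 1):
            counts[read[i:i + l]] += 1
    return counts


def poisson_logpmf(k, lam):
    """Log of the Poisson probability of observing k events with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def fit_poisson_mixture(values, init_lambdas, n_iter=100):
    """Fit a mixture of Poissons to integer counts with EM.

    init_lambdas must be positive; one entry per bin (hypothetical choice:
    the number of bins is fixed in advance). Returns (weights, lambdas).
    """
    lambdas = list(init_lambdas)
    weights = [1.0 / len(lambdas)] * len(lambdas)
    for _ in range(n_iter):
        # E-step: posterior probability of each bin for each count.
        resp = []
        for k in values:
            logs = [math.log(w) + poisson_logpmf(k, lam)
                    for w, lam in zip(weights, lambdas)]
            top = max(logs)
            probs = [math.exp(x - top) for x in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate mixing weights and Poisson means.
        for j in range(len(lambdas)):
            mass = sum(r[j] for r in resp)
            weighted = sum(r[j] * k for r, k in zip(resp, values))
            weights[j] = max(mass / len(values), 1e-12)
            lambdas[j] = max(weighted / max(mass, 1e-12), 1e-6)
    return weights, lambdas


def assign_read(read, counts, weights, lambdas, l):
    """Return the index of the bin that best explains the read's l-tuples."""
    scores = [0.0] * len(lambdas)
    for i in range(len(read) - l + 1):
        k = counts[read[i:i + l]]
        for j, (w, lam) in enumerate(zip(weights, lambdas)):
            scores[j] += math.log(w) + poisson_logpmf(k, lam)
    return max(range(len(scores)), key=scores.__getitem__)
```

A typical call sequence would be `counts = count_ltuples(reads, l)`, then `fit_poisson_mixture(list(counts.values()), init_lambdas)`, then `assign_read` for each read. The fitted Poisson means play the role of per-bin abundance estimates in this simplified picture.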

Year: 2011    PMID: 21385052    PMCID: PMC3123841    DOI: 10.1089/cmb.2010.0245

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479


