MetaCluster 4.0: a novel binning algorithm for NGS reads and huge number of species.

Yi Wang, Henry C M Leung, S M Yiu, Francis Y L Chin.

Abstract

Next-generation sequencing (NGS) technologies allow the sequencing of microbial communities directly from the environment without prior culturing. The output of environmental DNA sequencing consists of many reads from genomes of different unknown species, making the clustering of reads from the same (or similar) species (also known as binning) a crucial step. The difficulties of the binning problem are due to the following four factors: (1) the lack of reference genomes; (2) uneven abundance ratios of species; (3) short NGS reads; and (4) a large number of species (which can be more than a hundred). None of the existing binning tools can handle all four factors. No tools, including both AbundanceBin and MetaCluster 3.0, have demonstrated reasonable performance on a sample with more than 20 species. In this article, we introduce MetaCluster 4.0, an unsupervised binning algorithm that can accurately (with about 80% precision and sensitivity in all cases and at least 90% in some cases) and efficiently bin short reads with varying abundance ratios and is able to handle datasets with 100 species. The novelty of MetaCluster 4.0 stems from solving a few important problems: how to divide reads into groups by a probabilistic approach, how to estimate the 4-mer distribution of each group, how to estimate the number of species, and how to modify MetaCluster 3.0 to handle a large number of species. We show that MetaCluster 4.0 is effective for both simulated and real datasets. Supplementary Material is available at www.liebertonline.com/cmb.
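
The abstract refers to the 4-mer distribution of a group of reads without showing what such a distribution looks like. The Python sketch below is only an illustration of that general idea, not the MetaCluster 4.0 procedure: the function names `kmer_distribution` and `l1_distance` are hypothetical, the L1 distance is an arbitrary choice for comparing two profiles, and reverse-complement merging, which composition-based tools commonly apply, is left out for brevity.

```python
from collections import Counter
from itertools import product


def kmer_distribution(read, k=4):
    """Return normalized k-mer frequencies of one read.

    The result maps every one of the 4**k possible k-mers over ACGT
    to its relative frequency; windows containing other symbols
    (for example N) are skipped.
    """
    read = read.upper()
    alphabet = set("ACGT")
    counts = Counter(
        read[i:i + k]
        for i in range(len(read) - k + 1)
        if set(read[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def l1_distance(p, q):
    """Sum of absolute differences between two k-mer distributions."""
    return sum(abs(p[kmer] - q[kmer]) for kmer in p)


if __name__ == "__main__":
    a = kmer_distribution("ACGTACGTACGTTTGACCA")
    b = kmer_distribution("ACGTACGTACGTTTGACCT")
    print(len(a), round(l1_distance(a, b), 3))
```

Reads (or groups of reads) from the same genome tend to have similar k-mer profiles, which is why a distance between such vectors can serve as a signal for unsupervised binning. A single short read yields only a sparse estimate, so in practice profiles are usually computed over groups of reads rather than individual ones.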

Year:  2012        PMID: 22300323     DOI: 10.1089/cmb.2011.0276

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  27 in total

Review 1.  A review of methods and databases for metagenomic classification and assembly.

Authors:  Florian P Breitwieser; Jennifer Lu; Steven L Salzberg
Journal:  Brief Bioinform       Date:  2019-07-19       Impact factor: 11.622

Review 2.  The future is now: single-cell genomics of bacteria and archaea.

Authors:  Paul C Blainey
Journal:  FEMS Microbiol Rev       Date:  2013-02-11       Impact factor: 16.408

3.  Deciphering Cyanide-Degrading Potential of Bacterial Community Associated with the Coking Wastewater Treatment Plant with a Novel Draft Genome.

Authors:  Zhiping Wang; Lili Liu; Feng Guo; Tong Zhang
Journal:  Microb Ecol       Date:  2015-04-26       Impact factor: 4.552

4.  Metabolic characteristics of a glycogen-accumulating organism in Defluviicoccus cluster II revealed by comparative genomics.

Authors:  Zhiping Wang; Feng Guo; Yanping Mao; Yu Xia; Tong Zhang
Journal:  Microb Ecol       Date:  2014-06-03       Impact factor: 4.552

5.  Metagenome Analysis of a Complex Community Reveals the Metabolic Blueprint of Anammox Bacterium "Candidatus Jettenia asiatica".

Authors:  Ziye Hu; D R Speth; Kees-Jan Francoijs; Zhe-Xue Quan; M S M Jetten
Journal:  Front Microbiol       Date:  2012-10-29       Impact factor: 5.640

Review 6.  Systems-based approaches to unravel multi-species microbial community functioning.

Authors:  Florence Abram
Journal:  Comput Struct Biotechnol J       Date:  2014-12-03       Impact factor: 7.271

7.  Diversity and functions of bacterial community in drinking water biofilms revealed by high-throughput sequencing.

Authors:  Yuanqing Chao; Yanping Mao; Zhiping Wang; Tong Zhang
Journal:  Sci Rep       Date:  2015-06-12       Impact factor: 4.379

8.  Exploiting topic modeling to boost metagenomic reads binning.

Authors:  Ruichang Zhang; Zhanzhan Cheng; Jihong Guan; Shuigeng Zhou
Journal:  BMC Bioinformatics       Date:  2015-03-18       Impact factor: 3.169

9.  MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample.

Authors:  Yi Wang; Henry C M Leung; S M Yiu; Francis Y L Chin
Journal:  Bioinformatics       Date:  2012-09-15       Impact factor: 6.937

10.  Compareads: comparing huge metagenomic experiments.

Authors:  Nicolas Maillet; Claire Lemaitre; Rayan Chikhi; Dominique Lavenier; Pierre Peterlongo
Journal:  BMC Bioinformatics       Date:  2012-12-19       Impact factor: 3.169
