COCACOLA: binning metagenomic contigs using sequence COmposition, read CoverAge, CO-alignment and paired-end read LinkAge.

Yang Young Lu¹, Ting Chen¹,², Jed A Fuhrman³, Fengzhu Sun¹,⁴.

Abstract

Motivation: The advent of next-generation sequencing technologies enables researchers to sequence complex microbial communities directly from the environment. Because assembly typically produces only genome fragments, also known as contigs, instead of an entire genome, it is crucial to group them into operational taxonomic units (OTUs) for further taxonomic profiling and downstream functional analysis. OTU clustering is also referred to as binning. We present COCACOLA, a general framework that automatically bins contigs into OTUs based on sequence composition and coverage across multiple samples.
Results: The effectiveness of COCACOLA is demonstrated on both simulated and real datasets in comparison with state-of-the-art binning approaches such as CONCOCT, GroopM, MaxBin and MetaBAT. The superior performance of COCACOLA relies on two aspects. One is using L1 distance instead of Euclidean distance for better taxonomic identification during initialization. More importantly, COCACOLA takes advantage of both hard clustering and soft clustering by sparsity regularization. In addition, the COCACOLA framework seamlessly incorporates customized knowledge to improve binning accuracy. In our study, we have investigated two types of additional knowledge, the co-alignment to reference genomes and linkage of contigs provided by paired-end reads, as well as the ensemble of both. We find that both co-alignment and linkage information further improve binning in the majority of cases. COCACOLA is scalable and faster than CONCOCT, GroopM, MaxBin and MetaBAT.
Availability and implementation: The software is available at https://github.com/younglululu/COCACOLA
Contact: fsun@usc.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
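
The abstract describes binning on sequence composition plus coverage across samples, and contrasts L1 with Euclidean distance. A minimal Python sketch of those two ideas follows. It is an illustration only, not COCACOLA's implementation: the dinucleotide profile, the coverage normalization, the helper names (`kmer_profile`, `contig_features`) and the toy contigs are all assumptions made for this example.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector over the ACGT alphabet (assumed feature)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in counts:  # skip windows containing N or other symbols
            counts[km] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[km] / total for km in kmers]


def contig_features(seq, coverage, k=2):
    """Concatenate composition with per-sample coverage scaled to sum to 1.

    The scaling is an assumption for this sketch, not the paper's normalization.
    """
    total = sum(coverage) or 1
    return kmer_profile(seq, k) + [c / total for c in coverage]


def l1_distance(u, v):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean_distance(u, v):
    """Square root of the sum of squared coordinate differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy contigs: sequence plus read coverage in three samples.
a = contig_features("ACGTACGTACGTACGT", [10, 0, 5])
b = contig_features("ACGTACGTTCGTACGA", [8, 1, 6])
c = contig_features("GGGGCCCCGGGGCCCC", [0, 12, 1])

for name, other in (("a-b", b), ("a-c", c)):
    print(name, "L1:", l1_distance(a, other), "Euclidean:", euclidean_distance(a, other))
```

In this toy setup contigs `a` and `b` share composition and coverage patterns, so both distances place them closer together than `a` and `c`. Which metric separates real taxa better is an empirical question that the paper addresses; the sketch only shows how the two distances are computed on the same feature vectors.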

Year: 2017 PMID: 27256312 DOI: 10.1093/bioinformatics/btw290

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

58 in total

1. Galacturonate Metabolism in Anaerobic Chemostat Enrichment Cultures: Combined Fermentation and Acetogenesis by the Dominant sp. nov. "Candidatus Galacturonibacter soehngenii".

Authors: Laura C Valk; Jeroen Frank; Pilar de la Torre-Cortés; Max van 't Hof; Antonius J A van Maris; Jack T Pronk; Mark C M van Loosdrecht
Journal: Appl Environ Microbiol Date: 2018-08-31 Impact factor: 4.792

2. High-Level Abundances of Methanobacteriales and Syntrophobacterales May Help To Prevent Corrosion of Metal Sheet Piles.

Authors: Michiel H In 't Zandt; Nardy Kip; Jeroen Frank; Stefan Jansen; Johannes A van Veen; Mike S M Jetten; Cornelia U Welte
Journal: Appl Environ Microbiol Date: 2019-10-01 Impact factor: 4.792

3. SolidBin: improving metagenome binning with semi-supervised normalized cut.

Authors: Ziye Wang; Zhengyang Wang; Yang Young Lu; Fengzhu Sun; Shanfeng Zhu
Journal: Bioinformatics Date: 2019-11-01 Impact factor: 6.937

4. A review of methods and databases for metagenomic classification and assembly. (Review)

Authors: Florian P Breitwieser; Jennifer Lu; Steven L Salzberg
Journal: Brief Bioinform Date: 2019-07-19 Impact factor: 11.622

5. Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets.

Authors: Yi Yue; Hao Huang; Zhao Qi; Hui-Min Dou; Xin-Yi Liu; Tian-Fei Han; Yue Chen; Xiang-Jun Song; You-Hua Zhang; Jian Tu
Journal: BMC Bioinformatics Date: 2020-07-28 Impact factor: 3.169

6. Towards enhanced and interpretable clustering/classification in integrative genomics.

Authors: Yang Young Lu; Jinchi Lv; Jed A Fuhrman; Fengzhu Sun
Journal: Nucleic Acids Res Date: 2017-11-16 Impact factor: 16.971

7. AMBER: Assessment of Metagenome BinnERs.

Authors: Fernando Meyer; Peter Hofmann; Peter Belmann; Ruben Garrido-Oter; Adrian Fritz; Alexander Sczyrba; Alice C McHardy
Journal: Gigascience Date: 2018-06-01 Impact factor: 6.524

8. The spinal cord-gut-immune axis as a master regulator of health and neurological function after spinal cord injury. (Review)

Authors: Kristina A Kigerl; Kylie Zane; Kia Adams; Matthew B Sullivan; Phillip G Popovich
Journal: Exp Neurol Date: 2019-10-22 Impact factor: 5.330

9. Binning unassembled short reads based on k-mer abundance covariance using sparse coding.

Authors: Olexiy Kyrgyzov; Vincent Prost; Stéphane Gazut; Bruno Farcy; Thomas Brüls
Journal: Gigascience Date: 2020-04-01 Impact factor: 6.524

10. Extremophilic nitrite-oxidizing Chloroflexi from Yellowstone hot springs.

Authors: Eva Spieck; Michael Spohn; Katja Wendt; Eberhard Bock; Jessup Shively; Jeroen Frank; Daniela Indenbirken; Malik Alawi; Sebastian Lücker; Jennifer Hüpeden
Journal: ISME J Date: 2019-10-17 Impact factor: 10.302