Alexander Sczyrba, Peter Hofmann, Peter Belmann, David Koslicki, Stefan Janssen, Johannes Dröge, Ivan Gregor, Stephan Majda, Jessika Fiedler, Eik Dahms, Andreas Bremges, Adrian Fritz, Ruben Garrido-Oter, Tue Sparholt Jørgensen, Nicole Shapiro, Philip D Blood, Alexey Gurevich, Yang Bai, Dmitrij Turaev, Matthew Z DeMaere, Rayan Chikhi, Niranjan Nagarajan, Christopher Quince, Fernando Meyer, Monika Balvočiūtė, Lars Hestbjerg Hansen, Søren J Sørensen, Burton K H Chia, Bertrand Denis, Jeff L Froula, Zhong Wang, Robert Egan, Dongwan Don Kang, Jeffrey J Cook, Charles Deltel, Michael Beckstette, Claire Lemaitre, Pierre Peterlongo, Guillaume Rizk, Dominique Lavenier, Yu-Wei Wu, Steven W Singer, Chirag Jain, Marc Strous, Heiner Klingenberg, Peter Meinicke, Michael D Barton, Thomas Lingner, Hsin-Hung Lin, Yu-Chieh Liao, Genivaldo Gueiros Z Silva, Daniel A Cuevas, Robert A Edwards, Surya Saha, Vitor C Piro, Bernhard Y Renard, Mihai Pop, Hans-Peter Klenk, Markus Göker, Nikos C Kyrpides, Tanja Woyke, Julia A Vorholt, Paul Schulze-Lefert, Edward M Rubin, Aaron E Darling, Thomas Rattei, Alice C McHardy.
Abstract
Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.
Year: 2017 PMID: 28967888 PMCID: PMC5903868 DOI: 10.1038/nmeth.4458
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547