
A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis.

Yijun Sun, Yunpeng Cai, Susan M Huse, Rob Knight, William G Farmerie, Xiaoyu Wang, Volker Mai.

Abstract

Recent advances in massively parallel sequencing technology have created new opportunities to probe the hidden world of microbes. Taxonomy-independent clustering of the 16S rRNA gene is usually the first step in analyzing microbial communities. Dozens of algorithms have been developed in the last decade, but a comprehensive benchmark study is lacking. Here, we survey algorithms currently used by microbiologists, and compare seven representative methods in a large-scale benchmark study that addresses several issues of concern. A new experimental protocol was developed that allows different algorithms to be compared using the same platform, and several criteria were introduced to facilitate a quantitative evaluation of the clustering performance of each algorithm. We found that existing methods vary widely in their outputs, and that inappropriate use of distance levels for taxonomic assignments likely resulted in substantial overestimates of biodiversity in many studies. The benchmark study identified our recently developed ESPRIT-Tree, a fast implementation of the average linkage-based hierarchical clustering algorithm, as one of the best algorithms available in terms of computational efficiency and clustering accuracy.
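To make the role of the distance level concrete, here is a minimal sketch of average-linkage hierarchical clustering with a distance cutoff. It is not ESPRIT-Tree and not any of the benchmarked tools: it is a naive cubic-time illustration, and the toy sequences, the mismatch-fraction distance, and the function names `mismatch_distance` and `average_linkage_otus` are assumptions introduced here for illustration only.

```python
from itertools import combinations

def mismatch_distance(a, b):
    """Fraction of differing positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def average_linkage_otus(seqs, cutoff):
    """Naive average-linkage clustering (hypothetical helper, not ESPRIT-Tree).

    Repeatedly merges the two clusters with the smallest average pairwise
    distance, and stops once that smallest distance exceeds `cutoff`.
    """
    clusters = [[i] for i in range(len(seqs))]
    # Precompute all pairwise distances, keyed by (smaller index, larger index).
    dist = {
        (i, j): mismatch_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }

    def avg(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, a, b = min(
            (avg(clusters[a], clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if d > cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy, made-up "reads" of length 20 (not real 16S data).
seqs = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "TTTTACGTACGTACGTTTTT",
    "TTTTACGTACGTACGTTTTA",
]

for cutoff in (0.01, 0.05, 0.40):
    print(cutoff, average_linkage_otus(seqs, cutoff))
```

On this toy input the number of clusters changes with the cutoff alone, which is the kind of sensitivity to the chosen distance level that the abstract raises; the specific cutoffs used here are arbitrary and carry no taxonomic meaning.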

Year: 2011   PMID: 21525143   PMCID: PMC3251834   DOI: 10.1093/bib/bbr009

Source DB: PubMed   Journal: Brief Bioinform   ISSN: 1467-5463   Impact factor: 11.622


