Motivation: The 16S ribosomal RNA (rRNA) gene is widely used to survey microbial communities. Sequences are often clustered into Operational Taxonomic Units (OTUs) as proxies for species. The canonical clustering threshold is 97% identity, which was proposed in 1994 when few 16S rRNA sequences were available, motivating a reassessment on current data.

Results: Using a large set of high-quality 16S rRNA sequences from finished genomes, I assessed the correspondence of OTUs to species for five representative clustering algorithms using four accuracy metrics. All algorithms had comparable accuracy when tuned to a given metric. Optimal identity thresholds were ∼99% for full-length sequences and ∼100% for the V4 hypervariable region.

Availability and implementation: Reference sequences and source code are provided in the Supplementary Material.

Supplementary information: Supplementary data are available at Bioinformatics online.