Ezgi Özkurt (1,2), Joachim Fritscher (1,2), Nicola Soranzo (2), Duncan Y K Ng (1), Robert P Davey (2), Mohammad Bahram (3,4), Falk Hildebrand (5,6)
1. Gut Microbes & Health, Quadram Institute Bioscience, Norwich Research Park, Norwich, Norfolk, NR4 7UQ, UK.
2. Earlham Institute, Norwich Research Park, Norwich, Norfolk, NR4 7UZ, UK.
3. Department of Ecology, Swedish University of Agricultural Sciences, Ulls väg 16, 756 51, Uppsala, Sweden.
4. Institute of Ecology and Earth Sciences, University of Tartu, Lai St, 40, Tartu, Estonia.
5. Gut Microbes & Health, Quadram Institute Bioscience, Norwich Research Park, Norwich, Norfolk, NR4 7UQ, UK. falk.hildebrand@quadram.ac.uk.
6. Earlham Institute, Norwich Research Park, Norwich, Norfolk, NR4 7UZ, UK. falk.hildebrand@quadram.ac.uk.
Abstract
BACKGROUND: Amplicon sequencing is an established and cost-efficient method for profiling microbiomes. However, many of the tools available for processing these data require both bioinformatics skills and high computational power to handle large datasets. Furthermore, only a few tools support the analysis of long-read amplicon data. To bridge this gap, we developed the LotuS2 (less OTU scripts 2) pipeline, which enables user-friendly, resource-friendly, and versatile analysis of raw amplicon sequences.
RESULTS: LotuS2 offers six different sequence clustering algorithms as well as extensive pre- and post-processing options. This allows flexible data analysis by both experts, who can fully adjust parameters, and novices, who can rely on defaults provided for different scenarios. We benchmarked three independent gut and soil datasets, on which LotuS2 was on average 29 times faster than other pipelines, yet better reproduced the alpha- and beta-diversity of technical replicate samples. Further benchmarking on a mock community with known taxon composition showed that, compared to the other pipelines, LotuS2 recovered a higher fraction of correctly identified taxa and a higher fraction of reads assigned to true taxa (48% and 57% at species level; 83% and 98% at genus level, respectively). At the ASV/OTU level, precision and F-score were highest for LotuS2, as was the fraction of correctly reported 16S sequences.
CONCLUSION: LotuS2 is a lightweight and user-friendly pipeline that is fast, precise, and streamlined, using extensive pre- and post-ASV/OTU clustering steps to further increase data quality. High data usage rates and reliability enable high-throughput microbiome analysis in minutes.
AVAILABILITY: LotuS2 is available from GitHub, conda, or via a Galaxy web interface, documented at http://lotus2.earlham.ac.uk/.
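The mock-community metrics named in the results (fraction of correctly identified taxa, fraction of reads assigned to true taxa, precision, and F-score) can be illustrated with a minimal Python sketch. This is not the benchmarking code used for LotuS2. The function names, the set-based matching of taxon labels, and the example genera and read counts are all assumptions made for illustration, and the standard definitions of precision, recall, and F-score are assumed to apply.

```python
def benchmark_taxa(reported, expected):
    """Compare reported taxon labels against a known mock composition.

    Hypothetical helper: taxa are matched by exact label, which is a
    simplifying assumption rather than the method used in the paper.
    """
    reported, expected = set(reported), set(expected)
    tp = len(reported & expected)   # true taxa that were reported
    fp = len(reported - expected)   # reported taxa absent from the mock
    fn = len(expected - reported)   # true taxa that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f_score": f_score}


def fraction_reads_true(read_counts, expected):
    """Fraction of all reads assigned to taxa present in the mock."""
    total = sum(read_counts.values())
    if total == 0:
        return 0.0
    expected = set(expected)
    true_reads = sum(n for taxon, n in read_counts.items() if taxon in expected)
    return true_reads / total


# Illustrative (made-up) genus-level data, not taken from the study.
mock = ["Bacteroides", "Escherichia", "Lactobacillus", "Staphylococcus", "Listeria"]
found = ["Bacteroides", "Escherichia", "Lactobacillus", "Pseudomonas"]
counts = {"Bacteroides": 50, "Escherichia": 30, "Pseudomonas": 20}

metrics = benchmark_taxa(found, mock)
read_fraction = fraction_reads_true(counts, mock)
```

In this sketch, recall corresponds to the fraction of mock taxa that were recovered, while `fraction_reads_true` weights the comparison by read abundance, so a pipeline can score well on one and poorly on the other.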