ITSoneWB: profiling global taxonomic diversity of eukaryotic communities on Galaxy.

Marco Tangaro, Giuseppe Defazio, Bruno Fosso, Vito Flavio Licciulli, Giorgio Grillo, Giacinto Donvito, Enrico Lavezzo, Giacomo Baruzzo, Graziano Pesole, Monica Santamaria.

Abstract

MOTIVATION: ITSoneWB (ITSone WorkBench) is a Galaxy-based bioinformatic environment where comprehensive and high-quality reference data are connected with established pipelines and new tools in an automated and easy-to-use service targeted at global taxonomic analysis of eukaryotic communities based on high-throughput sequencing of Internal Transcribed Spacer 1 variants.

AVAILABILITY: ITSoneWB has been deployed on the INFN-Bari ReCaS cloud facility and is freely available on the web at http://itsonewb.cloud.ba.infn.it/galaxy.

SUPPLEMENTARY INFORMATION: Supplementary data are available at https://github.com/ibiom-cnr/itsonewb/wiki.
© The Author(s) 2021. Published by Oxford University Press.

Year:  2021        PMID: 34117876      PMCID: PMC9502156          DOI: 10.1093/bioinformatics/btab431

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.931


1 Introduction

Amplicon-targeted metagenomic analysis (here referred to as DNA metabarcoding), in which taxon-related variants of selected genetic markers from environmental samples are explored through High-Throughput Sequencing (HTS) technologies, is widely applied to unravel the global composition of biotic communities rapidly, at scale and at low cost. After the widespread success of this approach in prokaryotic studies, a growing number of researchers interested in eukaryotic communities are now encouraged to use it. This results in an urgent need for increasingly comprehensive, well-controlled and FAIR-compliant (Wilkinson et al.) reference databases targeted at this domain, interfaced with established annotation tools and consistent taxonomies. This topic is becoming a primary target of some of the most important bioinformatics infrastructures, for example those established within the ELIXIR and LifeWatch European ESFRI projects. In this framework, we developed the ITSone WorkBench (ITSoneWB), where comprehensive and high-quality Internal Transcribed Spacer 1 (ITS1) reference data, well-established DNA metabarcoding analysis pipelines and new tools are integrated in an easy-to-use service addressing the eukaryotic domain of life.

2 ITSoneWB rationale and novelty

A growing body of evidence has highlighted the great potential of ITS1 in discriminating Eukaryotes at deeper taxonomic levels, particularly in Fungi (Cheng et al.; Usyk et al.; Wang et al.). ITSoneWB provides the first integrated bioinformatic environment specifically targeted at this promising marker, where users can easily submit their own sequences, access high-quality reference information and tools, and use them in customized and automated workflows without worrying about intermediate bioinformatics steps, which are often critical when data flow through different tools with different format requirements. ITSoneDB (Santamaria et al.), to our knowledge the first and only controlled and taxonomically referenced specialized collection of eukaryotic ITS1 sequences, is the core of the WorkBench. The latest update, release 1.138 (March 2019), hosts 1 174 761 ITS1 sequences spanning 157 531 eukaryotic species. A section of the database, with 46 375 sequences belonging to 4115 species, is entirely dedicated to marine habitats. ITSoneDB was developed in the framework of the ELIXIR EXCELERATE project to enhance bioinformatic resources for metagenomic studies targeting this particularly complex and still largely unexplored environment, but its usage extends to any eukaryotic sample. The ITSoneDB reference dataset has also recently been made public in the ENA Browser under accession PRJEB33030. Among the pipelines currently available in the WorkBench for sequence-based taxonomic assignment, users can choose BioMaS (Fosso et al.), QIIME (Caporaso et al.), QIIME2 (Bolyen et al.) or Mothur (Schloss et al.). BioMaS, already freely available as a web service at http://recasgateway.ba.infn.it/web/guest/biomas, offers an automated workflow for the taxonomic analysis of both prokaryotic and eukaryotic HTS DNA metabarcoding data.
QIIME and QIIME2 are open-source pipelines for performing microbiome biodiversity analysis through quality graphics and statistics, and Mothur is a comprehensive suite of tools targeted at microbial community ecology. All these tools use ITSoneDB as their reference database. Easy-to-use interfaces, available in the WorkBench, make it possible to run the aforementioned pipelines in an integrated environment (see supplementary material). In addition, new services targeting some of the most common and challenging issues of metabarcoding experimental protocols, such as the design of effective universal primers and the evaluation of the barcoding gap in customized taxonomic ranges, are directly connected to ITSoneDB and accessible through easy-to-use WorkBench interfaces. The primer design tool aims to support researchers in designing successful ‘universal’ primer pairs able to amplify ITS1 in wide groups of organisms while virtually avoiding any off-target amplification. This is still a tricky and crucial issue, since the use of ITS1 as a taxonomic marker has gained popularity only recently (Badotti et al.; Usyk et al.; Wang et al.) and the primer pairs already available are often limited by taxonomic bias and generate only a low number of sequence reads, insufficient to capture the global complexity of communities (Usyk et al.). We aimed to improve primer inference by using the high-coverage ITSoneDB collection with Mopo16S (Sambo et al.), a recently developed primer inference tool. ITSoneWB allows a modified version of Mopo16S to be applied to a set of ITS1 sequences extracted according to specific user requests (e.g. a customized taxonomic target). Barcoding gap estimation gives a preliminary indication of the ability of a specific genomic region to discriminate between taxa (Eckert et al.).
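To make the notion of a ‘universal’ primer more concrete, the following minimal sketch computes the fraction of target sequences in which a degenerate primer finds a binding site. It is an illustration only and does not reproduce the scoring used by Mopo16S or by the WorkBench primer design tool; the function names (`primer_hits`, `coverage`), the mismatch threshold and the toy sequences are all assumptions introduced here. Only the forward orientation is checked, so a realistic evaluation of a primer pair would also need to handle the reverse complement and amplicon length.

```python
# Illustrative sketch (hypothetical helper names, not the Mopo16S algorithm):
# estimate how many target sequences a degenerate primer can bind.

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _matches_at(primer: str, target: str, start: int, max_mismatches: int) -> bool:
    """Check whether the primer binds at a given offset within the allowed mismatches."""
    mismatches = 0
    for i, code in enumerate(primer):
        if target[start + i] not in IUPAC[code]:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def primer_hits(primer: str, target: str, max_mismatches: int = 0) -> bool:
    """Return True if the primer binds anywhere along the target (forward strand only)."""
    primer, target = primer.upper(), target.upper()
    last_start = len(target) - len(primer)
    return any(
        _matches_at(primer, target, start, max_mismatches)
        for start in range(last_start + 1)
    )


def coverage(primer: str, targets: list[str], max_mismatches: int = 0) -> float:
    """Fraction of target sequences with at least one binding site for the primer."""
    if not targets:
        raise ValueError("at least one target sequence is required")
    hits = sum(primer_hits(primer, t, max_mismatches) for t in targets)
    return hits / len(targets)


if __name__ == "__main__":
    # Toy sequences, invented for illustration (not taken from ITSoneDB).
    toy_targets = ["TTACGTGG", "TTACATGG", "CCCCCCCC"]
    print(coverage("ACGT", toy_targets))  # exact, non-degenerate primer
    print(coverage("ACRT", toy_targets))  # R matches A or G
```

In this toy setting, replacing a fixed base with a degenerate code widens the set of sequences the primer can bind, which is the trade-off between coverage and specificity that a primer inference tool has to balance across a chosen taxonomic target.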
The value of the barcoding gap, usually defined as the divergence between intra- and inter-specific sequence variability for congeneric DNA barcode sequences, strongly depends on the taxonomic group and on analytical practices (Čandek and Kuntner, 2015). Given the important role of this parameter in predicting the success of an experiment, we developed and implemented in the WorkBench a new barcoding gap inference tool that works on the ITSoneDB collection. It can be applied among species belonging to the same genus or, at a higher taxonomic level, among genera belonging to the same family, in customized taxonomic ranges. Moreover, users are provided with a facility, the ITSoneDB connector, to query, cross-reference and download ITS1 data and metadata in case they want to feed their own bioinformatic workflows. Finally, in order to guarantee full interoperability with other Workflow Management Systems, we deployed a Dockerized version of the ITSoneWB tools. A Dockerized version of the whole Galaxy environment is also available. Complete documentation for the installation and configuration of both ITSoneWB and the ad hoc developed tools is also available (itsonewb.readthedocs.io).
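The barcoding gap described in this section can be illustrated with a small worked example. The sketch below is not the WorkBench inference tool, whose internals are not described here; it assumes pre-aligned sequences of equal length, uses the uncorrected p-distance as the divergence measure, and adopts one common convention for the gap (smallest inter-specific distance minus largest intra-specific distance). The function names and the toy congeneric records are hypothetical.

```python
# Illustrative sketch (assumptions: aligned sequences, p-distance, and the
# "min inter-specific minus max intra-specific" definition of the gap).
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, ignoring gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites between the two sequences")
    return sum(x != y for x, y in compared) / len(compared)


def barcoding_gap(records: list[tuple[str, str]]) -> tuple[float, float, float]:
    """Return (max intra-specific distance, min inter-specific distance, gap).

    records: (species_name, aligned_sequence) pairs, e.g. all species of one genus.
    A positive gap suggests the marker separates these species; a gap of zero
    or below suggests that intra- and inter-specific variability overlap.
    """
    intra, inter = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        distance = p_distance(seq1, seq2)
        (intra if sp1 == sp2 else inter).append(distance)
    if not intra or not inter:
        raise ValueError("need at least one intra- and one inter-specific pair")
    max_intra, min_inter = max(intra), min(inter)
    return max_intra, min_inter, min_inter - max_intra


if __name__ == "__main__":
    # Toy congeneric records, invented for illustration (not from ITSoneDB).
    toy_records = [
        ("Genus alpha", "ACGTACGTAC"),
        ("Genus alpha", "ACGTACGTAT"),
        ("Genus beta", "ACGAACCTTC"),
        ("Genus beta", "ACGAACCTTA"),
    ]
    max_intra, min_inter, gap = barcoding_gap(toy_records)
    print(f"max intra = {max_intra:.2f}, min inter = {min_inter:.2f}, gap = {gap:.2f}")
```

The same function could be applied one level up by labelling records with genus names and restricting them to a single family, mirroring the two taxonomic ranges mentioned above. Other conventions (for example comparing mean rather than extreme distances, or using a model-corrected distance) would give different values, which is one reason the gap depends on analytical practices.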

3 Conclusion

ITSoneWB is a new bioinformatic environment aimed at profiling community biodiversity based on ITS1, an increasingly popular DNA barcode in Eukaryotes. Established DNA metabarcoding pipelines and new facilities are here oriented to ITSoneDB, which hosts, to our knowledge, the first and only specialized collection of well-controlled and taxonomically annotated ITS1 sequences embracing the entire eukaryotic domain. The WorkBench is freely available and easy to use even for non-experts, and the executed analyses are easily reproducible, promoting data use and reuse according to the FAIR guidelines (Wilkinson et al.). Its virtual instance has been deployed on the ReCaS-Bari cloud facility, thus supplying enough computational power and suitable scalability of the underlying resources to support large projects and/or to include new tools (Tangaro et al.). Our next plan is to complete and progressively enrich this virtual research environment by extending its application to additional eukaryotic taxonomic markers, allowing them to be used individually or in combination, and by enhancing the suite of accessory tools.

1.  Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities.

Authors:  Patrick D Schloss; Sarah L Westcott; Thomas Ryabin; Justine R Hall; Martin Hartmann; Emily B Hollister; Ryan A Lesniewski; Brian B Oakley; Donovan H Parks; Courtney J Robinson; Jason W Sahl; Blaz Stres; Gerhard G Thallinger; David J Van Horn; Carolyn F Weber
Journal:  Appl Environ Microbiol       Date:  2009-10-02       Impact factor: 4.792

2.  DNA barcoding gap: reliable species identification over morphological and geographical scales.

Authors:  Klemen Čandek; Matjaž Kuntner
Journal:  Mol Ecol Resour       Date:  2014-08-06       Impact factor: 7.090

3.  ITS1: a DNA barcode better than ITS2 in eukaryotes?

Authors:  Xin-Cun Wang; Chang Liu; Liang Huang; Johan Bengtsson-Palme; Haimei Chen; Jian-Hui Zhang; Dayong Cai; Jian-Qin Li
Journal:  Mol Ecol Resour       Date:  2014-09-24       Impact factor: 7.090

4.  BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

Authors:  Bruno Fosso; Monica Santamaria; Marinella Marzano; Daniel Alonso-Alemany; Gabriel Valiente; Giacinto Donvito; Alfonso Monaco; Pasquale Notarangelo; Graziano Pesole
Journal:  BMC Bioinformatics       Date:  2015-07-01       Impact factor: 3.169

5.  Does a barcoding gap exist in prokaryotes? Evidences from species delimitation in cyanobacteria.

Authors:  Ester M Eckert; Diego Fontaneto; Manuela Coci; Cristiana Callieri
Journal:  Life (Basel)       Date:  2014-12-31

6.  ITSoneDB: a comprehensive collection of eukaryotic ribosomal RNA Internal Transcribed Spacer 1 (ITS1) sequences.

Authors:  Monica Santamaria; Bruno Fosso; Flavio Licciulli; Bachir Balech; Ilaria Larini; Giorgio Grillo; Giorgio De Caro; Sabino Liuni; Graziano Pesole
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

7.  Effectiveness of ITS and sub-regions as DNA barcode markers for the identification of Basidiomycota (Fungi).

Authors:  Fernanda Badotti; Francislon Silva de Oliveira; Cleverson Fernando Garcia; Aline Bruna Martins Vaz; Paula Luize Camargos Fonseca; Laila Alves Nahum; Guilherme Oliveira; Aristóteles Góes-Neto
Journal:  BMC Microbiol       Date:  2017-02-23       Impact factor: 3.605

8.  Optimizing PCR primers targeting the bacterial 16S ribosomal RNA gene.

Authors:  Francesco Sambo; Francesca Finotello; Enrico Lavezzo; Giacomo Baruzzo; Giulia Masi; Elektra Peta; Marco Falda; Stefano Toppo; Luisa Barzon; Barbara Di Camillo
Journal:  BMC Bioinformatics       Date:  2018-09-29       Impact factor: 3.169

9.  Laniakea: an open solution to provide Galaxy "on-demand" instances over heterogeneous cloud infrastructures.

Authors:  Marco Antonio Tangaro; Giacinto Donvito; Marica Antonacci; Matteo Chiara; Pietro Mandreoli; Graziano Pesole; Federico Zambelli
Journal:  Gigascience       Date:  2020-04-01       Impact factor: 6.524

10.  The FAIR Guiding Principles for scientific data management and stewardship.

Authors:  Mark D Wilkinson; Michel Dumontier; IJsbrand Jan Aalbersberg; Gabrielle Appleton; Myles Axton; Arie Baak; Niklas Blomberg; Jan-Willem Boiten; Luiz Bonino da Silva Santos; Philip E Bourne; Jildau Bouwman; Anthony J Brookes; Tim Clark; Mercè Crosas; Ingrid Dillo; Olivier Dumon; Scott Edmunds; Chris T Evelo; Richard Finkers; Alejandra Gonzalez-Beltran; Alasdair J G Gray; Paul Groth; Carole Goble; Jeffrey S Grethe; Jaap Heringa; Peter A C 't Hoen; Rob Hooft; Tobias Kuhn; Ruben Kok; Joost Kok; Scott J Lusher; Maryann E Martone; Albert Mons; Abel L Packer; Bengt Persson; Philippe Rocca-Serra; Marco Roos; Rene van Schaik; Susanna-Assunta Sansone; Erik Schultes; Thierry Sengstag; Ted Slater; George Strawn; Morris A Swertz; Mark Thompson; Johan van der Lei; Erik van Mulligen; Jan Velterop; Andra Waagmeester; Peter Wittenburg; Katherine Wolstencroft; Jun Zhao; Barend Mons
Journal:  Sci Data       Date:  2016-03-15       Impact factor: 6.444
