Monica Santamaria, Bruno Fosso, Flavio Licciulli, Bachir Balech, Ilaria Larini, Giorgio Grillo, Giorgio De Caro, Sabino Liuni, Graziano Pesole.
Abstract
A holistic understanding of environmental communities is the new challenge of metagenomics. Accordingly, the amplicon-based or metabarcoding approach, largely applied to investigate bacterial microbiomes, is now being extended to the eukaryotic world. Indeed, the analysis of metabarcoding data may provide a comprehensive assessment of both bacterial and eukaryotic composition in a variety of environments, including the human body. In this respect, whereas hypervariable regions of the 16S rRNA are the de facto standard barcode for bacteria, the Internal Transcribed Spacer 1 (ITS1) of the ribosomal RNA gene cluster has shown a high potential in discriminating eukaryotes at deep taxonomic levels. As metabarcoding data analysis relies on the availability of a well-curated barcode reference resource, a comprehensive collection of ITS1 sequences supplied with robust taxonomies is highly needed. To address this issue, we created ITSoneDB (available at http://itsonedb.cloud.ba.infn.it/), which in its current version hosts 985 240 ITS1 sequences spanning over 134 000 eukaryotic species. Each ITS1 is mapped onto the NCBI reference taxonomy with its start and end positions precisely annotated. ITSoneDB has been developed in accordance with the FAIR guidelines by enabling users to query and download its content through a simple web interface and to access relevant metadata by cross-linking to the European Nucleotide Archive.
Year: 2018 PMID: 29036529 PMCID: PMC5753230 DOI: 10.1093/nar/gkx855
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
ITSoneDB content statistics
| Taxon | Taxid | Total sequences | ENA annotation only | HMM annotation only | ENA and HMM | Species |
|---|---|---|---|---|---|---|
| Eukaryota | 2759 | 985 240 | 543 266 | 276 362 | 165 612 | 134 598 |
| Fungi | 4751 | 684 540 | 378 049 | 221 723 | 84 768 | 53 552 |
| Metazoa | 33208 | 54 782 | 32 186 | 9084 | 13 512 | 9438 |
| Viridiplantae | 33090 | 203 437 | 113 572 | 32 503 | 57 362 | 66 595 |
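
The three annotation columns appear to partition each taxon's total (ENA only + HMM only + both = total sequences). That reading is an assumption drawn from the column headers rather than something stated in the text. A minimal Python sketch, with the counts transcribed from the table and a hypothetical helper named `annotation_shares`, checks the sums and reports each category as a share of the total:

```python
# Counts transcribed from the "ITSoneDB content statistics" table.
# Tuple order: (total, ENA only, HMM only, ENA and HMM).
STATS = {
    "Eukaryota": (985_240, 543_266, 276_362, 165_612),
    "Fungi": (684_540, 378_049, 221_723, 84_768),
    "Metazoa": (54_782, 32_186, 9_084, 13_512),
    "Viridiplantae": (203_437, 113_572, 32_503, 57_362),
}


def annotation_shares(total, ena_only, hmm_only, both):
    """Return each annotation category as a fraction of the total.

    Assumes the three categories are mutually exclusive and exhaustive;
    raises ValueError if they do not add up to the total.
    """
    if ena_only + hmm_only + both != total:
        raise ValueError("annotation categories do not sum to the total")
    return {
        "ena_only": ena_only / total,
        "hmm_only": hmm_only / total,
        "both": both / total,
    }


if __name__ == "__main__":
    for taxon, counts in STATS.items():
        shares = annotation_shares(*counts)
        print(taxon, {name: f"{value:.1%}" for name, value in shares.items()})
```

All four rows pass the sum check, which supports reading the columns as disjoint categories. The three kingdom rows do not add up to the Eukaryota row, presumably because other eukaryotic lineages are not listed separately.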
Figure 1. Snapshot of an ITSoneDB entry. (A) reports the entry information extracted directly from ENA: (i) accession number (with a hyperlink to the corresponding ENA item), (ii) version, (iii) description, (iv) sequence length (the length of the whole sequence, not only the ITS1), (v) taxon name (scientific name of the organism the sequence belongs to, with a hyperlink to the corresponding NCBI taxonomy entry), (vi) taxon rank (taxonomic class) and lineage (full taxonomic path associated with the taxon name). (B) reports the ITS1 position annotations from ENA and HMM. (C) and (D) show the alignments of the sequence with the 18S and 5.8S rRNA HMM profiles, respectively.
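
The fields listed in the caption can be pictured as a small record type. The Python sketch below is illustrative only and is not ITSoneDB's actual schema or API. The class name `ITS1Entry`, its attribute names, and the assumption that ITS1 positions are 1-based inclusive (start, end) pairs are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ITS1Entry:
    """Hypothetical record mirroring the fields named in the Figure 1 caption."""

    accession: str
    version: int
    description: str
    sequence_length: int  # length of the whole sequence, not only the ITS1
    taxon_name: str
    taxon_rank: str
    lineage: str
    # ITS1 (start, end) positions; assumed 1-based and inclusive.
    ena_its1: Optional[Tuple[int, int]] = None
    hmm_its1: Optional[Tuple[int, int]] = None

    def annotation_source(self) -> str:
        """Classify the entry using the column names of the statistics table."""
        if self.ena_its1 and self.hmm_its1:
            return "ENA and HMM"
        if self.ena_its1:
            return "ENA annotation only"
        if self.hmm_its1:
            return "HMM annotation only"
        return "none"

    def its1_length(self, source: str = "ena") -> int:
        """Length of the ITS1 region under the 1-based inclusive assumption."""
        span = self.ena_its1 if source == "ena" else self.hmm_its1
        if span is None:
            raise ValueError(f"no ITS1 annotation from source {source!r}")
        start, end = span
        return end - start + 1
```

For instance, a placeholder entry with `ena_its1=(10, 200)` and no HMM annotation would be classified as "ENA annotation only" and have an ITS1 length of 191 under these assumptions.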