Leho Tedersoo, Kelly S Ramirez, R Henrik Nilsson, Aivi Kaljuvee, Urmas Kõljalg, Kessy Abarenkov.
Abstract
High-throughput sequencing-based metabarcoding studies produce vast amounts of ecological data, but a lack of consensus on standardization of metadata and how to refer to the species recovered severely hampers reanalysis and comparisons among studies. Here we propose an automated workflow covering data submission, compression, storage and public access to allow easy data retrieval and inter-study communication. Such standardized and readily accessible datasets facilitate data management, taxonomic comparisons and compilation of global metastudies.
Keywords: Data storage; Digital object identifiers (DOI); Environmental metadata; High-throughput sequencing (HTS); Interactive database; Internal transcribed spacer (ITS); Next-generation sequencing; Species hypotheses
Year: 2015 PMID: 26236474 PMCID: PMC4521374 DOI: 10.1186/s13742-015-0074-5
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Fig. 1. General data structure. (a) Suggested workflow using various bioinformatics tools and databases. DOI, digital object identifier; HTS, high-throughput sequencing; INSDC, International Nucleotide Sequence Database Collaboration; repres, representative; seq, sequencing. (b) Proposed minimum data fields for HTS metadata
Fig. 2. Screenshot of PlutoF workbench [11] for managing species hypotheses in the UNITE database [https://unite.ut.ee]. Multiple alignment of one of 20 clades of the enigmatic fungal class Archaeorhizomycetes is shown. Species hypotheses (SH) based on 97.0-100.0 % sequence similarity thresholds are marked with color patterns. The representative sequence of each SH is shown in green text. User-annotated taxonomic and ecological metadata are also indicated