Elisha M Wood-Charlson1, Zachary Crockett2, Chris Erdmann3, Adam P Arkin1, Carly B Robinson4.
Year: 2022 PMID: 36173960 PMCID: PMC9521804 DOI: 10.1371/journal.pcbi.1010476
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.779
Fig 1. Examples of different types of metadata needed to describe the conversion of physical environmental samples into data and results.
Submitting data to central repositories typically requires sample and preparation metadata. Data processing and feature metadata are generated during analysis. Credit: Luke Thompson, PhD (National Oceanic and Atmospheric Administration). Source: the National Microbiome Data Collaborative [22].
Examples of metadata that support data management from sample to publication, and resources to help standardize data/metadata and sharing (protocols, controlled vocabularies/ontologies, etc.).
| Data management stage | Metadata fields | Public resources for standardization |
|---|---|---|
| | Latitude, longitude, date/time, temperature, biome/ecosystem, depth and/or elevation of sampling site, etc. | Environmental Ontology (ENVO), Minimum Information about any (x) Sequence (MIxS), International Geo/General Sample Number (IGSN) |
| | Laboratory protocol(s): DNA extraction, purification, amplification. | Protocols.io, e-laboratory notebook/management software |
| | Software tools for QA/QC, assembly, annotation. Include reference (if published), version, and parameters used. | Community guidelines for describing and citing software [ |
| | E.g., annotations of sequence data, such as taxonomy or function | NCBI Taxonomy, Genome Taxonomy Database Toolkit (GTDB-Tk); Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), etc. |
| | Data owner(s), organization, keywords | ORCID, Research Organization Registry (ROR); keyword selection [ |
| | Usage license, privacy protocols, transfer protocols | Creative Commons, HTTP |
| | Type and size of data, file formats, etc. | .csv, .tsv, etc. |
| | See data processing. | Workflow notebooks (e.g., [ |
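
As a rough illustration of how the sampling-site fields in the table above might be captured in a structured, checkable form, the following Python sketch defines a small record type and a validation helper. The class name `SampleSiteRecord`, the function `find_problems`, the field names, and all example values are hypothetical; they are not taken from MIxS, ENVO, or any repository's submission schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class SampleSiteRecord:
    """Hypothetical container for sampling-site metadata fields."""
    latitude: float                       # decimal degrees, -90 to 90
    longitude: float                      # decimal degrees, -180 to 180
    collected_at: datetime                # date/time of sampling
    temperature_c: Optional[float] = None
    biome: Optional[str] = None           # ideally a controlled-vocabulary term
    depth_m: Optional[float] = None       # depth of the sampling site


def find_problems(record: SampleSiteRecord) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if not -90.0 <= record.latitude <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= record.longitude <= 180.0:
        problems.append("longitude out of range")
    if record.collected_at.tzinfo is None:
        problems.append("collection time has no time zone")
    if not record.biome:
        problems.append("biome/ecosystem is missing")
    return problems


# Illustrative values only; they do not describe a real sample.
example = SampleSiteRecord(
    latitude=37.87,
    longitude=-122.25,
    collected_at=datetime(2022, 5, 1, 14, 30, tzinfo=timezone.utc),
    temperature_c=18.5,
    biome="temperate grassland biome",
    depth_m=0.1,
)
print(find_problems(example))
```

Checking records like this before submission is one possible way to catch missing or malformed fields early; a real workflow would validate against the target repository's own checklist instead of these ad hoc rules.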
Common PIDs and possible relationships between them.
| PID | Identifies… | Relationship between PIDs and more information |
|---|---|---|
| ORCID iD [ | People doing the science | Example contribution roles via Contributor Roles Taxonomy [ |
| DOI | Digital objects: DMPs, data, software, publications, proposals, protocols | DataCite Schema [ |
| ROR ID [ | Research organization where science happens | See Research Organization Registry (ROR) to search for organizations |
| IGSN [ | Physical samples collected and processed to generate data | See International GeoSample Number (IGSN); allows for parent–child relationship between samples; partnered with DataCite to track relationships [ |
| RRID [ | Resources (e.g., antibodies, model organisms, and software projects) used in the biomedical field | See Research Resource Identifiers (RRIDs); provides citation recommendations for use within publication text, often in the methods section |
| RAiD [ | Collection of PIDs generated by a research project | See Research Activity Identifier (RAiD) for more details |
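
Because PIDs are only useful for credit when they are recorded correctly, it can help to check them before linking them to other records. ORCID iDs, for example, end in a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below implements that check in Python; the function names `orcid_check_character` and `is_valid_orcid` are hypothetical, and the code only tests the format and checksum, not whether the iD is actually registered.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and checksum of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if any(ch not in "0123456789" for ch in base):
        return False
    return orcid_check_character(base) == check


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A checksum test of this kind catches most single-character typos, but confirming that an iD belongs to the intended person still requires looking it up in the ORCID registry.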