John Wieczorek, David Bloom, Robert Guralnick, Stan Blum, Markus Döring, Renato Giovanni, Tim Robertson, David Vieglais.
Abstract
Biodiversity data derive from myriad sources stored in various formats on many distinct hardware and software platforms. An essential step towards understanding global patterns of biodiversity is to provide a standardized view of these heterogeneous data sources to improve interoperability. Fundamental to this advance are definitions of common terms. This paper describes the evolution and development of Darwin Core, a data standard for publishing and integrating biodiversity information. We focus on the categories of terms that define the standard, differences between simple and relational Darwin Core, how the standard has been implemented, and the community processes that are essential for maintenance and growth of the standard. We present case-study extensions of the Darwin Core into new research communities, including metagenomics and genetic resources. We close by showing how Darwin Core records are integrated to create new knowledge products documenting species distributions and changes due to environmental perturbations.
Year: 2012 PMID: 22238640 PMCID: PMC3253084 DOI: 10.1371/journal.pone.0029715
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Scope of Darwin Core: The Standard, deriving from previous standards work (e.g., Dublin Core), describes core sets (e.g., organismal, taxonomic) of characteristics of biodiversity, which are applicable in many biological domains (e.g., Paleontology, Botany).
The standard can be extended to cover details of specific sub-disciplines (e.g., Genetic Resources, Herbaria, Taxonomic Checklists). Collaborations with other standards organizations (Genomic Standards Consortium (GSC)) extend Darwin Core for new disciplines (Genomics, Metagenomics, Gene Marker Sequences).
Figure 2. Darwin Core Categories: Simple Darwin Core comprises seven categories of terms (green).
This subset of Darwin Core terms represents descriptive data about organisms that can be represented in one file with one row per record and one column per term. Two additional categories (orange) expand Darwin Core with concepts that require a more complex data structure, such as multiple measurements from a single specimen, and cannot be represented easily in Simple Darwin Core.
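The difference between the flat and the more complex structure can be illustrated with a minimal sketch. It assumes a comma-separated layout; the column names are Darwin Core terms, but the record values and the helper functions `read_table` and `attach_measurements` are hypothetical and chosen only for illustration. The first table has one row per record and one column per term, while the repeated measurements for a single specimen go in a second table keyed by `occurrenceID`.

```python
import csv
import io

# Simple Darwin Core: one flat table, one row per record, one column per term.
# Column names are Darwin Core terms; the values are invented for illustration.
SIMPLE_DWC = """occurrenceID,basisOfRecord,scientificName,eventDate,decimalLatitude,decimalLongitude
occ-1,PreservedSpecimen,Puma concolor,2009-05-14,-41.05,-71.53
occ-2,HumanObservation,Puma concolor,2010-11-02,-40.98,-71.40
"""

# Several measurements of one specimen cannot share a single row,
# so they are kept in a second table that points back to the record.
MEASUREMENTS = """occurrenceID,measurementType,measurementValue,measurementUnit
occ-1,body mass,52.3,kg
occ-1,tail length,71,cm
"""


def read_table(text):
    """Parse CSV text into a list of dicts, one dict per row."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_measurements(occurrences, measurements):
    """Return copies of the occurrence rows, each with its list of measurements."""
    by_id = {}
    for m in measurements:
        by_id.setdefault(m["occurrenceID"], []).append(m)
    return [
        {**occ, "measurements": by_id.get(occ["occurrenceID"], [])}
        for occ in occurrences
    ]


occurrences = read_table(SIMPLE_DWC)
linked = attach_measurements(occurrences, read_table(MEASUREMENTS))

for rec in linked:
    print(rec["occurrenceID"], rec["scientificName"], len(rec["measurements"]))
```

The flat table alone is enough for the first kind of data; the second table is needed only when a record carries a variable number of related values.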