
The trouble with triplets in biodiversity informatics: a data-driven case against current identifier practices.

Robert Guralnick, Tom Conlin, John Deck, Brian J Stucky, Nico Cellinese.

Abstract

The biodiversity informatics community has discussed aspirations and approaches for assigning globally unique identifiers (GUIDs) to biocollections for nearly a decade. During that time, and despite misgivings, the de facto standard identifier has become the "Darwin Core Triplet", which is a concatenation of values for institution code, collection code, and catalog number associated with biocollections material. Our aim is not to rehash the challenging discussions regarding which GUID system in theory best supports the biodiversity informatics use case of discovering and linking digital data across the Internet, but how well we can link those data together at this moment, utilizing the current identifier schemes that have already been deployed. We gathered Darwin Core Triplets from a subset of VertNet records, along with vertebrate records from GenBank and the Barcode of Life Data System, in order to determine how Darwin Core Triplets are deployed "in the wild". We asked if those triplets follow the recommended structure and whether they provide an easy and unambiguous means to track from specimen records to genetic sequence records. We show that Darwin Core Triplets are often riddled with semantic and syntactic errors when deployed and curated in practice, despite specifications about how to construct them. Our results strongly suggest that Darwin Core Triplets that have not been carefully curated are not currently serving a useful role for relinking data. We briefly consider needed next steps to overcome current limitations.
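As an illustration of the identifier described above, the following minimal sketch builds and checks a triplet from its three component values. It is not the authors' code: the colon separator, the function names, and the example codes (`XYZ`, `Mamm`, `12345`) are assumptions made for this example, and the case-insensitive comparison is only one possible matching rule.

```python
from typing import NamedTuple, Optional

# Assumption: a colon-delimited form. The abstract only says the triplet is
# a "concatenation" of the three values and does not name a separator.
SEPARATOR = ":"


class Triplet(NamedTuple):
    institution_code: str
    collection_code: str
    catalog_number: str


def build_triplet(institution_code: str, collection_code: str, catalog_number: str) -> str:
    """Concatenate the three values into a single identifier string."""
    return SEPARATOR.join([institution_code, collection_code, catalog_number])


def parse_triplet(text: str) -> Optional[Triplet]:
    """Return a Triplet if text has exactly three non-empty parts, else None."""
    parts = [part.strip() for part in text.split(SEPARATOR)]
    if len(parts) != 3 or not all(parts):
        return None
    return Triplet(*parts)


def same_specimen(a: str, b: str) -> bool:
    """Compare two triplets part by part, ignoring letter case (hypothetical rule)."""
    ta, tb = parse_triplet(a), parse_triplet(b)
    if ta is None or tb is None:
        return False
    return tuple(p.lower() for p in ta) == tuple(p.lower() for p in tb)
```

A string with a missing or empty component, such as `XYZ:12345` or `XYZ::12345`, fails the structural check in this sketch. That is the kind of syntactic problem the abstract reports for triplets deployed in practice; semantic errors, such as a wrong institution code, would pass this check undetected.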

Year:  2014        PMID: 25470125      PMCID: PMC4254916          DOI: 10.1371/journal.pone.0114069

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


