Wanted: Standards for FAIR taxonomic concept representations and relationships.

Beckett Sterner, Nathan Upham, Prashant Gupta, Caleb Powell, Nico M Franz.

Abstract

Making the most of biodiversity data requires linking observations of biological species from multiple sources both efficiently and accurately (Bisby 2000, Franz et al. 2016). Aggregating occurrence records using taxonomic names and synonyms is computationally efficient but is known to lose accuracy when the assumption of one-to-one relationships between names and biological entities breaks down (Remsen 2016, Franz and Sterner 2018). Taxonomic treatments and checklists provide authoritative information about the correct usage of names for species, including operational representations of the meanings of those names in the form of range maps, reference genetic sequences, or diagnostic traits. They increasingly provide taxonomic intelligence in the form of precise descriptions of the semantic relationships between different published names in the literature. Making this authoritative information Findable, Accessible, Interoperable, and Reusable (FAIR; Wilkinson et al. 2016) would be a transformative advance for biodiversity data sharing and help drive adoption and novel extensions of existing standards such as the Taxonomic Concept Schema and the OpenBiodiv Ontology (Kennedy et al. 2006, Senderov et al. 2018). We call for the greater, global Biodiversity Information Standards (TDWG) and taxonomy community to commit to extending and expanding on how FAIR applies to biodiversity data and to include practical targets and criteria for the publication and digitization of taxonomic concept representations and alignments in taxonomic treatments, checklists, and backbones.

Keywords:  FAIR Principles; open data; taxonomic intelligence

Year:  2021        PMID: 35462676      PMCID: PMC9028594          DOI: 10.3897/biss.5.75587

Source DB:  PubMed          Journal:  Biodivers Inf Sci Stand        ISSN: 2535-0897


As a motivating case, consider the abundantly sampled North American deer mouse, Peromyscus maniculatus (Wagner 1845), which was recently split from one continental species into five more narrowly defined forms, so that the name P. maniculatus is now only applied east of the Mississippi River (Bradley et al. 2019, Greenbaum et al. 2019). That single change instantly rendered ambiguous ~7% of North American mammal records in the Global Biodiversity Information Facility (n=242,663, downloaded 2021-06-04; GBIF.org 2021) and one third of all National Ecological Observatory Network (NEON) small mammal samples (n=10,256, downloaded 2021-06-27). While this type of ambiguity is common in name-based databases when species are split, the example of P. maniculatus is particularly striking for its impact upon biological questions ranging from hantavirus surveillance in North America to studies of climate change impacts upon rodent life-history traits. Of special relevance to NEON sampling is recent evidence suggesting that deer mice can potentially transmit SARS-CoV-2 (Griffin et al. 2021). Automating the updating of occurrence records in such cases and others will require operational representations of taxonomic concepts (e.g., range maps, reference sequences, and diagnostic traits) that are FAIR, in addition to taxonomic concept alignment information (Franz and Peet 2009). Despite steady progress, it remains difficult to find, access, and reuse authoritative information about how to apply taxonomic names even when it is already digitized. It can also be difficult to tell without manual inspection whether similar types of concept representations derived from multiple sources, such as range maps or reference sequences selected from different research articles or checklists, are in fact interoperable for a particular application.
The issue is therefore different from important ongoing efforts to digitize trait information in species circumscriptions, for example, and focuses on how already digitized knowledge can best be packaged to inform human experts and artificial intelligence applications (Sterner and Franz 2017). We therefore propose developing community guidelines and criteria for FAIR taxonomic concept representations as “semantic artefacts” of general relevance to linked open data and life sciences research (Le Franc et al. 2020).
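To make the deer mouse case concrete, the following is a minimal Python sketch of how a machine-readable range representation, combined with alignment information about a split, might be used to update name-based occurrence records. It is illustrative only and is not a method described in the abstract. The class and function names (ConceptRange, Occurrence, realign), the concept labels, and the longitude bands are all hypothetical: a real range map would be a polygon rather than a longitude band, the -90.0 boundary only loosely stands in for the Mississippi River, and the five post-split forms are collapsed to two for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ConceptRange:
    """Toy operational representation of a taxonomic concept (hypothetical)."""
    label: str
    min_lon: float  # western edge, inclusive
    max_lon: float  # eastern edge, exclusive

    def contains(self, lon: float) -> bool:
        return self.min_lon <= lon < self.max_lon


@dataclass(frozen=True)
class Occurrence:
    record_id: str
    name: str   # name string as stored in the aggregated database
    lon: float  # longitude of the observation


def realign(record: Occurrence, split_name: str,
            successors: List[ConceptRange]) -> Optional[str]:
    """Return the successor concept label for a record, or None if ambiguous."""
    if record.name != split_name:
        return record.name  # name not affected by the split
    hits = [c.label for c in successors if c.contains(record.lon)]
    # Exactly one matching range resolves the record; zero or several
    # matches leave it ambiguous and in need of expert review.
    return hits[0] if len(hits) == 1 else None


# Placeholder values, not real range data (see the assumptions above).
SUCCESSORS = [
    ConceptRange("Peromyscus maniculatus (post-split usage)", -90.0, -50.0),
    ConceptRange("western segregate (placeholder)", -130.0, -90.0),
]

records = [
    Occurrence("r1", "Peromyscus maniculatus", -75.0),
    Occurrence("r2", "Peromyscus maniculatus", -110.0),
    Occurrence("r3", "Peromyscus maniculatus", -140.0),
]
for rec in records:
    print(rec.record_id, realign(rec, "Peromyscus maniculatus", SUCCESSORS))
```

In this sketch a record is updated only when exactly one successor range contains it. Records that fall in no range, or in overlapping ranges, stay flagged as ambiguous, which is where interoperable concept representations from multiple sources would matter most.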
  9 in total

1.  The quiet revolution: biodiversity informatics and the internet.

Authors:  F A Bisby
Journal:  Science       Date:  2000-09-29       Impact factor: 47.728

2.  Standard data model representation for taxonomic information.

Authors:  J Kennedy; R Hyam; R Kukla; T Paterson
Journal:  OMICS       Date:  2006

3.  Microsatellite variation and evolution in the Peromyscus maniculatus species group.

Authors:  Scott E Chirhart; Rodney L Honeycutt; Ira F Greenbaum
Journal:  Mol Phylogenet Evol       Date:  2004-12-15       Impact factor: 4.286

4.  Two Influential Primate Classifications Logically Aligned.

Authors:  Nico M Franz; Naomi M Pier; Deeann M Reeder; Mingmin Chen; Shizhuo Yu; Parisa Kianmajd; Shawn Bowers; Bertram Ludäscher
Journal:  Syst Biol       Date:  2016-03-22       Impact factor: 15.683

5.  The use and limits of scientific names in biological informatics.

Authors:  David Remsen
Journal:  Zookeys       Date:  2016-01-07       Impact factor: 1.546

6.  To increase trust, change the social design behind aggregated biodiversity data.

Authors:  Nico M Franz; Beckett W Sterner
Journal:  Database (Oxford)       Date:  2018-01-01       Impact factor: 3.451

7.  SARS-CoV-2 infection and transmission in the North American deer mouse.

Authors:  Bryan D Griffin; Mable Chan; Nikesh Tailor; Emelissa J Mendoza; Anders Leung; Bryce M Warner; Ana T Duggan; Estella Moffat; Shihua He; Lauren Garnett; Kaylie N Tran; Logan Banadyga; Alixandra Albietz; Kevin Tierney; Jonathan Audet; Alexander Bello; Robert Vendramelli; Amrit S Boese; Lisa Fernando; L Robbin Lindsay; Claire M Jardine; Heidi Wood; Guillaume Poliquin; James E Strong; Michael Drebot; David Safronetz; Carissa Embury-Hyatt; Darwyn Kobasa
Journal:  Nat Commun       Date:  2021-06-14       Impact factor: 14.919

8.  The FAIR Guiding Principles for scientific data management and stewardship.

Authors:  Mark D Wilkinson; Michel Dumontier; IJsbrand Jan Aalbersberg; Gabrielle Appleton; Myles Axton; Arie Baak; Niklas Blomberg; Jan-Willem Boiten; Luiz Bonino da Silva Santos; Philip E Bourne; Jildau Bouwman; Anthony J Brookes; Tim Clark; Mercè Crosas; Ingrid Dillo; Olivier Dumon; Scott Edmunds; Chris T Evelo; Richard Finkers; Alejandra Gonzalez-Beltran; Alasdair J G Gray; Paul Groth; Carole Goble; Jeffrey S Grethe; Jaap Heringa; Peter A C 't Hoen; Rob Hooft; Tobias Kuhn; Ruben Kok; Joost Kok; Scott J Lusher; Maryann E Martone; Albert Mons; Abel L Packer; Bengt Persson; Philippe Rocca-Serra; Marco Roos; Rene van Schaik; Susanna-Assunta Sansone; Erik Schultes; Thierry Sengstag; Ted Slater; George Strawn; Morris A Swertz; Mark Thompson; Johan van der Lei; Erik van Mulligen; Jan Velterop; Andra Waagmeester; Peter Wittenburg; Katherine Wolstencroft; Jun Zhao; Barend Mons
Journal:  Sci Data       Date:  2016-03-15       Impact factor: 6.444

9.  OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system.

Authors:  Viktor Senderov; Kiril Simov; Nico Franz; Pavel Stoev; Terry Catapano; Donat Agosti; Guido Sautter; Robert A Morris; Lyubomir Penev
Journal:  J Biomed Semantics       Date:  2018-01-18