L Suhrbier, W-H Kusber, O Tschöpe, A Güntsch, W G Berendsohn.
Abstract
Biological research collections holding billions of specimens world-wide provide the most important baseline information for systematic biodiversity research. Increasingly, specimen data records become available in virtual herbaria and data portals. The traditional (physical) annotation procedure fails here, so that an important pathway of research documentation and data quality control is broken. In order to create an online annotation system, we analysed, modeled and adapted traditional specimen annotation workflows. The AnnoSys system accesses collection data from either conventional web resources or the Biological Collection Access Service (BioCASe) and accepts XML-based data standards like ABCD or DarwinCore. It comprises a searchable annotation data repository, a user interface, and a subscription-based message system. We describe the main components of AnnoSys and its current and planned interoperability with biodiversity data portals and networks. Details are given on the underlying architectural model, which implements the W3C OpenAnnotation model and allows the adaptation of AnnoSys to different problem domains. Advantages and disadvantages of different digital annotation and feedback approaches are discussed. For the biodiversity domain, AnnoSys proposes best practice procedures for digital annotations of complex records. Database URL: https://annosys.bgbm.fu-berlin.de/AnnoSys/AnnoSys
Year: 2017 PMID: 28365735 PMCID: PMC5502362 DOI: 10.1093/database/bax018
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Example of traditional annotations on a herbarium specimen collected in the early 19th century: images of the same herbarium specimen (B 10 0242372) taken in the 1930s (left, identified as Guatteria poiteaui) and in 2006 (right, identified as Cremastosperma brevipes). This demonstrates the potential disconnect between the virtual specimen image (or record) and the actual object; in the future we expect that the virtual specimen will increasingly become the prime object of annotations, which can be accessed and managed with AnnoSys. (A) Leaf mounted on the herbarium sheet; (B) fruit (dissected on the right); (C) original herbarium label; (D) handwritten early annotation including additional morphological details (from duplicates?); (E) label indicating the type (name-giving) status of the specimen; (F) property-indicating stamp of the Berlin herbarium (cut off on the left); (G, H) ephemeral photographic negative number and scale bar; (I) permanent barcode label (UUID); (J) permanent scale bar; (K) ephemeral colour chart; (L) stamp indicating digitisation; (M) (handwriting): internal documentation of a loan; (N) annotation label as of 1938; (O) annotation as of 1998. Source (left image): The Field Museum of Natural History (2014). J. F. Macbride's Historical Photographs (1929–39) of Type Specimens from Berlin (B) (CC BY-NC 4.0); (right image): Röpert D. (ed.) 2000+ (continuously updated): Digital specimen images at the Herbarium Berolinense. Stable identifier: http://herbarium.bgbm.org/object/B100242372 (CC BY-SA 3.0) (accessed 9 June 2016).
Figure 2. Simplified AnnoSys system workflow. Users access annotations via biodiversity data portals or directly via the AnnoSys user interface. Annotations are publicly visible in the data portals directly after publication.
Biodiversity portals integrating AnnoSys.
| Name | URL | Scope |
|---|---|---|
| GBIF | | Worldwide biodiversity data |
| GGBN | | Worldwide DNA and tissue samples and related biodiversity data |
| EDIT Specimen and Observation Explorer for Taxonomists | | Worldwide biodiversity data |
| World Flora Online Specimen Explorer for Phytotaxonomists | | Worldwide botanical specimen data |
| BioCASE, Biological Collection Service for Europe | | Biodiversity data of Europe |
| BiNHum Sammlungsportal des Humboldt-Rings | | Portal of Collections of institutions of the Humboldt Ring, Germany |
| VH/de German Virtual Herbarium | | Portal of German Herbaria in Germany |
| GBIF Deutschland Botanik | | Biodiversity data of Germany, AnnoSys implemented |
| GBIF.DE Algae and Protists | | Worldwide biodiversity data of algae and protists |
| Herbarium Berolinense - Virtual Herbarium | | Digitised herbarium data at B |
| BIOCASE portal for BGBM collection | | Specimen data of all collections of the BGBM |
| JACQ | | Herbarium specimen management system, used by 30 institutions |
Currently only for ABCD records.
Implemented filter types for specimen records in queries and for notifications.
| Topic | Content |
|---|---|
| Species | Identification: scientific name of the species, comprising genus name and species epithet |
| Genus | The genus part of the scientific name |
| Family | Name of the family the scientific name is assigned to |
| Collector name | Name of a person or a team who collected the specimen |
| Collector’s number | (Field) number given by collector to the specimen |
| Country | Name of the country where the specimen was collected |
| Institution code | Publishing institution |
| Collection code | (Sub-) collection within the institution |
| Catalogue number | Specimen’s identifier |
| Identified by | Name of person who identified the specimen |
| Annotator | Annotator’s name |
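The filter topics listed above can be pictured as key/value criteria tested against a specimen record. The following is a minimal sketch only: the function name `matches_filters`, the dictionary keys, and the case-insensitive exact-match rule are assumptions made for illustration, not the matching logic AnnoSys actually uses.

```python
def matches_filters(record: dict, filters: dict) -> bool:
    """Return True if the record satisfies every filter criterion.

    Assumption: a criterion matches when the record holds the same value
    for that topic, ignoring case and surrounding whitespace.
    """
    for topic, wanted in filters.items():
        actual = record.get(topic)
        if actual is None or actual.strip().lower() != wanted.strip().lower():
            return False
    return True


# Illustrative record; the key names are hypothetical.
record = {
    "genus": "Cremastosperma",
    "species": "Cremastosperma brevipes",
    "institution_code": "B",
    "catalogue_number": "B 10 0242372",
}

print(matches_filters(record, {"genus": "cremastosperma"}))
print(matches_filters(record, {"country": "Peru"}))
```

The same predicate could serve both uses named in the caption: filtering query results and deciding whether a subscriber should be notified about a new annotation.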
Basic Annotation Types.
| Annotation type | Content | Elements involved |
|---|---|---|
| Determination | Elements used when an organism is identified or its identification is revised | Full scientific name, identification made by, identification date, reference URI etc. |
| Gathering | Elements used to describe the collection event and the locality where a specimen has been collected | Collector, collector’s field number, locality, country, latitude, longitude, altitude, date etc. |
| Nomenclatural type | Elements referring to the name bearing specimen of an organism | Type status, full scientific name, person assigning the type status, reference, URI etc. |
| Scientific name | Elements to orthographically correct a given scientific name | Full scientific name, genus, first epithet, infraspecific epithet, author, higher taxon name etc. |
Figure 3. AnnoSys workflow. White arrows: user actions; dark gray arrows: data flow; light gray arrows: mailing system.
Figure 4. Overview of the AnnoSys system architecture.
Figure 5. Data enrichment within the AnnoSys workflow.
Components of tripleId, LSID and AnnoSys Identifier.
| Name | Content | TripleId | LSID | AnnoSysId |
|---|---|---|---|---|
| Institution identifier | Institution code | yes | yes | yes |
| Collection identifier | Collection code | yes | yes | yes |
| Unit identifier | Catalogue number | yes | yes | yes |
| Version | Revision | no | yes | yes |
| Format | Namespace prefix | no | no | yes |
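The table above says which components each identifier scheme carries. A minimal sketch of that relationship follows; the class name `RecordIdentity`, the field names, and the example values for collection code, revision and namespace prefix are hypothetical, and the sketch deliberately returns plain tuples because the actual string syntax of the AnnoSys identifier is not given here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RecordIdentity:
    institution_code: str                     # institution identifier
    collection_code: str                      # collection identifier
    catalogue_number: str                     # unit identifier
    revision: Optional[str] = None            # version (LSID, AnnoSysId)
    namespace_prefix: Optional[str] = None    # format (AnnoSysId only)

    def triple_id(self) -> Tuple[str, str, str]:
        # TripleId: institution, collection and unit only.
        return (self.institution_code, self.collection_code, self.catalogue_number)

    def lsid_components(self) -> tuple:
        # LSID: the triple plus a version.
        return self.triple_id() + (self.revision,)

    def annosys_components(self) -> tuple:
        # AnnoSysId: the LSID components plus the record format.
        return self.lsid_components() + (self.namespace_prefix,)


# Illustrative values; only the catalogue number comes from Figure 1.
rec = RecordIdentity("B", "ExampleCollection", "B 10 0242372", "1", "abcd")
print(rec.triple_id())
print(rec.annosys_components())
```

Keeping version and format as separate fields makes it visible that two annotations can refer to the same physical specimen while targeting different revisions or serializations of its record.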
Figure 6. Annotation data model of the current AnnoSys application.
Meta annotation types.
| Annotation type | Content | Elements involved |
|---|---|---|
| Curatorial | Meta annotation referring to annotations commented on by a curator. Each annotated value can be accepted, rejected and discussed | All elements of the curated annotation |
| Batch | Meta annotation linking identical annotations referring to different annotated records | All elements of the linked annotation(s) |
| CuratorialBatch | Meta annotation referring to all curated annotations of a Batch and its annotation type. Each annotated value can be accepted, rejected or commented on by the curator for all annotations linked by the referred Batch | All elements of the curated BatchAnnotations |
Supported expectations.
| Expectation | Description |
|---|---|
| Add | Add new element according to the element selector to annotated data record |
| Remove | Remove element from annotated data record |
| Update | Update element in annotated data record |
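The three expectations describe what an annotator expects to happen to an element of the annotated record. The sketch below illustrates only their semantics on an in-memory copy: the flat dictionary keyed by an element selector string, the selector names, and the function `apply_expectation` are assumptions for illustration, and nothing here represents how a collection database is actually updated.

```python
from enum import Enum
from typing import Optional


class Expectation(Enum):
    ADD = "Add"
    REMOVE = "Remove"
    UPDATE = "Update"


def apply_expectation(record: dict, selector: str,
                      expectation: Expectation,
                      value: Optional[str] = None) -> dict:
    """Return a copy of the record with the expected change applied."""
    result = dict(record)  # leave the original record untouched
    if expectation is Expectation.ADD:
        if selector in result:
            raise KeyError(f"element already present: {selector}")
        result[selector] = value
    elif expectation is Expectation.UPDATE:
        if selector not in result:
            raise KeyError(f"element missing: {selector}")
        result[selector] = value
    elif expectation is Expectation.REMOVE:
        if selector not in result:
            raise KeyError(f"element missing: {selector}")
        del result[selector]
    return result


# Hypothetical selectors; the names come from the Figure 1 caption.
original = {"FullScientificName": "Guatteria poiteaui"}
revised = apply_expectation(original, "FullScientificName",
                            Expectation.UPDATE, "Cremastosperma brevipes")
print(revised)
```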
Supported decisions in curatorial annotations.
| Decision | Description |
|---|---|
| Accepted | Element accepted by curator and updated in collection database |
| Rejected | Element rejected by curator and not updated in collection database |
| Undecided | Further processing of element in collection database not yet decided |
| Update | Element already updated for another reason |
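Because a curatorial annotation records one decision per annotated element, a curator's response can be grouped by decision. This is a minimal sketch under assumptions: the `Decision` enum mirrors the four values in the table, while `summarize` and the element names are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"
    UNDECIDED = "Undecided"
    UPDATE = "Update"


def summarize(curation: dict) -> dict:
    """Group annotated element names by the curator's decision."""
    summary = {decision: [] for decision in Decision}
    for element, decision in curation.items():
        summary[decision].append(element)
    return summary


# Hypothetical curated elements of a single annotation.
curation = {
    "FullScientificName": Decision.ACCEPTED,
    "IdentifiedBy": Decision.REJECTED,
    "Country": Decision.UNDECIDED,
}
summary = summarize(curation)
print(summary[Decision.UNDECIDED])
```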
Figure 7. Curatorial annotation data model.
Figure 8. Batch annotation data model.
Figure 9. Open Annotation implementation of the AnnoSys Annotation Data Model.
Direct mappings from AnnoSys' annotation metadata into the updated OA Data Model.
| AnnoSys data model | OA part | OA property |
|---|---|---|
| Motivation | Provenance | oa:motivatedBy |
| Datetime (of the creation) | Provenance | oa:annotatedAt |
| Annotation GUID | Provenance | annotation resource uri |
| Annotator | Provenance | oa:annotatedBy |
| Record GUID | Target | oa:hasSource |
| Record document format | Target | oa:cachedSource (from oa:hasState) |
| Record document version | Target | oa:when (from oa:hasState) |
Direct mappings from AnnoSys annotated elements to the updated OA Data Model.
| AnnoSys data model | OA part | OA property |
|---|---|---|
| Element selector | Target | oa:hasSelector |
| Expectation | Body | rdf:type |
| Annotated value | Body | rdf:value |
| Comment | Body | dcterms:description |
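The two mapping tables (annotation metadata and annotated elements) can be read together as one annotation with a target and a body. The sketch below arranges the listed properties in a nested dictionary. The property names are taken from the tables; the nesting under `oa:hasTarget`, `oa:hasBody` and `oa:hasState`, the use of `@id` for the annotation resource URI, the input key names, and all example values are assumptions for illustration and not the RDF serialization AnnoSys produces.

```python
def to_open_annotation(meta: dict, element: dict) -> dict:
    """Arrange AnnoSys annotation fields under the OA properties from the tables."""
    return {
        "@id": meta["annotation_guid"],            # annotation resource URI
        "oa:motivatedBy": meta["motivation"],
        "oa:annotatedAt": meta["datetime"],
        "oa:annotatedBy": meta["annotator"],
        "oa:hasTarget": {
            "oa:hasSource": meta["record_guid"],
            "oa:hasState": {
                "oa:cachedSource": meta["record_format"],
                "oa:when": meta["record_version"],
            },
            "oa:hasSelector": element["selector"],
        },
        "oa:hasBody": {
            "rdf:type": element["expectation"],
            "rdf:value": element["value"],
            "dcterms:description": element["comment"],
        },
    }


# Hypothetical input values.
meta = {
    "annotation_guid": "urn:example:annotation:1",
    "motivation": "oa:editing",
    "datetime": "2016-06-09T12:00:00Z",
    "annotator": "urn:example:person:1",
    "record_guid": "http://herbarium.bgbm.org/object/B100242372",
    "record_format": "abcd",
    "record_version": "1",
}
element = {
    "selector": "FullScientificName",
    "expectation": "Update",
    "value": "Cremastosperma brevipes",
    "comment": "Revised identification.",
}
annotation = to_open_annotation(meta, element)
print(annotation["oa:hasTarget"]["oa:hasSource"])
```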
Figure 10. W3C Open Annotation implementation of AnnoSys Curatorial Annotations.
Direct mappings from AnnoSys' curated elements into OA's Data Model.
| AnnoSys data model | W3C OA part | W3C OA property |
|---|---|---|
| Annotated element | Specific body | oa:hasScope |
| Curatorial processing | Body | decision:hasResult |
| Comment | Body | dcterms:description |
Figure 11. W3C Open Annotation implementation of AnnoSys Batch Annotations.