Gil Nelson, Patrick Sweeney, Lisa E Wallace, Richard K Rabeler, Dorothy Allard, Herrick Brown, J Richard Carter, Michael W Denslow, Elizabeth R Ellwood, Charlotte C Germain-Aubrey, Ed Gilbert, Emily Gillespie, Leslie R Goertzen, Ben Legler, D Blaine Marchant, Travis D Marsico, Ashley B Morris, Zack Murrell, Mare Nazaire, Chris Neefus, Shanna Oberreiter, Deborah Paul, Brad R Ruhfel, Thomas Sasek, Joey Shaw, Pamela S Soltis, Kimberly Watson, Andrea Weeks, Austin R Mast.
Abstract
Effective workflows are essential components in the digitization of biodiversity specimen collections. To date, no comprehensive, community-vetted workflows have been published for digitizing flat sheets and packets of plants, algae, and fungi, even though the latest estimates suggest that only 33% of herbarium specimens have been digitally transcribed, 54% of herbaria use a specimen database, and 24% are imaging specimens. In 2012, iDigBio, the U.S. National Science Foundation's (NSF) coordinating center and national resource for the digitization of public, nonfederal U.S. collections, launched several working groups to address this deficiency. Here, we report the development of 14 workflow modules with 7-36 tasks each. These workflows represent the combined work of approximately 35 curators, directors, and collections managers representing more than 30 herbaria, including 15 NSF-supported plant-related Thematic Collections Networks and collaboratives. The workflows are provided for download as Portable Document Format (PDF) and Microsoft Word files. Customization of these workflows for specific institutional implementation is encouraged.
Keywords: citizen science; digital imaging; digitization; herbarium; specimen database; workflow
Year: 2015 PMID: 26421256 PMCID: PMC4578381 DOI: 10.3732/apps.1500065
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Fig. 1. Example object-to-data-to-image workflow. This workflow captures data directly from labels on physical specimens. Images of specimens may or may not be captured. Barcodes are usually applied inline or as an iterative step through which dozens or hundreds of barcodes are affixed, immediately preceding data entry. Pre-digitization curation, including nomenclatural annotations and specimen organization, is usually important in this workflow. The need for specimen conservation may be discovered and remedied as physical specimens are passed to data entry technicians or following the specimen handling associated with imaging procedures.
Fig. 2. Example object-to-image-to-data workflow. This workflow captures specimen images and uses these images as the basis for data capture. Barcodes are sometimes applied inline as the step immediately preceding imaging (shown optionally) and at other times through an iterative process during which several dozen or several hundred barcodes are applied. Nomenclatural annotation during pre-digitization ensures synchronization of name-on-folder with name-on-specimen. The need for specimen conservation may be discovered and remedied before or after imaging.
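The difference between the two workflows in Figs. 1 and 2 is mainly one of step order. The following minimal Python sketch makes that ordering explicit. The step names, the `Step` class, and the `required_steps` helper are illustrative assumptions and are not taken from the published modules; barcoding is shown as a single inline step, although the captions note it may also be done iteratively in batches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One task in a digitization workflow (hypothetical structure)."""
    name: str
    optional: bool = False


# Fig. 1: data are captured from the physical label; imaging may or may not occur.
OBJECT_TO_DATA_TO_IMAGE = [
    Step("pre-digitization curation"),
    Step("apply barcode"),
    Step("capture data from specimen label"),
    Step("image specimen", optional=True),
]

# Fig. 2: the specimen is imaged first and the image is the basis for data capture.
OBJECT_TO_IMAGE_TO_DATA = [
    Step("pre-digitization curation"),
    Step("apply barcode"),
    Step("image specimen"),
    Step("capture data from image"),
]


def required_steps(workflow):
    """Return the names of the steps that are not marked optional."""
    return [step.name for step in workflow if not step.optional]
```

Calling `required_steps(OBJECT_TO_DATA_TO_IMAGE)` omits the imaging step, reflecting that images are optional in the first workflow but a prerequisite for data capture in the second.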
| Module 1: Pre-digitization Curation¹ |
| Module 2: Selecting Components for an Imaging Station |
| Module 3: Imaging Station Setup, Camera/Copy Stand¹ |
| Module 4: Imaging Station Setup, Light Box |
| Module 5: Imaging Station Setup, Scanner¹ |
| Module 6: Imaging¹ |
| Module 7: Image Processing¹ |
| Module 8: Organizing and Implementing a Public Participation Imaging Blitz |
| Module 9: Image Archiving |
| Module 10: Selecting a Database |
| Module 11: Data Capture¹ |
| Module 12: Organizing and Implementing a Public Participation Transcription Blitz |
| Module 13: Georeferencing |
| Module 14: Proactive Digitization |
¹ Modules initially completed by the DROID Flat Sheets and Packets Working Group.
| Plants, Herbivores, and Parasitoids: A Model System for the Study of Tri-Trophic Associations¹ |
| North American Lichens and Bryophytes: Sensitive Indicators of Environmental Quality and Change¹ |
| Mobilizing New England Vascular Plant Specimen Data to Track Environmental Change¹ |
| The Macrofungi Collection Consortium: Unlocking a Biodiversity Resource for Understanding Biotic Interactions, Nutrient Cycling and Human Affairs¹ |
| The Macroalgal Herbarium Consortium: Accessing 150 Years of Specimen Data to Understand Changes in the Marine/Aquatic Environment¹ |
| Documenting the Occurrence through Space and Time of Aquatic Non-indigenous Fish, Mollusks, Algae, and Plants Threatening North America’s Great Lakes¹ |
| The Key to the Cabinets: Building and Sustaining a Research Database for a Global Biodiversity Hotspot (SERNEC)¹ |
| SEINet: North American Virtual Flora Network |
| Consortium of California Herbaria |
| Consortium of Pacific Northwest Herbaria |
| Magnolia grandiFLORA: The Digital Herbarium for Mississippi |
| CyberFlora Louisiana |
| The GA-VSC Herbaria Collaborative: Phase I of a Statewide Consortium |
| Imaging the Tall Timbers Research Station’s Biological Research Collections |
| The Deep South Plant Specimen Imaging Project |
¹ A Thematic Collection Network funded by NSF’s Advancing Digitization of Biodiversity Collections Program.
| Will data be captured from physical specimens or images of specimens? |
| Will populated database records include all data recorded on the label or an abbreviated set of label data (often called skeletal records)? |
| Will data be entered directly into the permanent database or into an intermediate transitory format for later uploading (e.g., a spreadsheet)? |
| Will sensitive data be redacted? |
| Will georeferencing and other enrichment data be recorded concurrently with label data, in batch through processes integrated into the permanent or transitory database, or as a separate activity? |
| Do clear instructions exist for handling entry of duplicate specimens? Can entry of the repeated information be made more efficient for duplicates held within an institution or between institutions using the same specimen data management system (e.g., Symbiota)? |
| Will optical character recognition (OCR) or voice recognition be used? |
| Will verification history or other annotation data be recorded? |
| Are quality assurance and verification protocols in place to enhance accuracy? |
| Are data entry technicians adequately selected and trained, and do they have at their disposal a detailed written protocol to guide decisions about how data should be parsed and entered? |
| Are procedures in place to route damaged specimens for conservation? |
| Are procedures in place for handling misfiled specimens? |
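Two of the questions above, whether to capture full or skeletal records and whether to redact sensitive data, can be illustrated with a short Python sketch. The field names, the choice of which fields count as skeletal or sensitive, and the placeholder record are all assumptions made for illustration; each institution would define its own sets in its written protocol.

```python
# Placeholder record with invented values; not data from any real specimen.
FULL_RECORD = {
    "barcode": "EXAMPLE0001",
    "scientific_name": "Magnolia grandiflora",
    "collector": "A. Collector",
    "collection_date": "1975-05-12",
    "state": "Mississippi",
    "county": "Example County",
    "locality": "2 km north of an example town",
    "latitude": 31.0,
    "longitude": -89.0,
}

# Hypothetical field sets; real choices are an institutional decision.
SKELETAL_FIELDS = ("barcode", "scientific_name", "state", "county")
SENSITIVE_FIELDS = ("locality", "latitude", "longitude")


def to_skeletal(record, fields=SKELETAL_FIELDS):
    """Return a new record holding only the abbreviated (skeletal) fields."""
    return {key: record[key] for key in fields if key in record}


def redact(record, sensitive=SENSITIVE_FIELDS):
    """Return a copy of the record with sensitive values replaced by None."""
    return {key: (None if key in sensitive else value)
            for key, value in record.items()}
```

Both helpers return new dictionaries and leave the original record untouched, so a complete record can be kept in the permanent database while a skeletal or redacted view is shared publicly.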