F Andres Rivera-Quiroz, Booppa Petcharad, Jeremy A Miller.
Abstract
Taxonomic literature contains information about virtually every known species on Earth. In many cases, all that is known about a taxon is contained in this kind of literature, particularly for the most diverse and understudied groups. Taxonomic publications in the aggregate have documented a vast amount of specimen data. Among other things, these data constitute evidence of the existence of a particular taxon within a spatial and temporal context. When knowledge about a particular taxonomic group is rudimentary, investigators motivated to contribute new knowledge can use legacy records to guide them in their search for new specimens in the field. However, these legacy data are in the form of unstructured text, making them difficult to extract and analyze without a human interpreter. Here, we used a combination of semi-automatic tools to extract and categorize specimen data from taxonomic literature of one family of ground spiders (Liocranidae). We tested the application of these data on fieldwork optimization, using the relative abundance of adult specimens reported in literature as a proxy to find the best times and places for collecting the species (Teutamus politus) and its relatives (Teutamus group, TG) within Southeast Asia. Based on these analyses, we decided to collect in three provinces in Thailand during the months of June and August. With our approach, we were able to collect more specimens of T. politus (188 specimens, 95 adults) than all the previous records in literature combined (102 specimens). Our approach was also effective for sampling other representatives of the TG, yielding at least one representative of every TG genus previously reported for Thailand. In total, our samples contributed 231 specimens (134 adults) to the 351 specimens previously reported in the literature for this country.
Our results exemplify one application of mined literature data that allows investigators to more efficiently allocate effort and resources for the study of neglected, endangered, or interesting taxa and geographic areas. Furthermore, the integrative workflow demonstrated here shares specimen data with global online resources like Plazi and GBIF, meaning that others can freely reuse these data and contribute to them in the future. The contributions of the present study represent an increase of more than 35% in the taxonomic coverage of the TG in GBIF based on the number of species. Also, our extracted data represent 72% of the occurrences now available through GBIF for the TG and more than 85% of occurrences of T. politus. Taxonomic literature is a key source of undigitized biodiversity data for taxonomic groups that are underrepresented in the current biodiversity data sphere. Mobilizing these data is key to understanding and protecting some of the less well-known domains of biodiversity.
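The idea of using relative abundance of adult specimens as a proxy for the best collecting times and places can be illustrated with a minimal sketch. This is not the authors' workflow: the record layout, the function names `relative_abundance` and `rank_targets`, and all values in `RECORDS` are hypothetical and chosen only to show the aggregation step.

```python
from collections import defaultdict

# Hypothetical literature records: (province, collection month, adult specimens).
# These values are invented for illustration and are not taken from the study.
RECORDS = [
    ("Province A", 7, 12),
    ("Province A", 8, 9),
    ("Province B", 7, 4),
    ("Province B", 1, 2),
    ("Province C", 8, 15),
    ("Province C", 3, 1),
]


def relative_abundance(records):
    """Return the share of all adult specimens for each (province, month)."""
    totals = defaultdict(int)
    for province, month, adults in records:
        totals[(province, month)] += adults
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: count / grand_total for key, count in totals.items()}


def rank_targets(records, top_n=3):
    """Rank province/month combinations by relative adult abundance."""
    shares = relative_abundance(records)
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    for (province, month), share in rank_targets(RECORDS):
        print(f"{province}, month {month}: {share:.1%} of adult records")
```

In practice the input would be the specimen records extracted from the literature, and the highest-ranked combinations would be candidate field sites and months.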
Year: 2020 PMID: 32978432 PMCID: PMC7519673 DOI: 10.1038/s41598-020-72549-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Maps of liocranid spider distribution based on geographic data extracted from taxonomic literature using Plazi’s retrospective workflow (see Supplementary Table 1 for the whole set of documents used). Maps generated in RStudio[28–30]. (a) Family Liocranidae worldwide. (b) Family Liocranidae in Southeast Asia (SEA). (c) Genus Oedignatha. (d) Sphingius. (e) Teutamus. (f) Jacaena. (g) Koppe. (h) Sesieutes. (i) Sudharmia. Brown shades represent family distribution and blue shades represent genus distributions. Color intensity corresponds to numbers of specimens per country.
Figure 2. Distribution of the Teutamus group in Southeast Asia according to taxonomic literature (based on data extracted from 23 studies[19–23,31–48] using Plazi’s retrospective workflow). (a) Proportion of specimens reported per country, with detail of provinces in Thailand. (b) Temporal and spatial distribution of collections for the past 40 years. ● = Indonesia, ▲ = Malaysia, ⦻ = Thailand, ◆ = Philippines, ⊠ = Vietnam.
Figure 3. Seasonal distribution of adult specimens of the Teutamus group in Thailand based on data extracted from 2 studies[19,21] using Plazi’s retrospective workflow. (a) Grey area indicates total number of specimens; lines detail richness per genus in literature. (b) Relative abundances of males and females of Teutamus politus. Brown shades indicate specimens in literature; blue shades indicate specimens in our study.
Table 1. Records of Teutamus group (TG) species from three Thai provinces.
| Province | Species | Spp. in literature ♂ | Spp. in literature ♀ | Spp. in literature Total | Spp. July–August ♂ | Spp. July–August ♀ | Spp. July–August Total | Spp. in our study ♂ | Spp. in our study ♀ | Spp. in our study Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Chiang Mai | | – | 4 | | – | – | | – | – | |
| Chiang Mai | | 8 | 5 | | – | – | | 3 | – | |
| Chiang Mai | | 3 | 3 | | – | – | | – | – | |
| Chiang Mai | | 3 | 9 | | – | 3 | | – | – | |
| Chiang Mai | | 6 | 5 | | 2 | 2 | | 1 | 1 | |
| Chiang Mai | | 8 | 15 | | 6 | 9 | | 1 | 6 | |
| Chiang Mai | | 5 | 4 | | – | – | | – | – | |
| Chiang Mai | | 16 | 6 | | – | – | | – | – | |
| Chiang Mai | | 17 | 3 | | – | – | | – | – | |
| Chiang Mai | | – | – | | – | – | | 1 | – | |
| Krabi | | – | – | | – | – | | 1 | 1 | |
| Krabi | | 2 | – | | 2 | – | | – | – | |
| Krabi | | – | 1 | | – | – | | – | – | |
| Krabi | | 20 | 19 | | 1 | – | | 5 | 14 | |
| Krabi | | 4 | 3 | | – | – | | – | – | |
| Phuket | | – | – | | – | – | | 6 | 15 | |
| Phuket | | – | – | | – | – | | 2 | 1 | |
| Phuket | | 8 | 19 | | 7 | 16 | | 30 | 46 | |
| Total specimens | | 100 | 96 | | 18 | 30 | | 50 | 84 | |
Total records from taxonomic literature (Spp. in literature) vs. literature records from June–August (Spp. July–August) vs. our field samples (Spp. in our study). * indicates a new geographic distribution for the species.
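The "Total specimens" row of the table can be cross-checked against the per-row counts. The sketch below assumes that the six values in each row are the ♂ and ♀ counts for the three column groups, in the order literature, July–August, our study, and that a dash means zero; both are assumptions about the table layout, not statements from the source. The name `column_sums` is illustrative.

```python
# Per-row adult counts transcribed from the table, in the assumed order
# (lit. male, lit. female, Jul-Aug male, Jul-Aug female, study male, study female).
# A dash in the table is entered as 0 (assumption).
ROWS = [
    # Chiang Mai
    (0, 4, 0, 0, 0, 0), (8, 5, 0, 0, 3, 0), (3, 3, 0, 0, 0, 0),
    (3, 9, 0, 3, 0, 0), (6, 5, 2, 2, 1, 1), (8, 15, 6, 9, 1, 6),
    (5, 4, 0, 0, 0, 0), (16, 6, 0, 0, 0, 0), (17, 3, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 0),
    # Krabi
    (0, 0, 0, 0, 1, 1), (2, 0, 2, 0, 0, 0), (0, 1, 0, 0, 0, 0),
    (20, 19, 1, 0, 5, 14), (4, 3, 0, 0, 0, 0),
    # Phuket
    (0, 0, 0, 0, 6, 15), (0, 0, 0, 0, 2, 1), (8, 19, 7, 16, 30, 46),
]

# Values of the "Total specimens" row as printed in the table.
REPORTED_TOTALS = (100, 96, 18, 30, 50, 84)


def column_sums(rows):
    """Sum each of the six count columns across all rows."""
    return tuple(sum(column) for column in zip(*rows))


if __name__ == "__main__":
    sums = column_sums(ROWS)
    print("Column sums match reported totals:", sums == REPORTED_TOTALS)
    # Adults collected in the study: male plus female columns of "our study".
    print("Adults in our study:", sums[4] + sums[5])
```

Under these assumptions the last two column sums add up to the 134 adults reported in the abstract.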
Table 2. Teutamus group in GBIF per collection/database, comparing number of occurrences, total specimens, geographical distribution and taxonomic coverage.
Blue shaded squares indicate presence of each genus.
Genera: J Jacaena; K Koppe; O Oedignatha; Se Sesieutes; Sp Sphingius; Su Sudharmia; T Teutamus. Collection names: AM Australian Museum, Australia; CAS California Academy of Sciences, USA; MACN Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Argentina; MCZ Museum of Comparative Zoology, Harvard, USA; MNHN–P Muséum national d'Histoire naturelle-Paris, France; NBC Naturalis Biodiversity Center (formerly RMNH), The Netherlands; NMNS National Museum of Nature and Science, Japan; QM Queensland Museum, Australia; SMF Senckenberg Museum Frankfurt, Germany; SMNK Staatliches Museum für Naturkunde Karlsruhe, Germany; UMZC The University Museum of Zoology, Cambridge, UK; WAM West Australia Museum, Australia; ZMUC Zoological Museum, Natural History Museum, Denmark.
Figure 4. Proportion of occurrences of the Teutamus group in GBIF[65]. Color indicates data source: digitized collection data (brown shades) and data mined from taxonomic literature (blue). Circle: proportion per data source for the whole Teutamus group and each TG genus. Generated in RStudio[28,68]. Bars: detail of proportions and total occurrences for the TG (top), genus Teutamus (middle), and Teutamus politus (bottom). Note the high proportion of data contributed through our mark-up and integration using Plazi’s retrospective workflow. Collection abbreviations explained in Table 2.