Kathleen M Pryer, Carlo Tomasi, Xiaohan Wang, Emily K Meineke, Michael D Windham.
Abstract
PREMISE: Equisetum is a distinctive vascular plant genus with 15 extant species worldwide. Species identification is complicated by morphological plasticity and frequent hybridization events, leading to a disproportionately high number of misidentified specimens. These may be correctly identified by applying appropriate computer vision tools.
Keywords: Equisetales; deep learning; digitized herbarium specimens; ferns; horsetails; machine learning
Year: 2020 PMID: 32626613 PMCID: PMC7328651 DOI: 10.1002/aps3.11372
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Figure 1. Drawings of the two sexually reproducing species of Equisetum included in this study. (A) Equisetum hyemale. (B) Equisetum laevigatum. Modified with permission from Hitchcock et al. (1969).
Table 1. The number of digitized images studied for each Equisetum species (see Appendix S1), and the number of instances that were manually annotated for each of the four structural categories investigated.
| Species | No. of images studied | Strobilus | Normal stem node | Normal stem internode | Injured stem node |
|---|---|---|---|---|---|
| | 36 | 77 | 791 | 178 | 2 |
| | 36 | 83 | 479 | 95 | 33 |
| | 36 | 81 | 827 | 107 | 0 |
Figure 2. Examples of human versus machine‐applied annotations on digitized images of Equisetum. (A) Equisetum hyemale (http://sernecportal.org/portal/collections/individual/index.php?occid=13382583). (B) Equisetum ×ferrissii (http://sernecportal.org/portal/collections/individual/index.php?occid=11039870). (C) Equisetum laevigatum (http://sernecportal.org/portal/collections/individual/index.php?occid=207187). Green dashed boxes are human annotations. Solid boxes are detection results: a red box denotes a hyemale‐type (H) node; a blue box denotes a laevigatum‐type (L) node. (A) The detector missed four nodes and classified all the others correctly as H nodes, with high confidence scores (6 or higher). (B) The detector missed five of the 21 nodes found by the human annotator and detected one spurious node (on the bottom kink of the stem on the left). It also found one node that the human annotator had not flagged (close to the strobilus on the stem on the right); visual inspection shows that this is a genuine node. Of the 18 nodes detected (genuine or spurious), the classifier determined six to be H nodes (red) and 12 to be L nodes (blue), all with high confidence scores (6 or higher). (C) Although the human annotator marked only 13 stem nodes, the detector found many more and classified all of them as L nodes, with high confidence scores (7 or higher). Visual inspection shows that these additional detections, which would be counted as “false positives” in a standard evaluation (because they do not match human annotations), are all genuine nodes.
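The bookkeeping behind "missed", "spurious", and "unannotated" detections can be sketched in Python. The caption does not state how detections were matched to human annotations, so this sketch assumes a common convention: a detection matches an annotation when their intersection over union (IoU) is at least 0.5. The function names, the threshold, and the box coordinates are all illustrative assumptions, not taken from the study.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(annotations, detections, threshold=0.5):
    """Greedily pair each detection with its best still-unmatched annotation.

    Returns (matched pairs, indices of missed annotations,
    indices of detections with no matching annotation).
    The 0.5 threshold is an assumed convention, not the paper's setting.
    """
    unmatched = list(range(len(annotations)))
    matched, unannotated = [], []
    for d_idx, det in enumerate(detections):
        best = max(unmatched, key=lambda i: iou(annotations[i], det), default=None)
        if best is not None and iou(annotations[best], det) >= threshold:
            matched.append((best, d_idx))
            unmatched.remove(best)
        else:
            unannotated.append(d_idx)
    return matched, unmatched, unannotated


# Hypothetical boxes, for illustration only.
annotations = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60)]
matched, missed, unannotated = match_detections(annotations, detections)
print(matched, missed, unannotated)  # [(0, 0)] [1] [1]
```

A detection that lands in the `unannotated` list is not necessarily an error: as panels (B) and (C) show, it may be a genuine node that the annotator did not flag, which only visual inspection can settle.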
Table 2. The number of true and detected nodes of type H (hyemale) and L (laevigatum) in Equisetum hyemale and E. laevigatum test images. Of the 272 true nodes detected in 30 test images, only six are misclassified, resulting in a 98% cross‐type accuracy.
| True | Detected: H node | Detected: L node |
|---|---|---|
| H node | 145 | 4 |
| L node | 2 | 121 |
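The accuracy quoted in the caption follows from the diagonal of the confusion matrix above. A minimal Python sketch of that arithmetic, using the four counts from the table (the variable names are illustrative):

```python
# Counts from the confusion matrix above: outer keys are true node types,
# inner keys are detected node types.
confusion = {
    "H": {"H": 145, "L": 4},
    "L": {"H": 2, "L": 121},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[t][t] for t in confusion)  # diagonal entries
misclassified = total - correct
accuracy = correct / total

print(total, misclassified, f"{accuracy:.1%}")  # 272 6 97.8%
```

The unrounded value is 266/272, which the caption reports as 98%.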
Table 3. Confusion matrix analogous to that shown in Table 2 when Equisetum ×ferrissii nodes are considered a class of their own. Nodes in E. laevigatum images are classified incorrectly only six times out of 96, while nodes of the hybrid species E. ×ferrissii and of its parent species E. hyemale are confused with each other in 36 out of 229 cases.
| | Detected | | |
|---|---|---|---|
| | 85 | 5 | 22 |
| | 0 | 90 | 0 |
| | 14 | 1 | 108 |
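The totals quoted in the caption can be reproduced from the nine counts above. The sketch below assumes that the middle row and column correspond to the laevigatum class and the outer two to E. hyemale and E. ×ferrissii; that reading is inferred from the caption's "out of 96" and "out of 229" totals, since the class labels are not present in the table as shown.

```python
# Counts copied from the 3 x 3 matrix above, in the order they appear.
# Assumption: index 1 is the laevigatum class; indices 0 and 2 are the
# hyemale and hybrid classes (which is which does not matter here).
matrix = [
    [85, 5, 22],
    [0, 90, 0],
    [14, 1, 108],
]
MIDDLE = 1
OUTER = (0, 2)

# Middle column: every node assigned to the middle class.
middle_total = sum(row[MIDDLE] for row in matrix)
middle_errors = middle_total - matrix[MIDDLE][MIDDLE]

# Outer 2 x 2 block: nodes whose true and detected classes are both outer classes.
outer_total = sum(matrix[i][j] for i in OUTER for j in OUTER)
outer_confused = matrix[0][2] + matrix[2][0]

print(middle_errors, middle_total)   # 6 96
print(outer_confused, outer_total)   # 36 229
```

Under this reading, the two outer classes are confused with each other in roughly 16% of cases (36/229), compared with about 6% (6/96) for the middle class.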
Table 4. True image classes (rows) versus classes predicted by a 5‐nearest‐neighbor classifier (columns) based on the average scores aH and aL for the top 10 H nodes and the top 10 L nodes detected in each test image. Out of 30 test images, 27 are classified correctly, for a classification accuracy of 90%.
| | Predicted | | |
|---|---|---|---|
| | 9 | 0 | 1 |
| | 0 | 10 | 0 |
| | 2 | 0 | 8 |
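The image-level procedure described in the caption (average the top 10 H scores and the top 10 L scores, then classify the resulting pair with five nearest neighbors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the handling of images with fewer than 10 detections, the function names, and all feature values are assumptions.

```python
from collections import Counter
from math import dist


def top_k_mean(scores, k=10):
    """Average of the k highest scores (fewer if fewer detections exist).

    Returning 0.0 for an image with no detections is an assumption.
    """
    best = sorted(scores, reverse=True)[:k]
    return sum(best) / len(best) if best else 0.0


def image_features(h_scores, l_scores, k=10):
    """Map one image to the feature pair (aH, aL)."""
    return (top_k_mean(h_scores, k), top_k_mean(l_scores, k))


def knn_predict(query, training, k=5):
    """Majority label among the k training points nearest to the query."""
    nearest = sorted(training, key=lambda item: dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training features, NOT data from the study.
training = [
    ((8.0, 1.0), "hyemale"), ((7.5, 1.5), "hyemale"), ((8.2, 0.8), "hyemale"),
    ((1.0, 8.0), "laevigatum"), ((1.2, 7.6), "laevigatum"), ((0.9, 8.3), "laevigatum"),
    ((5.0, 5.5), "ferrissii"), ((5.5, 5.0), "ferrissii"), ((4.8, 5.2), "ferrissii"),
]

# Hypothetical detection scores for one test image.
query = image_features(h_scores=[9, 8, 8, 7], l_scores=[2, 1])
print(query, knn_predict(query, training))  # (8.0, 1.5) hyemale
```

Averaging only the top-scoring nodes keeps weak or spurious detections from dominating the image-level features, which is one plausible motivation for the top-10 design described in the caption.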
Table 5. True image classes versus classes predicted by a decision tree based on the average scores aH, aL, and aF for the top 10 H nodes, top 10 L nodes, and top 10 F nodes detected in each test image. Out of 30 test images, 27 are classified correctly, for a classification accuracy of 90%. This accuracy is the same as for the two‐node experiment in Table 4.
| | Predicted | | |
|---|---|---|---|
| | 8 | 1 | 1 |
| | 0 | 10 | 0 |
| | 1 | 0 | 9 |