T J Gooliaff, Karen E Hodges.
Abstract
Camera trapping and solicitation of wildlife images through citizen science have become common tools in ecological research. Such studies collect many wildlife images for which correct species classification is crucial; even low misclassification rates can result in erroneous estimation of the geographic range or habitat use of a species, potentially hindering conservation or management efforts. However, some species are difficult to tell apart, making species classification challenging, but the literature on classification agreement rates among experts remains sparse. Here, we measure agreement among experts in distinguishing between images of two similar congeneric species, bobcats (Lynx rufus) and Canada lynx (Lynx canadensis). We asked experts to classify the species in selected images to test whether the season, background habitat, time of day, and the visible features of each animal (e.g., face, legs, tail) affected agreement among experts about the species in each image. Overall, experts had moderate agreement (Fleiss' kappa = 0.64), but experts had varying levels of agreement depending on these image characteristics. Most images (71%) had ≥1 expert classification of "unknown," and many images (39%) had some experts classify the image as "bobcat" while others classified it as "lynx." Further, experts were inconsistent even with themselves, changing their classifications of numerous images when they were asked to reclassify the same images months later. These results suggest that classification of images by a single expert is unreliable for similar-looking species. Most of the images did obtain a clear majority classification from the experts, although we emphasize that even majority classifications may be incorrect. We recommend that researchers using wildlife images consult multiple species experts to increase confidence in their image classifications of similar sympatric species. Still, when the presence of a species with similar sympatrics must be conclusive, physical or genetic evidence should be required.
Keywords: Canada lynx; Lynx canadensis; Lynx rufus; bobcat; expert identification; image classification
Year: 2018 PMID: 30519423 PMCID: PMC6262731 DOI: 10.1002/ece3.4567
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Images of bobcats (white circles; n = 805) and lynx (black circles; n = 807) taken during 2008–2017. These images were solicited from the public across British Columbia, and here we map points based on our own classifications of the images. We also show our boundary between northern and southern BC (dotted line).
Characteristics of the 15 image categories within the six trials
| Trial | Season | Background habitat | Time | Visible features | Location provided |
|---|---|---|---|---|---|
| 1) Season | |||||
| Summer | Summer | Forest | Day | 2 of: face, legs, or tail | No |
| Winter | Winter | Forest | Day | 2 of: face, legs, or tail | No |
| 2) Background habitat | |||||
| Forest | Summer | Forest | Day | 2 of: face, legs, or tail | No |
| Grassland | Summer | Grassland | Day | 2 of: face, legs, or tail | No |
| Developed | Summer | Developed | Day | 2 of: face, legs, or tail | No |
| 3) Visible features | |||||
| Full body | Winter | Forest | Day | Face, legs and tail | No |
| Face only | Winter | Forest | Day | Face only | No |
| Face and legs | Winter | Forest | Day | Face and legs only | No |
| Legs and tail | Winter | Forest | Day | Legs and tail only | No |
| 4) Time | |||||
| Day | Winter | Forest | Day | 2 of: face, legs, or tail | No |
| Night | Winter | Forest | Night | 2 of: face, legs, or tail | No |
| 5) Location | |||||
| a) Location provided | |||||
| Northern BC | Summer | Forest | Day | 2 of: face, legs, or tail | Yes |
| Southern BC | Summer | Forest | Day | 2 of: face, legs, or tail | Yes |
| b) Location not provided | |||||
| Northern BC | Summer | Forest | Day | 2 of: face, legs, or tail | No |
| Southern BC | Summer | Forest | Day | 2 of: face, legs, or tail | No |
| 6) Consistency | |||||
Summer: images taken between April and September, and showing no snow.
Winter: images taken between October and March, and showing snow.
Developed: images showing human infrastructure, such as houses, barns, or patios.
Legs and tail: one image was mistakenly included twice in this category; responses for the second time it appeared were removed from all analyses.
Night: black and white images taken at night.
Consistency: this trial contained the same images as the first trial, but they were randomly reordered.
Figure 2. Distribution of (a) the number of experts that classified individual images as the majority classification and (b) the proportion of agreement scores among all 27 experts for individual images in all categories excluding the 40 images for which we provided locations (n = 259 images). With three classification options, the proportion of agreement had an upper bound of 1.00, indicating perfect agreement, and a lower bound of 0.31, indicating perfect disagreement.
Examples of images with poor agreement among experts in their classifications (n = 27 experts). Images were cropped from original versions; thus, they do not show all of the background features observed by the experts that classified them. Images provided by: (A) Paul Morgan, (B) Amber Piva, (C) Jacqueline Brown, (D) Myrna Blake, (E) Bert Gregersen, (F) Scott MacDonald, (G) Donald Hendricks, and (H) John E. Marriott
The proportion of agreement had an upper bound of 1.00, indicating perfect agreement, and had a lower bound of 0.31, indicating perfect disagreement.
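The bounds quoted above can be reproduced if the proportion of agreement for an image is taken to be the share of expert pairs that chose the same classification, which is the per-item term used in Fleiss' kappa. The text here does not state the formula, so this definition is an assumption, and the function name is purely illustrative. A minimal Python sketch:

```python
def proportion_of_agreement(counts):
    """Share of rater pairs that agree on one image.

    counts: number of raters choosing each option, e.g. [bobcat, lynx, unknown].
    Assumed definition (not stated in the source): the per-item agreement
    term P_i from Fleiss' kappa.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two classifications")
    # Each option chosen by c raters contributes c * (c - 1) ordered agreeing pairs.
    agreeing_pairs = sum(c * (c - 1) for c in counts)
    return agreeing_pairs / (n * (n - 1))


# 27 experts, three options: unanimous vote versus a perfectly even split.
unanimous = proportion_of_agreement([27, 0, 0])
even_split = proportion_of_agreement([9, 9, 9])
print(round(unanimous, 2), round(even_split, 2))
```

Under this assumption, 27 experts split evenly across the three options give 216/702 ≈ 0.31, which matches the stated lower bound.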
Agreement among all experts (n = 27) in their classifications of images within each category of images. All values of Fleiss’ kappa had a p‐value <0.001
| Category | No. of images | No. of images with a unanimous classification | Fleiss’ kappa (95% CI) |
|---|---|---|---|
| Season | |||
| Summer | 20 | 1 | 0.36 (0.21–0.52) |
| Winter | 20 | 6 | 0.77 (0.64–0.93) |
| Background habitat | |||
| Forest | 20 | 3 | 0.47 (0.34–0.63) |
| Grassland | 20 | 6 | 0.64 (0.51–0.78) |
| Developed | 20 | 10 | 0.66 (0.46–0.89) |
| Visible features | |||
| Face only | 20 | 6 | 0.66 (0.55–0.79) |
| Legs and tail | 19 | 3 | 0.66 (0.55–0.79) |
| Full body | 20 | 6 | 0.77 (0.62–0.98) |
| Face and legs | 20 | 8 | 0.81 (0.73–0.92) |
| Time | |||
| Day | 20 | 2 | 0.58 (0.41–0.79) |
| Night | 20 | 4 | 0.64 (0.53–0.78) |
| Combinations (all daytime) | |||
| Summer, forest, legs and tail | 15 | 0 | 0.34 (0.17–0.56) |
| Summer, developed, full body | 10 | 6 | 0.40 (0.18–0.77) |
| Summer, forest, full body | 39 | 6 | 0.47 (0.35–0.60) |
| Summer, forest, face and legs | 24 | 3 | 0.51 (0.37–0.69) |
| Summer, grassland, full body | 12 | 4 | 0.58 (0.43–0.79) |
| Winter, forest, legs and tail | 26 | 3 | 0.61 (0.50–0.75) |
| Winter, forest, face only | 20 | 6 | 0.66 (0.55–0.80) |
| Winter, forest, full body | 36 | 8 | 0.74 (0.65–0.85) |
| Winter, forest, face and legs | 35 | 14 | 0.80 (0.72–0.88) |
| Location provided | |||
| Northern BC | 20 | 3 | 0.21 (0.08–0.38) |
| Southern BC | 20 | 4 | 0.62 (0.45–0.83) |
| Total | 40 | 7 | 0.50 (0.35–0.68) |
| Location not provided | |||
| Northern BC | 20 | 2 | 0.04 (0.01–0.07) |
| Southern BC | 20 | 1 | 0.55 (0.44–0.69) |
| Total | 40 | 3 | 0.44 (0.32–0.57) |
Fleiss’ kappa measures agreement among a group of classifiers; a value of 1 indicates perfect agreement, whereas a value of 0 indicates agreement that would occur by chance.
Images were pooled together from all categories excluding the 40 images for which we provided locations. Only combinations with ≥10 images are shown.
Figure 3. Agreement among all experts (n = 27) in their classifications of images within each category of images. All values of Fleiss’ kappa had a p‐value <0.001. Bars represent 95% confidence intervals. Fleiss’ kappa measures agreement among a group of classifiers; a value of 1 indicates perfect agreement, whereas a value of 0 indicates agreement that would occur by chance.
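For readers who want to see how a Fleiss' kappa value of this kind is obtained, the following is a minimal sketch of the standard formula in plain Python. It is not the authors' code, and the software they used is not stated here. The function name and the toy vote counts are hypothetical and are not data from the study; the confidence intervals reported in the table are not computed.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a table of per-image vote counts.

    table: one row per image; each row lists how many raters chose each
    option (e.g. [bobcat, lynx, unknown]). Every row must sum to the
    same number of raters.
    """
    n_items = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every image needs the same number of raters")
    n_options = len(table[0])

    # Observed agreement: mean share of agreeing rater pairs per image.
    p_bar = sum(
        sum(c * (c - 1) for c in row) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_items

    # Chance agreement: sum of squared overall option proportions.
    shares = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_options)
    ]
    p_chance = sum(s * s for s in shares)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all votes fall in one option")

    return (p_bar - p_chance) / (1.0 - p_chance)


# Hypothetical counts for four images rated by 27 experts (not study data).
toy_votes = [[27, 0, 0], [0, 27, 0], [14, 10, 3], [9, 9, 9]]
print(round(fleiss_kappa(toy_votes), 2))
```

A value of 1 arises only when every image is classified unanimously and more than one option is used across images; values near 0 mean the observed agreement is close to what the overall option frequencies would produce by chance.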
Figure 4. Examples of how the visible features of an animal and the location of an image can affect expert classification. (a) The top two images are of the same animal but show slightly varying body parts and had different majority classifications by the experts; the same occurred for the bottom two images. We show the number of experts that classified each image as “bobcat,” “lynx,” and “unknown.” (b) Both images are of the same animal taken near Prince George, British Columbia, and have the same image characteristics. The image on the left was not included in our experiment but had a 4:4 split vote between bobcat and lynx among local biologists who were asked to classify the image. We included the image on the right in our experiment without providing its location; 26 experts classified the image as “bobcat,” and one expert classified the image as “unknown.” Images provided by (from top to bottom row): BC Parks, Emre Giffin, James Gagnon.
Figure 5. Mean probability that the majority classification of a randomly selected subset of experts matched the majority classification of all 27 experts, calculated across all images excluding the 40 images for which we provided locations (n = 259 images). Bars represent 95% confidence intervals. Probabilities are lower for even numbers of experts because of the likelihood of drawing a split vote, which is not possible for odd numbers of experts.
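The dip for even-sized subsets can be illustrated with an exact calculation for a single image. The sketch below is written under assumptions that are not confirmed by the text: "majority" is read as the single most frequent classification, a tied subset counts as a mismatch, and subsets are drawn without replacement. The authors may instead have used random resampling. The function names and vote counts are hypothetical.

```python
from math import comb


def match_probability(counts, k):
    """Probability that k randomly chosen raters reproduce the full panel's
    top classification for one image.

    counts: full-panel votes for three options, e.g. [bobcat, lynx, unknown].
    Assumption: a subset matches only if its top option is unique and equals
    the full panel's top option; tied subsets count as mismatches.
    """
    a, b, c = counts
    n = a + b + c
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of raters")
    top = max(counts)
    if list(counts).count(top) != 1:
        raise ValueError("full panel has no unique top classification")
    winner = list(counts).index(top)

    total = comb(n, k)
    prob = 0.0
    # Enumerate every possible split (i, j, l) of the k sampled votes.
    for i in range(min(a, k) + 1):
        for j in range(min(b, k - i) + 1):
            l = k - i - j
            if l > c:
                continue
            sub = [i, j, l]
            sub_top = max(sub)
            if sub.count(sub_top) == 1 and sub.index(sub_top) == winner:
                prob += comb(a, i) * comb(b, j) * comb(c, l) / total
    return prob


def mean_match_probability(images, k):
    """Average the per-image probability across a list of vote counts."""
    return sum(match_probability(counts, k) for counts in images) / len(images)


# Hypothetical image: 18 of 27 experts chose one species, 9 the other.
for k in (1, 2, 3):
    print(k, round(match_probability([18, 9, 0], k), 3))
```

In this hypothetical case the probability for two experts is lower than for one or three, because a two-expert subset can split 1:1 and then has no single top classification.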