Sally O A Westworth, Carl Chalmers, Paul Fergus, Steven N Longmore, Alex K Piel, Serge A Wich.
Abstract
Using machine learning (ML) to automate camera trap (CT) image processing is advantageous for time-sensitive applications. However, little is currently known about the factors influencing such processing. Here, we evaluate the influence of occlusion, distance, vegetation type, size class, height, subject orientation towards the CT, species, time-of-day, colour, and analyst performance on wildlife/human detection and classification in CT images from western Tanzania. Additionally, we compared the detection and classification performance of analyst and ML approaches. We obtained wildlife data from pre-existing CT images and human data from CT experiments with voluntary participants. We evaluated the analyst and ML approaches at both the detection and the classification level. Distance and occlusion, coupled with increased vegetation density, had the most significant effect on detection probability (DP) and correct classification (CC). Overall, the results indicate a significantly higher DP (81.1%) and CC (76.6%) for the analyst approach compared to ML, which detected 41.1% and correctly classified 47.5% of wildlife within CT images. However, both methods presented similar probabilities when detecting humans: 69.4% (ML) and 71.8% (analysts) for daylight CT images, and 17.6% (ML) and 16.2% (analysts) for dusk CT images. Provided that users carefully follow the recommendations given, we expect DP and CC to increase. In turn, the ML approach to CT image processing would be an excellent provision to support time-sensitive threat monitoring for biodiversity conservation.
Keywords: automation; biodiversity conservation; citizen scientists; poacher; wildlife
Year: 2022 PMID: 35891075 PMCID: PMC9319727 DOI: 10.3390/s22145386
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Examples illustrating different CT image scenarios, incorporating the factors measured in this study. Human images include (a) vegetation type (miombo woodland), time-of-day (daylight), distance (<5 m), colour of clothing (green), orientation towards the CT (yes), height (1.9 m), and occlusion (68–100%); (b) vegetation type (miombo woodland), time-of-day (dusk), distance (15–30 m), height (variations), and occlusion (68–100%); (c) vegetation type (riverine forest), time-of-day (daylight), distance (10 m), colour of clothing (green), orientation towards the CT (no), height (1.7 m), and occlusion (68–100%). Wildlife image examples include (d) species (Philantomba monticola), vegetation type (miombo woodland), time-of-day (daylight), distance (0–4.9 m), orientation towards the CT (no), and occlusion (68–100%); (e) species (Crocuta crocuta), vegetation type (miombo woodland), time-of-day (dusk), distance (5–9.9 m), orientation towards the CT (no), and occlusion (34–67%); (f) species (Tragelaphus sylvaticus), vegetation type (riverine forest), time-of-day (daylight), distance (>10 m), orientation towards the CT (no), and occlusion (68–100%).
Figure A1. A visual illustration of the human daylight experimental setup, where participants walked along each set distance marker from 5–30 m; upon reaching the end of the 30-m marker, they returned to the centre of the 30-m marker and walked directly through the middle to the camera trap (CT).
Figure A2. A visual illustration of the human dusk experimental setup, where participants were randomly assigned a set distance marker from 5–30 m to walk across to the end and return to their starting position.
The tested variables within both human and wildlife experiments, showing each variable name, its type of effect on the experimental outcome, its associated classification (dummy data) label, and its variable type.
| Variable | Effect | Classification Labels | Variable Type |
|---|---|---|---|
| Outcome Variable | | | |
| Detection 1 | Target | False-negative = 1/True-positive = 2 | Binary |
| Correct Classification 1 | Target | False-negative = 1/True-positive = 2 | Binary |
| Predictor Variables | | | |
| Occlusion 1 | Fixed | 0–33% = 1, 34–67% = 2, and 68–100% = 3 | Nominal |
| Distance 1 | Fixed | <5, 5, 10, 15, 20, 25, and 30 m/0–4.9 = 1, 5–9.9 = 2, and ≥10 = 3 | Nominal |
| Orientation Towards the CT 1 | Fixed | Yes = 1/No = 2 | Binary |
| Analyst Performance 1 | Fixed | Analyst 1, 2, 3 | Nominal |
| Vegetation Type 2 | Fixed | Miombo Woodland = 1/Riverine Forest = 2 | Binary |
| Colour 2 | Fixed | Blue = 1/Green = 2 | Binary |
| Height 2 | Fixed | Participant height in metres | Continuous |
| Species 3 | Fixed | Species scientific name identified numerically (1–11) | Nominal |
| Size Classes 3 | Fixed | Small = 1, Medium = 2, and Large = 3 | Nominal |
| Hierarchical Variables | | | |
| Experiment Number 2 | Random | Identification number of the experiment | Nominal |
| Tree Tag Number 3 | Random | Identification number of the tree tag location | Nominal |
Notes: 1 both wildlife and human factors; 2 human factors; 3 wildlife factors.
The ML training dataset counts and the ML and analyst testing dataset counts per class for the human and wildlife experiments.
| Class Label | CT Image Count for ML Training Dataset (1000 Threshold Cap) | CT Image Count for ML and Analyst Testing Datasets |
|---|---|---|
| Person | 5196 | 11,781 |
| | 1227 | 1014 |
| | 1642 | 1812 |
| | 1560 | 1443 |
| | 1502 | 2783 |
| | 1194 | 1102 |
| | 1181 | 901 |
| | 1114 | 1176 |
| | 1009 | 1812 |
| | 1008 | 932 |
| | 1001 | 2107 |
| | 1000 | 1281 |
The percentage of (i) true-positive detections and classifications, (ii) true-positive detections with false-negative classifications, and (iii) false-negative detections with no classifications, for the machine learning (ML) (3697, 3340, 9457) and analyst (30,436, 9279, 9281) methods.
| Model | True-Positive Detection and Classification % | True-Positive Detection and False-Negative Classification % | False-Negative Detection and No Classification % |
|---|---|---|---|
| Machine Learning | 22.4 | 20.3 | 57.3 |
| Analyst | 62.1 | 18.9 | 19.0 |
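The percentages above follow directly from the outcome counts given in the caption. A minimal sketch of the calculation (the helper name `outcome_percentages` is ours; note that the published values appear to be rounded so that each row sums to 100, so direct rounding can differ from them by 0.1, e.g. 20.2 vs. 20.3):

```python
def outcome_percentages(counts):
    """Convert raw outcome counts into percentages of their total."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]

# ML outcome counts from the caption: true-positive detection and
# classification, true-positive detection with false-negative
# classification, and false-negative detection with no classification
ml = outcome_percentages([3697, 3340, 9457])

# Analyst outcome counts from the caption
analyst = outcome_percentages([30436, 9279, 9281])
```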
The percentage of true-positive detections and classifications vs. false-negative detections and no classifications for ML daylight (8474 vs. 2590), ML dusk (19,842 vs. 16,343), analyst daylight (18,250 vs. 7172), and analyst dusk (9634 vs. 49,893) for all human models. We found no false-negative classification outcomes for human models.
| Model | True-Positive Detection and Classification % | False-Negative Detection and No Classification % |
|---|---|---|
| ML daylight | 76.5 | 23.4 |
| ML dusk | 54.8 | 45.1 |
| Analyst daylight | 71.7 | 28.2 |
| Analyst dusk | 16.1 | 83.8 |
The multi-class classification confusion matrix for ML model evaluation.
| Actual \ Predicted | Person | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Person | 9101 | 0 | 0 | 0 | 0 | 0 | 0 | 82 | 0 | 0 | 0 | 0 |
| | 0 | 621 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 34 | 58 |
| | 0 | 100 | 307 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 452 |
| | 0 | 0 | 3 | 0 | 498 | 0 | 1 | 0 | 0 | 2 | 59 | 0 |
| | 0 | 0 | 5 | 219 | 3 | 0 | 0 | 0 | 0 | 0 | 41 | 0 |
| | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 108 | 195 | 0 | 0 | 0 |
| | 0 | 0 | 0 | 0 | 4 | 0 | 157 | 0 | 0 | 82 | 0 | 0 |
| | 17 | 0 | 0 | 0 | 0 | 41 | 0 | 803 | 99 | 0 | 0 | 0 |
| | 0 | 2 | 0 | 0 | 0 | 241 | 0 | 51 | 952 | 0 | 0 | 0 |
| | 0 | 0 | 0 | 0 | 2 | 0 | 19 | 0 | 0 | 496 | 0 | 0 |
| | 0 | 0 | 18 | 483 | 387 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| | 0 | 392 | 43 | 0 | 12 | 0 | 4 | 0 | 0 | 2 | 0 | 0 |
The left-hand side represents the actual class within each CT image (which species is within the tested image) and the top row represents the ML model’s predicted class. The diagonal cells give the true-positive classifications; the remaining cells in each row represent the false-negatives and the remaining cells in each column represent the false-positives. True-negative classifications for a given class are all instances that fall outside that class’s row and column.
The primary performance measures of the ML model for each tested class.
| Class | F1 Score | Precision | Recall | FNR | FPR | TNR |
|---|---|---|---|---|---|---|
| Person | 99.5 | 99.8 | 99.1 | 0.8 | 0.2 | 99.7 |
| | 67.9 | 55.7 | 86.8 | 13.1 | 7.7 | 92.2 |
| | 49.6 | 81.4 | 35.7 | 64.3 | 0.4 | 99.5 |
| | N/A | 0 | 0 | 100 | 4.4 | 95.5 |
| | 0.5 | 0.3 | 1.1 | 98.8 | 5.6 | 94.3 |
| | N/A | 0 | 0 | 100 | 1.7 | 98.2 |
| | 73.7 | 85.8 | 64.6 | 35.3 | 0.1 | 99.8 |
| | 80.1 | 76.9 | 83.6 | 16.3 | 1.5 | 98.4 |
| | 76.4 | 76.4 | 76.4 | 23.5 | 1.9 | 98.0 |
| | 90.2 | 85.2 | 95.9 | 4.0 | 0.5 | 99.4 |
| | N/A | 0 | 0 | 100 | 0.8 | 99.1 |
| | N/A | 0 | 0 | 100 | 3.2 | 96.7 |
All numbers are represented as percentages (%). Abbreviations include the following: N/A = not applicable, FNR = false-negative rate, FPR = false-positive rate, and TNR = true-negative rate. The F1 score is the harmonic mean of precision and recall; precision is the positive predictive value for determining the correct class and recall is how frequently the ML model retrieves the correct class (also known as the true-positive rate).
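As a consistency check, the Person row of this table can be recomputed from the confusion matrix above (TP = the diagonal cell, FN = the remainder of the Person row, FP = the remainder of the Person column). A minimal sketch, with a helper name of our own:

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 (the harmonic mean of the two), as percentages."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return [round(100 * v, 1) for v in (precision, recall, f1)]

# Person class, read off the confusion matrix
tp, fn, fp = 9101, 82, 17
person = prf(tp, fp, fn)  # [99.8, 99.1, 99.5] -- matches the Person row above
```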
Figure 2. The best-fit models for the wildlife experiments, presented as forest plots. All forest plots are composed of effect sizes and error bars containing 95% confidence intervals. Moreover, each factor class is compared to a baseline class as follows: distance, 0–4.9 m; occlusion, 0–33%; species, Papio cynocephalus; orientation towards the CT, no; analyst performance, Analyst 1; and time-of-day, daylight. The plots are illustrated as follows: (a) analyst detection probability (DP) model; (b) machine learning (ML) DP model; (c) ML correct classification (CC) model; (d) analyst CC model.
Figure 3. The best-fit models for the human experiments, presented as forest plots. All plots are composed of effect estimates and error bars containing 95% confidence intervals. Additionally, all factor categories are compared with a reference category as follows: distance (daylight experiments), <5 m; distance (dusk experiments), 5 m; occlusion, 0–33%; vegetation type, miombo woodland; colour, blue; and orientation towards the CT, no. The plots are illustrated as follows: (a) analyst-daylight-DP model; (b) ML-daylight-DP model; (c) analyst-dusk-DP model; (d) ML-dusk-DP model; (e) ML-daylight-CC model; (f) ML-dusk-CC model.
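The effect sizes and 95% confidence intervals shown in these forest plots are the standard output of logistic-type models with dummy-coded factor levels compared against a reference category (the study's best-fit models also include random effects, which the sketch below omits). As an illustration only, with hypothetical counts not taken from the study, a single-factor odds ratio with a Wald 95% CI could be computed as:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table with a Wald 95% confidence interval.
    a/b = detected/missed in the comparison category;
    c/d = detected/missed in the reference category."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts: >=10 m distance class, 90 detected / 110 missed,
# vs. the 0-4.9 m reference class, 180 detected / 20 missed
or_, lo, hi = odds_ratio_ci(90, 110, 180, 20)
# OR < 1 with a CI excluding 1: detection odds drop sharply with distance
```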
The tested hypothesis summary for all factors.
| Alternative Hypothesis | Outcome |
|---|---|
| Targets orientating towards the CT would positively influence target DP in CT images. | Accepted |
| Targets orientating towards the CT would positively influence target CC in CT images. | Rejected |
| Partially occluded species-specific characteristics would negatively affect CC of similar species. | Accepted |
| The time-of-day factor (dusk) would negatively influence wildlife DP and CC. | Accepted |
| The time-of-day factor (dusk) would negatively influence human DP and CC. | Rejected |
| Colour contrast (green), in comparison to the background, would negatively impact human DP. | Accepted |
| Increasing distance and occlusion would significantly decrease target DP and CC for all wildlife models. | Accepted |
| Increasing distance and occlusion would significantly decrease target DP and CC for all human models. | Rejected |
| Dense vegetation would contribute to the significant decrease of one or more models. | Accepted |
| ML methods would perform at a statistically significantly higher rate than analyst methods for at least one model. | Accepted |
| There would be a significant positive difference in analyst performance on target DP and CC. | Accepted |
| Decreasing size class would decrease DP and CC for both wildlife and humans. | Rejected |