Alekss Vecvanags1, Kadir Aktas2, Ilja Pavlovs2, Egils Avots2,3, Jevgenijs Filipovs1, Agris Brauns1, Gundega Done4, Dainis Jakovels1, Gholamreza Anbarjafari1,2,5,6.
Abstract
Changes in ungulate population density in the wild affect both wildlife and human society. In order to control ungulate population movement, monitoring systems such as camera trap networks have been implemented in a non-invasive setup. However, such systems produce a large number of images, which makes manual detection of the animals very resource-consuming. In this paper, we present a new dataset of wild ungulates collected in Latvia. Moreover, we demonstrate two methods, which use RetinaNet and Faster R-CNN as backbones, respectively, to detect the animals in the images. We discuss the optimization of training and the impact of data augmentation on performance. Finally, we show the results of the aforementioned tuned networks on real-world data collected in Latvia.
Keywords: Faster R-CNN; RetinaNet; animal detection; camera traps; ungulates
Year: 2022 PMID: 35327863 PMCID: PMC8947003 DOI: 10.3390/e24030353
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Faster R-CNN predictions before non-maximum suppression (NMS). Faster R-CNN produces redundant, overlapping bounding boxes and bounding boxes with low confidence scores. The orange rectangles show the model predictions and the green rectangle shows the ground-truth bounding box.
Figure 2. Faster R-CNN predictions after NMS is applied with threshold filtering and the different-species overlapping criterion. The reflection in the image is interpreted as part of the animal.
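The post-processing described in the captions of Figures 1 and 2 can be illustrated with a minimal sketch of confidence filtering followed by greedy NMS. The function names, the `(x1, y1, x2, y2)` box layout, and both threshold values are assumptions made for illustration and are not taken from the paper. The sketch suppresses overlapping boxes regardless of their class label, which is only one possible reading of the different-species overlapping criterion.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        score_thr: float = 0.5, iou_thr: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    Both thresholds are hypothetical defaults, not values from the paper.
    """
    # Step 1: drop low-confidence predictions.
    candidates = [i for i, s in enumerate(scores) if s >= score_thr]
    # Step 2: visit the remaining boxes in order of decreasing score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in candidates:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.2]
kept = nms(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of 81/119 and is suppressed, while the fourth is removed by the confidence filter before NMS runs.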
Test dataset demography.
| Species or Group Name | Scientific Name | Number of Annotations |
|---|---|---|
| Deer | Cervidae | 516 |
| Wild boar | Sus scrofa | 526 |
| Other species | | 86 |
| Total count | | 1128 |
Figure 3. Test dataset samples.
Training dataset demography.
| Species or Group Name | Scientific Name | Number of Annotations | Number of Augmented Samples | Total Number of Annotations |
|---|---|---|---|---|
| Deer | Cervidae | 6970 | 0 | 6970 |
| Wild boar | Sus scrofa | 2642 | 4328 | 6970 |
| Total count | | 9612 | 4328 | 13,940 |
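The augmentation counts in the training table are consistent with balancing the two classes: the wild boar class receives exactly as many augmented samples as it needs to match the deer class. The following sketch reproduces that arithmetic. The helper name `augmentation_needed` is hypothetical, and balancing to the largest class is an assumption inferred from the table rather than a procedure stated in this section.

```python
from typing import Dict


def augmentation_needed(counts: Dict[str, int]) -> Dict[str, int]:
    """Number of extra samples per class needed to match the largest class."""
    target = max(counts.values())
    return {name: target - n for name, n in counts.items()}


# Annotation counts taken from the training dataset table.
counts = {"deer": 6970, "wild boar": 2642}

extra = augmentation_needed(counts)
totals = {name: counts[name] + extra[name] for name in counts}
grand_total = sum(totals.values())
```

With the table's counts, `extra` assigns no augmentation to deer and 4328 samples to wild boar, giving 6970 annotations per class and 13,940 in total.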
Figure 4. Example of a processed camera trap image with the ground-truth bounding box visualized.
Experiment 1: mAP evaluation.
| Model | Metrics | Number of Iterations | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 3485 | 6970 | 10,455 | 13,940 | 17,425 | 20,910 | 24,395 | 27,880 | 31,365 | 34,850 |
| Faster R-CNN | mAP@0.5:0.05:0.95 | 0.1832 | | | | | | | | | |
| | mAP@0.5:0.05:0.95 “deer” | 0.1420 | 0.2584 | 0.2288 | 0.1062 | 0.1164 | 0.2336 | 0.1684 | 0.1877 | 0.1539 | 0.1576 |
| | mAP@0.5:0.05:0.95 “boar” | 0.2244 | 0.2810 | 0.2187 | 0.2764 | 0.2532 | 0.2900 | 0.3417 | 0.3021 | 0.2942 | 0.3589 |
| | mAP@0.5 | 0.3229 | 0.4561 | 0.3934 | 0.3305 | 0.3148 | 0.4562 | 0.4073 | 0.4065 | 0.3776 | 0.4204 |
| | mAP@0.5 “deer” | | | | | | | | | | |
| | mAP@0.5 “boar” | | | | | | | | | | |
| | mAP@0.75 | 0.1932 | 0.2860 | 0.2229 | 0.1926 | 0.1959 | 0.2855 | 0.2758 | 0.2571 | 0.2488 | 0.2756 |
| | mAP@0.75 “deer” | | | | | | | | | | |
| | mAP@0.75 “boar” | 0.2496 | 0.3048 | 0.2235 | 0.2881 | 0.2743 | 0.3492 | 0.3857 | 0.3260 | 0.3521 | 0.3976 |
| RetinaNet | mAP@0.5:0.05:0.95 | | | | | | | | | | |
| | mAP@0.5:0.05:0.95 “deer” | 0.1287 | 0.1884 | 0.1715 | 0.1791 | 0.1827 | 0.1757 | 0.2053 | 0.2098 | 0.1551 | 0.1844 |
| | mAP@0.5:0.05:0.95 “boar” | 0.3029 | 0.2148 | 0.3111 | 0.2301 | 0.2865 | 0.3231 | 0.3266 | 0.2630 | 0.2853 | 0.2540 |
| | mAP@0.5 | 0.3740 | 0.3725 | 0.4133 | 0.3574 | 0.4134 | 0.4198 | 0.4364 | 0.4173 | 0.3738 | 0.3922 |
| | mAP@0.5 “deer” | | | | | | | | | | |
| | mAP@0.5 “boar” | | | | | | | | | | |
| | mAP@0.75 | 0.1996 | 0.1909 | 0.2483 | 0.2179 | 0.2666 | 0.2678 | 0.2890 | 0.2421 | 0.2341 | 0.2236 |
| | mAP@0.75 “deer” | | | | | | | | | | |
| | mAP@0.75 “boar” | 0.2963 | 0.2345 | 0.3467 | 0.2716 | 0.3487 | 0.3724 | 0.3628 | 0.2913 | 0.3240 | 0.2915 |
Figure 5. RetinaNet mAP@0.5:0.05:0.95 for “boar” and “deer” classes.
Figure 6. Faster R-CNN mAP@0.5:0.05:0.95 for “boar” and “deer” classes.
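The notation mAP@0.5:0.05:0.95 is conventionally read as the average precision averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05, that is, over ten thresholds. The sketch below builds that threshold grid and averages per-threshold values. The function names are hypothetical, and the per-threshold AP values are invented placeholders used only to exercise the code; they are not results from the paper.

```python
from typing import Dict, List


def iou_thresholds(start: float = 0.5, stop: float = 0.95,
                   step: float = 0.05) -> List[float]:
    """IoU thresholds start, start + step, ..., stop (inclusive)."""
    n = round((stop - start) / step) + 1
    # Round to two decimals to avoid floating-point drift in the grid.
    return [round(start + i * step, 2) for i in range(n)]


def mean_over_thresholds(ap_by_threshold: Dict[float, float]) -> float:
    """Average the AP values obtained at each IoU threshold."""
    return sum(ap_by_threshold.values()) / len(ap_by_threshold)


thresholds = iou_thresholds()

# Placeholder AP values (NOT from the paper): AP typically decreases as the
# IoU threshold becomes stricter, so a decreasing sequence is used here.
placeholder_ap = {t: round(0.5 - 0.05 * i, 2) for i, t in enumerate(thresholds)}

averaged = mean_over_thresholds(placeholder_ap)
```

Under this reading, mAP@0.5 and mAP@0.75 are the single-threshold special cases at the loose and strict ends of the same grid.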
Experiment 2: mAP evaluation.
| Metrics | Model | | |
|---|---|---|---|
| | RetinaNet | RetinaNet | RetinaNet Pre-Trained |
| mAP@0.5:0.05:0.95 | 0.2158 | 0.1695 | 0.2290 |
| mAP@0.5:0.05:0.95 “deer” | 0.1287 | 0.1492 | 0.1953 |
| mAP@0.5:0.05:0.95 “boar” | 0.3029 | 0.1897 | 0.2626 |
| mAP@0.5 | 0.3740 | 0.2989 | 0.4029 |
| mAP@0.5 “deer” | 0.2727 | 0.2900 | 0.3758 |
| mAP@0.5 “boar” | 0.4752 | 0.3078 | 0.4299 |
| mAP@0.75 | 0.1996 | 0.1688 | 0.2265 |
| mAP@0.75 “deer” | 0.1028 | 0.1441 | 0.1714 |
| mAP@0.75 “boar” | 0.2963 | 0.1935 | 0.2815 |
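The values in the Experiment 2 table are consistent with each overall mAP being the unweighted mean of the “deer” and “boar” values listed beneath it, although this section does not state that explicitly. The following sketch checks the relationship for the first RetinaNet column. The helper name `mean_ap` and the tolerance of 1e-4, chosen to absorb rounding to four decimals, are assumptions.

```python
from typing import Dict, Tuple


def mean_ap(per_class: Tuple[float, ...]) -> float:
    """Unweighted mean of per-class average precision values."""
    return sum(per_class) / len(per_class)


# First RetinaNet column of the Experiment 2 table:
# metric -> (reported overall value, "deer" value, "boar" value)
first_column: Dict[str, Tuple[float, float, float]] = {
    "mAP@0.5:0.05:0.95": (0.2158, 0.1287, 0.3029),
    "mAP@0.5": (0.3740, 0.2727, 0.4752),
    "mAP@0.75": (0.1996, 0.1028, 0.2963),
}

# Absolute difference between the reported overall value and the class mean.
deviation = {
    metric: abs(overall - mean_ap((deer, boar)))
    for metric, (overall, deer, boar) in first_column.items()
}

consistent = all(d <= 1e-4 for d in deviation.values())
```

The same check can be applied to the other columns by swapping in their values.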
Figure 7. Prediction examples.
Comparison with the state of the art.
| Metrics | Model | | | |
|---|---|---|---|---|
| | Ours with | Ours with | YOLOv4 [ | SSD [ |
| mAP@0.5:0.05:0.95 | 0.2659 | 0.2697 | 0.2295 | 0.2084 |
| mAP@0.5 | 0.4364 | 0.4562 | 0.4010 | 0.3897 |
| mAP@0.75 | 0.2890 | 0.2860 | 0.2545 | 0.2410 |
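The margins in the comparison table can be read off with a short calculation. The sketch below stores the reported values and computes, for each metric, the best-scoring column and the absolute gain over a baseline. Because the column headers of the two proposed models are truncated in this record, they are referred to by the placeholder keys `ours_a` and `ours_b` (first and second column, respectively); these keys and the function names are hypothetical.

```python
from typing import Dict

# Values copied from the comparison table; "ours_a" and "ours_b" are
# placeholder keys for the first and second "Ours with" columns.
reported: Dict[str, Dict[str, float]] = {
    "ours_a": {"mAP@0.5:0.05:0.95": 0.2659, "mAP@0.5": 0.4364, "mAP@0.75": 0.2890},
    "ours_b": {"mAP@0.5:0.05:0.95": 0.2697, "mAP@0.5": 0.4562, "mAP@0.75": 0.2860},
    "YOLOv4": {"mAP@0.5:0.05:0.95": 0.2295, "mAP@0.5": 0.4010, "mAP@0.75": 0.2545},
    "SSD": {"mAP@0.5:0.05:0.95": 0.2084, "mAP@0.5": 0.3897, "mAP@0.75": 0.2410},
}


def best_model(metric: str) -> str:
    """Name of the column with the highest value for the given metric."""
    return max(reported, key=lambda name: reported[name][metric])


def gain(ours: str, baseline: str, metric: str) -> float:
    """Absolute difference in the metric between two columns."""
    return reported[ours][metric] - reported[baseline][metric]


best = {m: best_model(m) for m in reported["SSD"]}
gain_over_yolo = gain("ours_b", "YOLOv4", "mAP@0.5:0.05:0.95")
gain_over_ssd = gain("ours_b", "SSD", "mAP@0.5:0.05:0.95")
```

On these numbers, the second proposed column leads on mAP@0.5:0.05:0.95 and mAP@0.5, while the first leads on mAP@0.75.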
Figure 8. Samples of challenging captures.