Kristijn R R Swinnen, Jonas Reijniers, Matteo Breno, Herwig Leirs.
Abstract
Camera traps have proven very useful in ecological, conservation and behavioral research. Camera traps non-invasively record presence and behavior of animals in their natural environment. Since the introduction of digital cameras, large amounts of data can be stored. Unfortunately, processing protocols did not evolve as fast as the technical capabilities of the cameras. We used camera traps to record videos of Eurasian beavers (Castor fiber). However, a large number of recordings did not contain the target species, but instead empty recordings or other species (together non-target recordings), making the removal of these recordings unacceptably time consuming. In this paper we propose a method to partially eliminate non-target recordings without having to watch the recordings, in order to reduce workload. Discrimination between recordings of target species and non-target recordings was based on detecting variation (changes in pixel values from frame to frame) in the recordings. Because of the size of the target species, we supposed that recordings with the target species contain on average much more movements than non-target recordings. Two different filter methods were tested and compared. We show that a partial discrimination can be made between target and non-target recordings based on variation in pixel values and that environmental conditions and filter methods influence the amount of non-target recordings that can be identified and discarded. By allowing a loss of 5% to 20% of recordings containing the target species, in ideal circumstances, 53% to 76% of non-target recordings can be identified and discarded. We conclude that adding an extra processing step in the camera trap protocol can result in large time savings. Since we are convinced that the use of camera traps will become increasingly important in the future, this filter method can benefit many researchers, using it in different contexts across the globe, on both videos and photographs.
Year: 2014 PMID: 24918777 PMCID: PMC4053333 DOI: 10.1371/journal.pone.0098881
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Classification matrix of the recordings.
| Classification result | Reality: Non-target recordings | Reality: Target recordings |
| Non-target recordings | True Positive | False Positive |
| Target recordings | False Negative | True Negative |
In total, 1991 videos were recorded at 12 different locations in 9 different beaver territories, in the province of Limburg, in the east of Flanders, Belgium, between 20 July 2012 and 8 October 2012. We recorded 1043 recordings of the target species, the beaver, 553 empty recordings and 395 recordings of non-target species. Every recording was classified as a target or non-target recording based on D, “the amount of pixel variation”. The correct classification of a non-target recording was considered a success (true positive, TP), since these recordings can be correctly discarded. This value must be as high as possible in order to remove the maximum number of non-target recordings. False positives (FP) were beaver recordings (target recordings) that were classified as non-target recordings and wrongly discarded. This number must be as small as possible, since valuable data are being discarded. False negatives (FN) were non-target recordings that were classified as target recordings. True negatives (TN) were target recordings that were recognized as target recordings.
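The classification matrix above can be illustrated with a minimal sketch. It assumes that a recording is labelled non-target when its D-value falls below a threshold (consistent with the supposition that target recordings contain more movement), and it treats non-target as the "positive" class, as in the table. The function names, the threshold and the toy D-values are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classify(d_value, threshold):
    """Assumed rule: low pixel variation (D below threshold) means non-target."""
    return "non-target" if d_value < threshold else "target"


def confusion_counts(recordings, threshold):
    """Count TP/FP/FN/TN for (d_value, is_target) pairs.

    Non-target is the positive class, following the classification matrix.
    """
    counts = Counter()
    for d_value, is_target in recordings:
        discarded = classify(d_value, threshold) == "non-target"
        if discarded and not is_target:
            counts["TP"] += 1  # non-target correctly discarded
        elif discarded and is_target:
            counts["FP"] += 1  # target wrongly discarded
        elif not discarded and not is_target:
            counts["FN"] += 1  # non-target kept by mistake
        else:
            counts["TN"] += 1  # target correctly kept
    return counts


def rates(counts):
    """Return (TP-rate, FP-rate) from the four counts."""
    tp_rate = counts["TP"] / (counts["TP"] + counts["FN"])
    fp_rate = counts["FP"] / (counts["FP"] + counts["TN"])
    return tp_rate, fp_rate


# Made-up D-values, for illustration only.
toy = [(0.1, False), (0.2, False), (0.9, False),
       (0.3, True), (0.8, True), (1.2, True), (1.5, True)]
counts = confusion_counts(toy, threshold=0.5)
tp_rate, fp_rate = rates(counts)
print(dict(counts), tp_rate, fp_rate)
```

In this toy set, two of the three non-target recordings would be discarded at the cost of one of the four target recordings.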
Figure 1. Possible gain (true positive rate, TP-rate) given an accepted loss (false positive rate, FP-rate).
The FP-rate represents the proportion of target recordings (beavers) classified as non-target recordings. The TP-rate is the proportion of non-target recordings correctly classified as non-target. This is the proportion of non-target recordings that will be discarded correctly given a certain FP-rate. The best performing filter maximizes the TP-rate while minimizing the FP-rate. Filter 2 performs better in all environmental circumstances. The dashed diagonal represents the outcome of a random model which cannot discriminate between target and non-target recordings. The dashed vertical line represents a 5% threshold (FP-rate). Dry: <10% water area in footage (5 locations, n = 933 recordings); Wet: >10% water area (7 locations, n = 1058 recordings); Complete dataset: the combined Dry and Wet datasets (12 locations, n = 1991 recordings).
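The trade-off in the figure, reading off the TP-rate at an accepted FP-rate, can be sketched as follows. This is an illustrative assumption about how such a threshold could be chosen, not the authors' procedure: the threshold is placed so that at most the accepted fraction of target D-values falls below it, and the TP-rate is then the fraction of non-target D-values below that threshold. All names and numbers are hypothetical.

```python
import math


def threshold_for_fp_rate(target_d, accepted_fp_rate):
    """Pick a D threshold so that about `accepted_fp_rate` of targets fall below it.

    Assumes untied D-values; ties would need extra handling.
    """
    ordered = sorted(target_d)
    # Number of target recordings we accept losing.
    k = math.floor(accepted_fp_rate * len(ordered))
    k = min(k, len(ordered) - 1)
    return ordered[k]


def tp_rate_at(non_target_d, threshold):
    """Fraction of non-target recordings with D below the threshold (discarded)."""
    return sum(d < threshold for d in non_target_d) / len(non_target_d)


# Made-up D-values, for illustration only.
target_d = [float(i) for i in range(1, 21)]          # 20 target recordings
non_target_d = [0.2, 0.5, 1.5, 1.9, 3.0, 0.1, 7.0, 0.9, 1.1, 12.0]

threshold = threshold_for_fp_rate(target_d, accepted_fp_rate=0.05)
tp_rate_5pct = tp_rate_at(non_target_d, threshold)
print(threshold, tp_rate_5pct)
```

Repeating this for a range of accepted FP-rates would trace one curve of the kind shown in the figure; a filter is better when its curve lies further above the diagonal.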
A bootstrap analysis was performed for both filter methods (Filter 1 and Filter 2) on the complete dataset (n = 1991), on the videos recorded at dry locations (n = 933) and on the videos recorded at wet locations (n = 1058).
| False Positive Rate | Dry, Filter 1 | Dry, Filter 2 | Wet, Filter 1 | Wet, Filter 2 | Complete dataset, Filter 1 | Complete dataset, Filter 2 |
| Mean | 0.051 | 0.051 | 0.050 | 0.051 | 0.051 | 0.051 |
| Sd | 0.009 | 0.009 | 0.011 | 0.010 | 0.012 | 0.011 |
| 2.5 percentile | 0.035 | 0.035 | 0.029 | 0.032 | 0.029 | 0.029 |
| 97.5 percentile | 0.068 | 0.067 | 0.072 | 0.072 | 0.074 | 0.074 |
A subsample of 500 videos was randomly sampled with replacement and this was repeated 1000 times. Recordings were classified based on their D-value; the thresholds were chosen to result in a 5% false positive rate in the respective datasets (see text). The resulting mean, standard deviation (Sd), and 2.5th and 97.5th percentiles are shown.
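A bootstrap of this kind can be sketched as below, under stated assumptions: the threshold is held fixed, each replicate draws 500 recordings with replacement, and the false positive rate is recomputed per replicate. The nearest-rank percentile, the seed, the function names and the synthetic data are illustrative choices, not details from the paper.

```python
import random
import statistics


def fp_rate(sample, threshold):
    """FP-rate in one sample: share of target recordings with D below threshold."""
    target_d = [d for d, is_target in sample if is_target]
    if not target_d:
        return None  # no target recordings drawn; skip this replicate
    return sum(d < threshold for d in target_d) / len(target_d)


def bootstrap_fp_rate(recordings, threshold, n_sub=500, n_rep=1000, seed=0):
    """Summarise the FP-rate over bootstrap subsamples drawn with replacement."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_rep):
        sample = rng.choices(recordings, k=n_sub)  # sampling with replacement
        rate = fp_rate(sample, threshold)
        if rate is not None:
            values.append(rate)
    values.sort()

    def percentile(p):
        # Simple nearest-rank percentile on the sorted replicate values.
        idx = round(p / 100 * (len(values) - 1))
        return values[idx]

    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "p2.5": percentile(2.5),
        "p97.5": percentile(97.5),
    }


# Synthetic data: 5% of target recordings fall below the threshold of 0.5.
recordings = [(0.1, True)] * 5 + [(1.0, True)] * 95 + [(0.2, False)] * 100
summary = bootstrap_fp_rate(recordings, threshold=0.5)
print(summary)
```

The spread between the two percentiles indicates how much the realised loss of target recordings could vary when the same threshold is applied to a new sample of similar size.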