Laura Mannocci, Sébastien Villon, Marc Chaumont, Nacim Guellati, Nicolas Mouquet, Corina Iovan, Laurent Vigliola, David Mouillot.
Abstract
Deep learning has become a key tool for the automated monitoring of animal populations with video surveys. However, obtaining large numbers of images to train such models is a major challenge for rare and elusive species because field video surveys provide few sightings. We designed a method that takes advantage of videos accumulated on social media to train deep-learning models to detect rare megafauna species in the field. We trained convolutional neural networks (CNNs) with social media images and tested them on images collected from field surveys. We applied our method to aerial video surveys of dugongs (Dugong dugon) in New Caledonia (southwestern Pacific). CNNs trained with 1303 social media images yielded 25% false positives and 38% false negatives when tested on independent field video surveys. Incorporating a small number of images from New Caledonia (equivalent to 12% of the social media images) into the training data set decreased false negatives by nearly 50%. Our results highlight the extent to which images collected on social media offer a solid basis for training deep-learning models to detect rare megafauna, and show that incorporating even a few images from the study site further boosts detection accuracy. Our method provides a new generation of deep-learning models that can rapidly and accurately process field video surveys for the monitoring of rare megafauna.
Keywords: convolutional neural networks; endangered megafauna; internet ecology; monitoring; species detection
Year: 2021 PMID: 34153121 PMCID: PMC9291111 DOI: 10.1111/cobi.13798
Source DB: PubMed Journal: Conserv Biol ISSN: 0888-8892 Impact factor: 7.563
FIGURE 1 Steps in the deep-learning method that detects rare megafauna (CNN, convolutional neural network; TP, true positive; FP, false positive; FN, false negative). Social media images (gray) are used in steps 2 and 3; field images (black) are used in steps 2, 4, and 5.
Overview of social-media (WEB) and field-video (ULM) databases for dugong detection
| Database | Number of videos | Mean duration (min:s) | Total duration (h:min:s) | Number of images | Number of images with ≥1 dugong | Mean (SD) image width (pixels) | Mean (SD) image height (pixels) | Mean (SD) bounding box width (pixels) | Mean (SD) bounding box height (pixels) |
|---|---|---|---|---|---|---|---|---|---|
| WEB | 22 | 1:07 | 0:24:38 | 1512 | 1303 | 1938 (443) | 1119 (225) | 155 (112) | 144 (112) |
| ULM | 57 | 11:45 | 10:52:23 | 42,464 | 161 | 2704 (0) | 1520 (0) | 40 (16) | 39 (17) |
Details in Appendix S1.
FIGURE 2 (a) Map of the Poé Lagoon study area in New Caledonia and (b) examples of dugong images collected by ULM. Imagery source for the map: Google Earth.
FIGURE 3 Results of dugong detection for the baseline run (trained with social media images only) applied to test field images: left graph, mean percentage of true positives (TPs) and false positives (FPs) in the predictions; right graph, mean percentage of TPs and false negatives (FNs) in the observations (error bars, SD). Images are examples of a TP (green; predicted bounding box associated with an annotated bounding box [white]), an FP (red; predicted bounding box not corresponding to an annotated bounding box, here a coral patch), and an FN (annotated bounding box not corresponding to a predicted bounding box).
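The TP/FP/FN categories in the caption follow the standard object-detection convention: a predicted bounding box counts as a true positive only when it can be paired with an annotated box that it overlaps sufficiently. A minimal sketch of such matching is below; the intersection-over-union (IoU) criterion and its 0.5 threshold are illustrative assumptions, not values reported in the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def classify_detections(predicted, annotated, iou_threshold=0.5):
    """Greedily pair each predicted box with at most one annotated box.

    Returns (tp, fp, fn): paired predictions are TPs, unpaired
    predictions are FPs, unpaired annotations are FNs.
    """
    unmatched = list(annotated)
    tp = fp = 0
    for pred in predicted:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            tp += 1
            unmatched.remove(best)
        else:
            fp += 1
    fn = len(unmatched)
    return tp, fp, fn
```

A predicted box far from every annotation (e.g. over a coral patch, as in the FP example of the figure) fails the overlap test and is counted as a false positive.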
FIGURE 4 Mean precision, recall, and F1 score (balance between false positives and false negatives) of convolutional neural networks detecting dugongs in test field (ULM) images (continuous line) versus test social media (WEB) images (dashed line) for all runs (R0, training with WEB images only; R2-R12, training with WEB images mixed with a number of ULM images equivalent to 2-12% of WEB images) (shading, SD of metrics evaluated on test ULM images). Values of all performance metrics are in Appendix S5.
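Precision, recall, and F1 in this figure are the standard detection metrics derived from TP/FP/FN counts. A minimal sketch, with illustrative counts chosen only to roughly mirror the baseline percentages quoted in the abstract (25% false positives, i.e. precision 0.75):

```python
def detection_metrics(tp, fp, fn):
    """Standard detection metrics from TP/FP/FN counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of predictions that are correct
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of annotated dugongs found
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)            # harmonic mean balancing FPs and FNs
    return precision, recall, f1

# Illustrative counts only: 25 of 100 predictions are false,
# so precision is 0.75, matching the baseline FP rate in the abstract.
precision, recall, f1 = detection_metrics(tp=75, fp=25, fn=46)
```

Because dugongs are rare in the field imagery (161 positive images out of 42,464), recall is the metric to maximize: a missed dugong (FN) is costlier for monitoring than an extra box to review (FP).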
FIGURE 5 Mean precision-recall curves for all runs (R0-R12) of deep-learning models calculated based on the test field images of dugongs collected at Poé Lagoon (dot, threshold value from 50% to 90%). Recall is the metric to maximize when detecting rare species, such as the dugong (lower right corner of graph).