Oisin Mac Aodha1, Rory Gibb2, Kate E Barlow3, Ella Browning2,4, Michael Firman1, Robin Freeman4, Briana Harder5, Libby Kinsey1, Gary R Mead6, Stuart E Newson7, Ivan Pandourski8, Stuart Parsons9, Jon Russ10, Abigel Szodoray-Paradi11, Farkas Szodoray-Paradi11, Elena Tilova12, Mark Girolami13, Gabriel Brostow1, Kate E Jones2,4.
Abstract
Passive acoustic sensing has emerged as a powerful tool for quantifying anthropogenic impacts on biodiversity, especially for echolocating bat species. To better assess bat population trends there is a critical need for accurate, reliable, and open source tools that allow the detection and classification of bat calls in large collections of audio recordings. The majority of existing tools are commercial or have focused on the species classification task, neglecting the important problem of first localizing echolocation calls in audio, which is particularly problematic in noisy recordings. We developed a convolutional neural network-based, open-source pipeline for detecting ultrasonic, full-spectrum, search-phase calls produced by echolocating bats. Our deep learning algorithms were trained on full-spectrum ultrasonic audio collected along road-transects across Europe and labelled by citizen scientists from www.batdetective.org. When compared to other existing algorithms and commercial systems, we show significantly higher detection performance of search-phase echolocation calls with our test sets. As an example application, we ran our detection pipeline on bat monitoring data collected over five years from Jersey (UK), and compared results to a widely-used commercial system. Our detection pipeline can be used for the automatic detection and monitoring of bat populations, and further facilitates their use as indicator species on a large scale. Our proposed pipeline makes only a small number of bat-specific design decisions, and with appropriate training data it could be applied to detecting other species in audio. A crucial novelty of our work is showing that with careful, non-trivial design and implementation considerations, state-of-the-art deep learning methods can be used for accurate and efficient monitoring in audio.
Year: 2018 PMID: 29518076 PMCID: PMC5843167 DOI: 10.1371/journal.pcbi.1005995
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Detection pipeline for search-phase bat echolocation calls.
(a) Raw audio files are converted into a spectrogram using a Fast Fourier Transform (b). Files are de-noised (c), and a sliding-window Convolutional Neural Network (CNN) classifier (d, yellow box) produces a probability for each time step. Non-maximum suppression is then applied to these probabilities to produce individual call detections (e, green boxes), and the time in file of each prediction, along with the classifier probability, is exported as text files.
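The non-maximum suppression step in (e) can be illustrated with a minimal sketch. It assumes the classifier output is a plain list of per-time-step probabilities; the function name `nms_1d`, the `window` half-width, and the `threshold` value are hypothetical choices for illustration and are not taken from the paper.

```python
def nms_1d(probs, window=2, threshold=0.5):
    """Keep time steps that are the local maximum within +/- `window` steps.

    probs: list of per-time-step call probabilities (assumed input format).
    Returns a list of (time_step_index, probability) pairs.
    """
    detections = []
    n = len(probs)
    for i, p in enumerate(probs):
        if p < threshold:
            continue  # too weak to count as a detection
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        # Keep step i only if it is the highest value in its neighbourhood;
        # on exact ties the earliest time step wins.
        if p == max(probs[lo:hi]) and probs.index(p, lo, hi) == i:
            detections.append((i, p))
    return detections


probs = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.6, 0.95, 0.7, 0.2]
print(nms_1d(probs))  # [(2, 0.9), (7, 0.95)]
```

Multiplying each retained index by the duration of one time step would give the time in file of the detection.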
Fig 2. Spatial distribution of the BatDetect CNNs' training and testing datasets.
(a) Location of training data for all experiments and one test dataset in Romania and Bulgaria (2006–2011) from time-expanded (TE) data recorded along road transects by the Indicator Bats Programme (iBats) [7], where red and black points represent training and test data, respectively. (b) Locations of additional test datasets from TE data recorded as part of iBats car transects in the UK (2005–2011), and from real-time recordings from static recorders from the Norfolk Bat Survey from 2015 (inset). Points represent the start location of each snapshot recording for each iBats transect or locations of static detectors for the Norfolk Bat Survey.
Average precision and recall results for bat search-phase call detection algorithms across three different test sets: iBats Romania and Bulgaria (R&B); iBats UK; and Norfolk Bat Survey.
| Test set | SonoBat | SCAN’R | Kaleidoscope | Segment | Random Forest | BatDetect CNNFAST | BatDetect CNNFULL |
|---|---|---|---|---|---|---|---|
| Average precision | | | | | | | |
| iBats (R&B) | 0.265 | 0.239 | 0.189 | 0.299 | 0.674 | 0.863 | |
| iBats (UK) | 0.200 | 0.142 | 0.144 | 0.324 | 0.648 | 0.781 | |
| NBP (Norfolk) | 0.473 | 0.456 | 0.553 | 0.506 | 0.630 | 0.861 | |
| Recall at 0.95 precision | | | | | | | |
| iBats (R&B) | 0 | 0.251 | 0 | 0 | 0.568 | 0.777 | |
| iBats (UK) | 0 | 0 | 0 | 0 | 0.324 | 0.570 | |
| NBP (Norfolk) | 0.184 | 0.470 | 0 | 0 | 0.049 | 0.754 | |
Larger numbers indicate better performance. Recall results are reported at 0.95 precision, where zero indicates that the detector algorithm was unable to achieve a precision greater than 0.95 at any recall level. The results for the best-performing algorithm are underlined. Details of the test datasets and detection algorithms are given in the text.
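The "recall at 0.95 precision" figure can be read off a set of precision-recall operating points, as in the following minimal sketch. The function name `recall_at_precision` and the input format are hypothetical. Whether the cut-off is inclusive (`>=`) is an assumption here, since the note above uses both "at 0.95" and "greater than 0.95".

```python
def recall_at_precision(pr_points, target=0.95):
    """Return the highest recall among operating points meeting the target precision.

    pr_points: iterable of (precision, recall) pairs, one per threshold
               (assumed input format).
    Returns 0.0 when no operating point reaches the target precision,
    matching the zero entries in the table.
    """
    eligible = [recall for precision, recall in pr_points if precision >= target]
    return max(eligible) if eligible else 0.0


points = [(1.0, 0.2), (0.96, 0.5), (0.90, 0.8)]
print(recall_at_precision(points))        # 0.5
print(recall_at_precision([(0.9, 0.8)]))  # 0.0
```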
Fig 3. Precision-recall curves for bat search-phase call detection algorithms across three testing datasets: (a) iBats Romania and Bulgaria; (b) iBats UK; and (c) Norfolk Bat Survey. Curves were obtained by sweeping the output probability for a given detector algorithm and computing the precision and recall at each threshold. The commercial systems or algorithms that did not return a continuous output or probability (SCAN’R, Segment, and Kaleidoscope) are depicted as a single point.
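The threshold sweep described in this caption can be sketched as follows. This is a simplified illustration, not the authors' evaluation code. It assumes each candidate detection has already been marked as matching a ground-truth call or not; how matching is done is not specified here. It also computes average precision as a step-wise area under the curve, which is one common convention and may differ from the one used in the paper. All function and parameter names are hypothetical.

```python
def precision_recall_curve(scores, is_true, n_positives):
    """Sweep the threshold from the highest score downwards.

    scores: detector probability for each candidate detection.
    is_true: True where the candidate matches a ground-truth call.
    n_positives: total number of ground-truth calls in the test set.
    Returns a list of (precision, recall) pairs, one per candidate.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = []
    for i in order:
        if is_true[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / n_positives))
    return points


def average_precision(points):
    """Step-wise area under the curve: sum of precision * recall increment."""
    ap = 0.0
    prev_recall = 0.0
    for precision, recall in points:
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


curve = precision_recall_curve([0.9, 0.8, 0.7, 0.6], [True, False, True, True], 4)
print(curve[0])                  # (1.0, 0.25)
print(average_precision(curve))  # approximately 0.604
```

A system that returns only hard decisions yields a single (precision, recall) pair, which is why such systems appear as single points in the figure.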
Fig 4. Comparison of the predicted bat detections (calls and passes) for two different acoustic systems using monitoring data collected from Jersey, UK.
Acoustic systems used were SonoBat (version 3.1.7p) [43] using analysis in [49], and BatDetect CNNFAST using a probability threshold of 0.90. Detections are shown within each box plot, where the black line represents the mean across all transect sampling events from 2011–2015, boxes represent the middle 50% of the data, whiskers represent variability outside the upper and lower quartiles, with outliers plotted as individual points. See text for definition of a bat pass.