Cheston Tan, Stephane Lallee, Garrick Orchard.
Abstract
Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision.
Keywords: benchmarking; computer vision; datasets; neuromorphic vision; sensory processing
Year: 2015 PMID: 26528120 PMCID: PMC4602133 DOI: 10.3389/fnins.2015.00374
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
Figure 1. Comparison of data formats between CV and temporal contrast NV for a rotating black bar stimulus. Left: simulated 30 fps recording of a black spinning bar. Every pixel's intensity value is captured at constant time intervals. The location of the bar in each frame can be seen, but the location of the bar between frames must be inferred. Right: actual recording of a rotating bar captured with a NV sensor. Blue and red points indicate on (increasing intensity) and off (decreasing intensity) events respectively. Each pixel immediately outputs data when it detects a change in intensity. Middle: superposition of the NV and CV data. The NV sensor captures intensity changes occurring between the CV frames (blue and red data points), but does not recapture redundant background pixel intensities (indicated by transparent frame regions) as is done in the CV format. A video accompanying this figure is available online.
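To make the contrast between the two formats concrete, the following is a minimal Python sketch (not from the paper) of how NV events can be accumulated into a single frame over a fixed time window, mirroring the superposition shown in the middle panel of Figure 1. It assumes events are stored as (x, y, timestamp, polarity) tuples with polarity +1 for on events and -1 for off events; the function name, sensor resolution, and time window are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def events_to_frame(events, width, height, t_start, t_end):
    """Accumulate NV events with timestamps in [t_start, t_end) into a 2D frame.

    Each event is assumed to be a tuple (x, y, timestamp, polarity), where
    polarity is +1 for an on event (intensity increase) and -1 for an off
    event (intensity decrease).
    """
    frame = np.zeros((height, width), dtype=np.int32)
    for x, y, t, polarity in events:
        if t_start <= t < t_end:
            frame[y, x] += polarity
    return frame

# Example: three synthetic events from a hypothetical 128x128 sensor, binned
# into a 33 ms window (roughly one frame interval at 30 fps).
events = [(10, 20, 0.001, +1), (10, 21, 0.002, -1), (64, 64, 0.030, +1)]
frame = events_to_frame(events, width=128, height=128, t_start=0.0, t_end=0.033)
```

Binning like this discards the fine temporal structure that the NV sensor provides between frames, which is exactly the information the frame-based CV format in the left panel never records in the first place.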
Figure 2. Examples of NV data. (A) Publicly available unannotated sequences. From left to right (both rows): dashcam recordings, static surveillance of street scenes, test stimuli, juggling and first person walking video, high speed eye tracking and dot tracking, and rat and fly behavioral recordings. (B) A small annotated card pip dataset consisting of 10 examples of each of the 4 card pips (Pérez-Carrasco et al., 2013). (C) Another small annotated dataset, consisting of 2 examples of each of the 36 characters (Orchard et al., 2015).