| Literature DB >> 35237122 |
Neuromorphic Engineering Needs Closed-Loop Benchmarks
Moritz B. Milde, Saeed Afshar, Ying Xu, Alexandre Marcireau, Damien Joubert, Bharath Ramesh, Yeshwanth Bethi, Nicholas O. Ralph, Sami El Arja, Nik Dennler, André van Schaik, Gregory Cohen.
Abstract
Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms, from algae to primates, excel at sensing their environment and reacting promptly to perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise, using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that foster sensing accuracy as the primary measure of performance. Sensing accuracy, however, is only an arbitrary proxy for a system's actual goal: making a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare the closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial for developing and advancing neuromorphic intelligence. The shift towards dynamic, real-world benchmarking tasks should usher in richer, more resilient, and more robust artificially intelligent systems in the future.
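The abstract argues that sensing accuracy is only a proxy for the real goal: a good decision made in time and within a power budget. As a purely illustrative sketch (the weighting, deadline, and energy budget below are assumptions, not from the article), a task-relevant metric of the kind the authors call for might score a single decision like this:

```python
def task_score(correct: bool, latency_s: float, energy_j: float,
               deadline_s: float = 0.1, energy_budget_j: float = 1.0) -> float:
    """Illustrative task-relevant metric: a correct decision only counts
    if it arrives before the task deadline, and it is discounted by the
    fraction of the energy budget it consumed."""
    if not correct or latency_s > deadline_s:
        return 0.0
    timeliness = 1.0 - latency_s / deadline_s       # 1.0 = instant, 0.0 = at deadline
    frugality = max(0.0, 1.0 - energy_j / energy_budget_j)
    return 0.5 * (timeliness + frugality)

# A perfectly accurate but late decision scores zero:
print(task_score(correct=True, latency_s=0.2, energy_j=0.1))   # 0.0
# A timely, low-energy decision scores high:
print(task_score(correct=True, latency_s=0.01, energy_j=0.05))
```

Under such a metric, a slower but more accurate classifier can lose to a faster, slightly less accurate one, which is exactly the trade-off that accuracy-only benchmarks hide.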
Keywords: ATIS; DAVIS; DVS; audio; benchmarks; event-based systems; neuromorphic engineering; olfaction
Year: 2022 PMID: 35237122 PMCID: PMC8884247 DOI: 10.3389/fnins.2022.813555
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
Figure 1. Different modes of sensing. Sensing, and consequently the processing of sensory information, can be divided into passive (top, A and B) vs. active (bottom, C and D), as well as open-loop (left, A and C) vs. closed-loop (right, B and D) sensing. Open-loop passive sensing (A) is the most prevalent form of acquiring information about the environment and subsequently using this information, e.g., to classify objects. Advantages of this approach include the one-to-one mapping of inputs and outputs and the readily available optimisation schemes that obtain such a mapping. Examples of open-loop passive sensing include surveillance applications, face recognition, object localisation, and most conventional computer vision applications. While the environment and/or the sensor may move, the trajectory itself is independent of the acquired information. Open-loop active sensing (C) is characterised by injecting energy into the environment. The acquired data is a combination of information emitted by the environment itself (black arrow) and the resulting interaction of the signal emitted by the sensor with the environment (red arrow). Prime examples of this sensing approach are LiDAR, RADAR, and SONAR. In the open-loop setting, the acquired information is not used to change parameters of the sensor itself. The closed-loop passive sensing strategy (B) is most commonly found in animals, including humans. While energy is solely emitted by the environment, the acquired information is used to actively change the relative position of the sensor (e.g., saccadic eye movements) or alter the sensory parameters (e.g., focus). This closed-loop approach uses past information to make informed decisions in the future. The last sensing category is active closed-loop sensing (D), in which the acquired information is used to alter both the positioning and the configuration of the sensor.
Bats (Griffin, 1958; Fenton, 1984) and weakly electric fish (Flock and Wersäll, 1962; Hofmann et al., 2013) are prime examples from the animal kingdom that exploit this sensing style, but artificial systems, such as adaptive LiDAR, also use acquired information about the environment to perform more focused and dense information collection in subsequent measurements.
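Closed-loop passive sensing, as in panel (B) of Figure 1, can be illustrated in a few lines of code: each passive measurement is used to update a sensor parameter (here, focus) before the next measurement is taken. The sharpness model and the gradient-ascent update rule below are illustrative assumptions, not from the article:

```python
def sharpness(focus: float, target: float) -> float:
    """Stand-in for an image-sharpness measure: peaks when the focus
    parameter matches the (unknown) optimal focus distance."""
    return 1.0 / (1.0 + (focus - target) ** 2)

def closed_loop_focus(target: float, steps: int = 50, lr: float = 0.5) -> float:
    """Closed-loop passive sensing: every measurement feeds back into the
    sensor configuration, so the sensor converges on the sharpest setting
    without ever emitting energy into the environment."""
    focus, eps = 0.0, 0.05
    for _ in range(steps):
        # Finite-difference estimate of how sharpness changes with focus,
        # obtained purely from passive measurements.
        grad = (sharpness(focus + eps, target)
                - sharpness(focus - eps, target)) / (2 * eps)
        focus += lr * grad  # hill-climb towards maximum sharpness
    return focus

print(round(closed_loop_focus(target=2.0), 2))  # converges near 2.0
```

No static dataset can capture this behaviour, because the sequence of measurements depends on the sensor's own past decisions.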
Figure 2. Existing datasets and benchmarks fall into two categories: open-loop benchmarks (datasets) and closed-loop benchmarks. Supervised machine learning relies mostly on the first category, whereas reinforcement learning requires the second. Most existing neuromorphic engineering benchmarks fall into the first category. This article pleads in favour of closed-loop neuromorphic benchmarks.
Figure 3. Overview of existing open- and closed-loop datasets and benchmarks for conventional time-varying and neuromorphic time-continuous approaches to machine intelligence. Distribution of high-end challenges according to the research field (neuromorphic/conventional), their interaction with the environment (open- or closed-loop), and the sensing modality. Downward triangle: conventional frame-based cameras; Diamond: neuromorphic event-based cameras; Star: combination of conventional frame-based and neuromorphic event-based cameras; Pentagon: auditory sensors; Square: olfactory sensors; Triangle: LiDAR sensors; Circle: abstract games operating directly on machine code. Further details are provided in Tables 1, 2. While not completely exhaustive, this figure underlines the gravitation of both the machine and neuromorphic intelligence communities towards open-loop datasets. In order to showcase and truly contribute to the advancement of machine intelligence, the neuromorphic community needs to focus its efforts on creating closed-loop neuromorphic benchmarks that are physically embedded in their environment and thus impose hard power and execution-time constraints. While the physical set-ups in Moeys et al. (2016) and Conradt et al. (2009) could have formed the basis of closed-loop benchmarks, they were not developed as such. In Moeys et al. (2016), the set-up was used to generate an open-loop static dataset, and in Conradt et al. (2009), no dataset was generated. In contrast, the benchmarks advocated here would be available as physical experimental set-ups that can be accessed by the community for algorithm testing.
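The closed-loop benchmarks advocated here differ from datasets in one essential way: the world keeps evolving while the agent computes, so decision latency directly affects the task outcome. A minimal sketch of such a benchmark interface follows; the class, its methods, and the toy unstable plant dynamics are all illustrative assumptions, not from the article:

```python
class ClosedLoopBenchmark:
    """Illustrative closed-loop benchmark: unlike a static dataset, the
    environment evolves while the agent computes, so slow controllers
    act on stale observations and are penalised automatically."""

    def __init__(self, dynamics_hz: float = 1000.0):
        self.dt = 1.0 / dynamics_hz
        self.state = 0.0   # e.g., pole angle, ball position, ...

    def _advance(self, duration: float) -> None:
        # Toy unstable drift: errors grow the longer they go uncorrected.
        for _ in range(max(1, int(duration / self.dt))):
            self.state += self.dt * (0.5 * self.state + 1.0)

    def observe(self) -> float:
        return self.state

    def act(self, command: float, compute_time: float) -> float:
        # Charge the agent for its decision latency: the plant evolves
        # open-loop for `compute_time` before the command is applied.
        self._advance(compute_time)
        self.state -= command
        return self.state

# The same controller, fast (1 ms) vs. slow (100 ms):
env = ClosedLoopBenchmark()
err_fast = abs(env.act(command=env.observe() + 0.002, compute_time=0.001))
env2 = ClosedLoopBenchmark()
err_slow = abs(env2.act(command=env2.observe() + 0.002, compute_time=0.100))
print(err_fast < err_slow)  # latency, not just accuracy, decides the outcome
```

In a physical version of such a benchmark, `compute_time` would not be a parameter but simply the wall-clock time the system takes, which is what makes the power and execution-time constraints hard rather than nominal.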
Table 1. Conventional benchmark datasets for various sensor modalities.

| Category | Dataset | Sensor (resolution) | Reference |
|---|---|---|---|
| Open-Loop | IMDB-Wiki | Frames (-) | Rothe et al. |
| | Kinetics-700 | Frames (-) | Kay et al. |
| | MS COCO | Frames (-) | Lin et al. |
| | Pascal VOC | Frames (-) | Everingham et al. |
| | MPII Human Pose | Frames (-) | Andriluka et al. |
| | YouTube-8M | Frames (-) | Abu-El-Haija et al. |
| | MNIST | Frames (28x28) | LeCun et al. |
| | Fashion-MNIST | Frames (28x28) | Xiao et al. |
| | CIFAR-10 & -100 | Frames (32x32) | Torralba et al. |
| | Caltech-101 & -256 | Frames (32x32) | Fei-Fei et al. |
| | ImageNet | Frames (482x418) | Deng et al. |
| | Cityscapes | Frames (1600x1200) & HDR | Cordts et al. |
| | KITTI | Frames (1382x512) & LiDAR | Geiger et al. |
| | BDD100K | Frames (720x1280) | Yu et al. |
| | Oxford RobotCar | Frames (1280x960) & LiDAR | Maddern et al. |
| | LiDAR-Video Driving | Frames (1920x1080) & LiDAR | Chen et al. |
| | FSDD | Microphone (1 Ch. @ 8 kHz) | Jackson et al. |
| | AudioSet | Microphone (-) | Gemmeke et al. |
| | TIDIGITS | Microphone (1 Ch. @ 20 kHz) | Leonard and Doddington |
| | TIMIT | Microphone (1 Ch. @ 16 kHz) | Garofolo et al. |
| | VoxCeleb | Microphone (1 Ch. @ 16 kHz) | Nagrani et al. |
| | DCASE 2020 | Microphone (Mult. Ch. @ 24 kHz) | Politis et al. |
| | ToyADMOS | Microphone (4 Ch. @ 48 kHz) | Koizumi et al. |
| | Mivia Audio Events | Microphone (1 Ch. @ 32 kHz) | Foggia et al. |
| | Million Song | Microphone (1-2 Ch. @ 22-44 kHz) | Bertin-Mahieux et al. |
| | MOx Open Sampling | Olfaction (9x8 Ch. @ 100 Hz) | Vergara et al. |
| | MOx Turbulent Mixture | Olfaction (8 Ch. @ 50 Hz) | Fonollosa et al. |
| | MOx Temperature Modulation | Olfaction (14 Ch. @ 3.5 Hz) | Burgués et al. |
| | MOx Flow Modulation | Olfaction (16 Ch. @ 25 Hz) | Ziyatdinov et al. |
| Closed-Loop | Atari57 | Game (-) | Badia et al. |
| | Atari 2600 | Game (210x160) | Bellemare et al. |
| | AlphaGo Zero | Game (-) | Silver et al. |
| | AlphaZero | Game (-) | Silver et al. |
| | AlphaStar | Game (-) | Vinyals et al. |
| | Autonomous Agent | Simulation (-) | Jordan et al. |
| | Driving simulator | Simulation (160x320) | Santana and Hotz |
| | Grasping Robot | Frames (-) | Stewart et al. |
Table 2. Neuromorphic benchmark datasets for various sensor modalities.

| Category | Dataset | Sensor (resolution) | Reference |
|---|---|---|---|
| Open-Loop | Pedestrian detection | DAVIS 346 (346x240) | Miao et al. |
| | Space Dataset | DAVIS 240 (240x180) & ATIS (304x240) | Cohen et al. |
| | DVSFLOW16 | DAVIS 240 (240x180) | Rueckauer and Delbruck |
| | Visual navigation | DAVIS 240C (240x180) | Barranco et al. |
| | Action recognition | DAVIS 346 (346x240) | Miao et al. |
| | Multi-vehicle detection | DAVIS 346 (346x240) | Chen et al. |
| | DHP19 | DAVIS 346 (346x240) | Calabrese et al. |
| | Fall detection | DAVIS 346 (346x240) | Miao et al. |
| | DDD17 | DAVIS 346B (346x240) | Binas et al. |
| | DDD20 | DAVIS 346B (346x240) | Hu et al. |
| | ColorEvents | Color DAVIS 346 (346x240) | Scheerlinck et al. |
| | 1Mpx Automotive Detection Dataset | High-resolution EBC (Finateu et al.) | Perot et al. |
| | DSEC | PPS3MVCD (640x480) | Gehrig et al. |
| | RaShamBo | DVS (64x64) | Lungu et al. |
| | 36Characters | DVS (128x128) | Orchard et al. |
| | MNIST-DVS | DVS (128x128) | Serrano-Gotarredona and Linares-Barranco |
| | Poker-DVS | DVS (128x128) | Serrano-Gotarredona and Linares-Barranco |
| | VOT2015 | DVS (128x128) | Hu et al. |
| | Tracking Dataset | DVS (128x128) | Hu et al. |
| | UCF-50 | DVS (128x128) | Hu et al. |
| | CALTECH256 | DVS (128x128) | Hu et al. |
| | Human silhouette | DVS (128x128) | Pérez-Carrasco et al. |
| | Human posture | DVS (128x128) | Zhao et al. |
| | N-Caltech101 | ATIS (304x240) | Orchard et al. |
| | N-MNIST | ATIS (304x240) | Orchard et al. |
| | Human activity recognition | ATIS (346x240) | Pradhan et al. |
| | N-TIDIGITS18 | DAS (64x2x4) | Anumula et al. |
| | WHISPER | Microphone (16) | Ceolini et al. |
| | COBRA | Olfaction (-) | Schneider and Schneider |
| Closed-Loop | PRED18 | DAVIS 240C (240x180) | Moeys et al. |
| | Pencil Balancing Robot | DVS (128x128) | Conradt et al. |
Figure 4. Schematic of the closed-loop robotic foosball setup.
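The record gives no implementation details for this set-up. Purely as an illustration of what a controller in such a closed-loop benchmark has to do (all names, thresholds, and coordinates below are assumptions), the sketch reduces an event-camera stream to a running ball-position estimate and a proportional rod command, updating on every event rather than on frames:

```python
def track_and_command(events, alpha: float = 0.3, rod_x: float = 0.5):
    """Illustrative event-driven loop for a foosball-like task: each event
    (x, y, timestamp, polarity) nudges an exponential-moving-average
    estimate of the ball's x position, and a proportional rod command is
    issued per event; there is no frame accumulation or batching."""
    est_x = rod_x          # start the estimate at the rod's position
    commands = []
    for x, _y, _t, _polarity in events:
        est_x = (1 - alpha) * est_x + alpha * x   # per-event EMA update
        commands.append(est_x - rod_x)            # proportional rod command
    return est_x, commands

# Three hypothetical events near x = 0.8 pull the estimate to the right:
events = [(0.80, 0.50, 0.001, 1), (0.82, 0.50, 0.002, 1), (0.81, 0.52, 0.003, 0)]
est, cmds = track_and_command(events)
print(round(est, 4), len(cmds))
```

In a physical benchmark of this kind, the score would reflect goals conceded or intercepted under real latency and power constraints, not the tracking accuracy of `est_x` in isolation.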