Literature DB >> 35957244

Visual Odometry with an Event Camera Using Continuous Ray Warping and Volumetric Contrast Maximization.

Yifu Wang¹, Jiaqi Yang¹, Xin Peng¹, Peng Wu¹, Ling Gao¹, Kun Huang¹, Jiaben Chen¹, Laurent Kneip^1,2.

Abstract

We present a new solution to tracking and mapping with an event camera. The motion of the camera contains both rotation and translation displacements in the plane, and the displacements happen in an arbitrarily structured environment. As a result, the image matching may no longer be represented by a low-dimensional homographic warping, thus complicating an application of the commonly used Image of Warped Events (IWE). We introduce a new solution to this problem by performing contrast maximization in 3D. The 3D location of the rays cast for each event is smoothly varied as a function of a continuous-time motion parametrization, and the optimal parameters are found by maximizing the contrast in a volumetric ray density field. Our method thus performs joint optimization over motion and structure. The practical validity of our approach is supported by an application to AGV motion estimation and 3D reconstruction with a single vehicle-mounted event camera. The method approaches the performance obtained with regular cameras and eventually outperforms in challenging visual conditions.

Entities: Chemical

Keywords: SLAM; computer vision; event-based vision; visual localization and mapping

Mesh：
Algorithms
Rotation

Year: 2022 PMID： 35957244 PMCID： PMC9370870 DOI： 10.3390/s22155687

Source DB: PubMed Journal: Sensors (Basel) ISSN： 1424-8220 Impact factor: 3.847

1. Introduction

Vision-based localization and mapping is an important technology with many applications in robotics, intelligent transportation, and intelligence augmentation. Although several decades of active research have led to a certain level of maturity, we keep facing challenges in scenarios with high dynamics, low texture distinctiveness, or challenging illumination conditions [1,2]. Event cameras—also called dynamic vision sensors—present an interesting alternative in this regard, as they pair High Dynamic Range (HDR) with high temporal resolution. The advantages and challenges of event-based vision are well explained by the original work of Brandli et al. [3] as well as the recent survey by Gallego et al. [4]. Previous works have employed time-continuous parametrizations of image warping functions. Based on the assumption that events are pre-dominantly triggered by high-gradients edges in the image, the optimal image warping parameters will cause the events to warp onto a sharp edge map in a reference view called the Image of Warped Events (IWE). The optimal warping parameters are hence found by maximizing contrast in the IWE. Various reward functions to evaluate contrast have been presented and analyzed in the recent works of Gallego et al. [5,6] and Stoffregen and Kleeman [7], and successfully used for solving a variety of problems with event cameras such as optical flow [8,9,10,11,12,13], segmentation [14,15,16], 3D reconstruction [17,18,19], and motion estimation [20,21,22,23,24,25,26,27]. The main problem with the construction of the IWE is that it relies on a low-dimensional image-to-image warping function, which—in the case of both translational and rotational displacements—is only possible if the model is homographic or if knowledge about the depth of the scene is prior available. Past solutions to event-based localization and mapping, therefore, looked at alternative solution attempts. Note that there are lots of works on the localization and mapping problems individually, a listing of which would go beyond the scope of this introduction. Here we only focus on combined solutions to both problems that use only a single event camera and that can handle combined rotational and translational displacements in unknown, arbitrarily structured environments. There are surprisingly few works that solve this problem, which is proof of its difficulty. The first solution to this problem is given by Kim et al. [28], who propose a complex framework of three individual filters. Results are limited to small-scale environments and small, dedicated motions. A geometric attempt is given by Rebecq et al. [29], who present a combination of a tracker and their ray-density-based structure extraction method EMVS [17]. However, the framework alternates between the tracking and mapping solutions, which leaves open questions as to how to bootstrap the system safely. Zhu et al. [30] finally present a promising learning-based approach. However, it depends on vast amounts of training data and provides no guarantees of optimality or generality. Our work makes the following contributions: We perform contrast maximization in 3D. Using a time-continuous trajectory model, the 3D location of the landmarks corresponding to events is modelled by time-continuous ray warping in space, and the optimal motion parameters are found by maximizing contrast within a volumetric ray density field, denoted by Volume of Warped Events (VWE). Our method is the first to perform joint optimization over motion and structure for event cameras exerting both translational and rotational planar displacements in an arbitrarily structured environment. We successfully apply our framework to Autonomous Ground Vehicle (AGV) motion estimation with a forward-facing event camera. We prove that by using only an event camera, we can provide good quality, continuous visual localization, and mapping results able to compete with regular camera alternatives, especially as visual conditions degrade.

2. Contrast Maximization

We are given a set of N events happening over a certain time interval, where each event is defined by its image location , timestamp , and polarity . Note that the set is ordered, meaning that if , then . We furthermore assume that image warping during the entire time interval can be parametrized as a continuous-time function of a certain parameter vector , and define the warping function that warps an event with location and timestamp into a reference view at . Gallego et al. recently proposed a unifying framework for solving motion estimation problems with event cameras [5]. If the motion is estimated correctly, events that are triggered by the same point will be accumulated by the same pixel in the reference view, and the resulting Image of Warped Events (IWE) will, therefore, become a sharp edge map. The question is how the accumulation is done and how the sharpness of the IWE is characterized. Gallego et al. propose to optimize the alignment of the events by maximizing the contrast in the IWE. Formally, the IWE at point is defined by , and it is evaluated discretely for each pixel center location. While the application of a Gaussian kernel makes sure that events that are closer to a certain pixel contribute more than events that are further away, it also makes sure that the IWE and its contrast remain smooth functions of the motion parameters and thus optimizable through gradient-based methods. According to [6,7], the contrast or sharpness of the IWE may finally be evaluated using one of several possible focus loss functions. Here we use the perhaps most common one, given by the IWE variance . is the mean value of I, the number of pixels in I, and i and j are indices that loop through all the rows and columns of the IWE. As shown in the heat maps of [6], the highest variance of the IWE gives the highest contrast location, and thus the optimal motion parameters cause the best alignment of the warped events. The framework allows us to tackle several important motion estimation problems for event-based vision, such as optical flow estimation, motion segmentation, or pure rotational motion estimation. However, note that for an arbitrary point to be warped into the reference view, the warping must either be homographic or the parameter vector must contain the depth for each event at the time it was captured. Both are rather restrictive towards general camera motion estimation in arbitrary environments. Current state-of-art contrast maximization methods can, therefore, only handle a particular set of problems such as motion in front of a plane or pure rotation.

3. Volumetric Contrast Maximization Using Ray Warping

Let us now proceed to our main contribution, which consists of extending the idea of contrast maximization into 3D, a technique that will enable us to handle situations in which we perceive non-planar environments under arbitrary motion and with no priors on the depth of events. Our main idea is illustrated in Figure 1. We introduce a continuous-time camera trajectory model as done in Furgale et al. [31], which parametrizes both the position and the orientation of the sensor as a smooth, continuous function of time. For a given event, we may then use its timestamp to extrapolate the position and orientation of the event camera at the time the event was captured. Combined with the normalized spatial direction of the event inside the camera frame, each event can be translated into a spatial ray for which the starting point and orientation depend on the continuous trajectory parameters. Rather than evaluating the density of points for pixels in the image, we then propose to evaluate the density of rays at discrete locations in a volume in front of a reference view. We denote this volumetric density field the Volume of Warped Events (VWE). The intuition is analogous to the IWE: the assumption is that there is a limited number of spatial (appearance or geometric) edges that will cause sufficiently large gradients in the image. Under the optimal motion parameters, the rays of the events will therefore intersect along those spatial edges and cause maximum ray density in those regions. In other words, the optimal motion parameters may be found by maximizing the contrast in the VWE. The important question is again given by how to express the ray density in the VWE.

Figure 1

Volume of Warped Events: Events are transformed into rays that are warped in space based on a continuous time trajectory model. We evaluate the ray density in a volume in front of a reference view and maximize its contrast as a function of the continuous motion parameters.

The structure of the VWE field is inspired by the space-sweeping approach of [17] et al., who propose to estimate 3D structure regardless of explicit data associations and photometric information by finding local maxima in a spatial ray density field. However, their method assumes known camera poses, using an alternative camera tracking scheme in their previous work [29]. To the best of our knowledge, we are the first to propose the maximization of the contrast in the volumetric ray density field and thus implicitly perform joint optimization over the continuous camera trajectory parameters and the 3D structure.

3.1. Continuous Ray Warping

Suppose our event camera is pre-calibrated and camera-to-image as well as image-to-camera transformation functions and are given. The latter transforms image locations into spatial directions in the camera frame by . In terms of the extrinsic, the trajectory of the camera is kept general for now and simply represented by a minimal, time-continuous, smoothly varying 6-vector , where still represents a set of continuous motion parameters, the position of the camera expressed in a world frame, and its orientation as a Rodriguez vector. Note that the dimensionality of is left unspecified for now. However, as will be shown in Section 4, it may indeed have only one or two parameters for certain special types of planar displacements. Besides its inherently smooth property, the continuous-time trajectory model has the obvious ability to be able to register information coming from temporally dense sampling sensors, such as event cameras. The transformation from camera to world at time t is given by . With reference to Figure 1, represents the camera frame at time where a certain event has been captured. The absolute pose of the frame at the time of capturing is given by . Now let be the reference frame in which we define the projective sampling volume for the VWE. The absolute pose of is given by . The relative transformation is finally given as Finally, let represent the unknown depth along the ray. Any point on the ray seen from the reference view can be parametrized by .

3.2. VWE and Spatial Contrast Maximization

We are now going back to our question of how to express the ray density in the VWE. The VWE is defined in a volumetric, projective sampling grid, as illustrated in Figure 2. Let be the center of a voxel. The density of the rays in a voxel is now expressed as a function of the orthogonal distance between the voxel center (expressed in the reference view) and each individual ray. This spatial point-to-line distance is also called the object space distance, and it is given by where we have used the rotation and translation to transform the voxel centre into the camera viewpoint at time , and is the householder matrix to project this point onto the normal plane of the observation direction . An example of object space distances for one voxel is indicated in Figure 2.

Figure 2

Warped rays with object space distances for an example voxel .

Supposing that we have N events, the final VWE is again given in smooth form by applying a Gaussian kernel and summing up the object space distances of every event with respect to the voxel center The standard deviation of the Gaussian kernels is not constant but chosen as a function of the depth of the corresponding voxel from the center of the reference view. The final optimization objective is given by maximizing the variance of the VWE, which expresses the sharpness of the edges reflected in the volumetric density field is the mean value of , the number of voxels in the entire volume, and m, n, and l now iterate through the voxels in the volume. Figure 3 visualizes an example VWE for wrong and correct motion parameters. For correct motion parameters (cf. Figure 3a,b), which is the exact rotation and translation displacement obtained from groundtruth, the density field presents higher values and more contrast than for wrong motion parameters (cf. Figure 3c,d), which we add a large perturbation on both rotation and translation displacement.

Figure 3

Volumetric ray density fields for correct (a,b) and wrong (c,d) motion parameters.

3.3. Global Optimization over Longer Trajectories

We perform global optimization by simultaneously maximizing the contrast in multiple VWEs cast from neighboring reference views. Let be the time instants at which individual VWEs are placed. For simplicity, the time instants are regularly spaced such that . We furthermore define time intervals for each corresponding field , which define the subset of events that will be used for registration in that reference view. More specifically, event is used in if . The overall global optimization objective becomes and is defined as the subset of all the events for which . The global optimization strategy is depicted in Figure 4. Note that may be chosen such that neighboring volumes are overlapping, and may be chosen such that events are considered in more than just a single volumetric density field (i.e., ). These choices guarantee that the implicit graph behind this optimization problem is well connected and effects such as scale propagation take place.

Figure 4

Global optimization over multiple reference volumes. The volumes may have spatial overlap. There is an individual time span associated with each reference volume from which events will be considered (marked by the red, blue and green arrows). The time spans may have temporal overlap. Two events may hence both appear in two distinct density fields, which reinforces scale propagation in the optimization.

4. Application to AGV with a Forward-Facing Event Camera

We evaluate our method on a planar Autonomous Ground Vehicle (AGV) on which we mount a single forward-facing event camera. Many solutions for regular, monocular cameras exist, such as simple relative pose solvers [32] or full visual SLAM frameworks [33]. The application of an event camera promises strong advantages in situations of high motion dynamics or—as shown in this work—challenging illumination conditions. Our motion estimation framework is divided into two sub-parts, a front-end module that initializes motion over shorter segments and a back-end module that refines the estimate over larger-scale sequences. Both will be introduced after a short overview of the framework.

4.1. Framework Overview

The complete Visual Odometry (VO) system is designed based on the above VWE method. There are two main modules in the pipeline. The front-end initialization module groups the events into sufficiently small subsets such that the motion on these subsets can be locally approximated using a simplified first-order constant velocity model. Furthermore, the front-end performs contrast maximization using a single VWE only. After a sufficient number of events and initial relative displacements have been accumulated, our method then proceeds to the back-end optimization part. The latter initializes a larger-scale, smooth, continuous-time trajectory model and executes the multi-volume optimization outlined in (5).

4.2. Front-End Single-Frame Optimization

For the local approximation of the motion, we use a parametrization that is inspired by [34,35]. Based on the assumptions of a driftless, non-holonomic platform and locally constant velocities, the continuous motion of the planar vehicle may be approximated to lie on an arc of a circle to which the heading of the vehicle remains tangential. This motion model is also known as the Ackermann motion model, and the center of the circle is commonly referred to as the Instantaneous Centre of Rotation (ICR). The model has only two degrees of freedom, which largely simplifies the geometry of the problem. It is given by the forward velocity v and the rotational velocity . Using the convention and equations from [35], the relative transformation from a frame at time to a nearby reference frame at time is given by Given that scale is unobservable, we fix the forward velocity v to the configured speed of the vehicle (correct scale propagation is taken into account in the later global optimization scheme). As a result, the local motion initialization scheme over a single volume has only 1-DoF, and the parameter vector becomes . Note furthermore that the original Ackermann model requires the camera to be mounted in the center of the non-steering axis, which—in practice—hardly ever is the case. We, therefore, add the extrinsic calibration parameters and , which transform points back-and-forth between the camera and the vehicle reference frames. The reader is invited to read up [35] to see more foundations of the Ackermann motion model. Note that the variance of the VWE is a function of our unique degree of freedom , and the motion parameters can thus be efficiently solved by local gradient-based optimization methods once a rough initial guess is given.

4.3. Back-End Multi-Frame Optimization

The front-end obviously estimates the motion over short time periods only and furthermore relies on the approximation of locally constant velocities and a circular arc trajectory. We add a global back-end optimization over the entire trajectory, which relies on a more general model representing smooth planar motion. We use a two-dimensional, p-th degree B-spline curve where the stand for the two-dimensional control points that control the shape of the smooth, planar trajectory, and the are the known pth-degree B-spline basis functions. The reader is invited to read up [31,36] to see the foundations of B-splines and example applications. Here we only focus on establishing the link to our smooth camera pose functions used in the optimization objective (5). The parameter vector may be defined as the stacked control points of the spline expressed by . The spline directly models the position in the plane, so we easily obtain . For planar motion, the orientation is given by a pure rotation about the vertical axis, and we furthermore exploit the fact that for driftless non-holonomic vehicles, the heading of the vehicle remains tangential to the trajectory. If the heading of the vehicle is still defined as the y axis, and the z axis points vertically upwards, the orientation of the vehicle is finally given as Note that only the temporal basis functions depend on time and that therefore also is a spline-based function of the same control points. The control point vector is initialized from the approximate trajectory given by the front end using the spline curve approximation given by the automatic knots spacing algorithms (9.68) and (9.69) of [36].

5. Implementation and Validation

In this section, we briefly introduce implementation details of our method and then test our algorithm on multiple both synthetic and real datasets. We assess both the accuracy and quality of the estimated trajectories, as well as the quality of the implicitly modeled 3D structure.

5.1. Implementation Details

We utilize the event back-projection approach proposed in [17] to find the neighboring voxels of a spatial ray efficiently. The details of this algorithm can be found in Section 7.1 of [17]. We furthermore use a simple gradient-ascent scheme to solve our volumetric contrast maximization problems. Especially in (6), the fixation of the forward velocity v leaves the angular velocity as the only unknown parameter, thus making the front-end constraint a univariate problem. Finally, to recover the implicitly modeled 3D structure of the environment, we simply reuse the Event-based Multi-View Stereo (EMVS) method from [17].

5.2. Experiment Setup

To demonstrate the performance of our algorithm, we apply it to both synthetic and real datasets. In the synthetic case, we use large-scale outdoor sequences from the KITTI benchmark [37] and convert the image sequences into event data by using the method of Gehrig et al. [38]. The datasets are fully calibrated and contain images captured by a forward-looking camera mounted on a vehicle driving through a city. Experiments on real data are conducted by collecting several small-scale indoor sequences with a DAVIS346 event camera. The camera is mounted forward-facing on a turtlebot (cf. illustrated in Figure 5). It has a resolution of 346 × 260, and captures RGB images in parallel to the events.

Figure 5

AGV equipped with a forward-facing event camera for vehicle motion estimation.

We compare our approach against traditional camera alternatives. Our current implementation focuses on non-holonomic planar motion, which is why we use the 1-point RANSAC algorithm for Ackermann motion [34] as a solid baseline algorithm for the regular camera alternative. We also let our method compete against an established alternative from the open-source community: ORB-SLAM [33]. Note that we rescale all monocular, scale-invariant results to align as well as possible with groundtruth, which we obtain from the original KITTI datasets or an Opti-track system. It should be noted that a direct comparison against alternative event-based VO/SLAM projects is difficult for several reasons. To date, there are no open-source implementations and we are the first to even evaluate a monocular, event-based pipeline on a popular, established benchmark sequence. Furthermore, as stated in Section III. D of [29] and Section 3.5 of [28], the few existing alternatives either depend strongly on the quality of an initial 3D map (cf. [29]) or suffer from slowly converging depth estimates (cf. [28]). As shown in their experiments, they, therefore, require hovering motion in front of the same scene to provide sufficient time for the mapping back-end to converge. In contrast, our method performs joint optimization of trajectory and structure in near real-time, and thus successfully handles the continuous forward-exploration scenario.

5.3. Experiment on Synthetic Data

To prove the effectiveness of our method—which denote ETAM—, we apply it to synthetic sequences generated from the KITTI benchmark datasets [37]. These datasets represent a fairly normal use case without high motion dynamics or challenging illumination. We use the publicly available tool proposed [38] to convert the regular videos into event streams. We compare our method against two alternatives, which are the state-of-the-art ORB-SLAM algorithm [33] and the classical 1-point Ransac algorithm—denoted 1pt—for planar motion [34]. The evaluation is performed on sequences VO-00 and VO-07. The qualitative performance is illustrated in Figure 6a. All algorithms successfully process the sequences without any gross errors, and our system is slightly less accurate than ORB-SLAM on these high-quality datasets. We furthermore believe that the decrease in performance is mostly explained by the approximate motion model, which ignores the slight pitch angle variations that could result from unevenness of the ground surface. Furthermore, we perform on par with 1pt, which also relies on a non-holonomic planar motion model. To the best of our knowledge, this result is the first to demonstrate a monocular event camera solution that returns comparable results to regular camera alternatives.

Figure 6

Results for both our method and 1pt-RANSAC on long outdoor trajectories (top) and indoor sequences (bottom). The indoor sequences are captured under normal (left) or challenging illumination conditions (right). (a) synthetically generated outdoor sequences. (b) real data indoor sequences.

5.4. Experiment on Real Data

To demonstrate the performance of our algorithm on real data, we collect further sequences with a DAVIS346 event camera. The datasets are captured indoors to simulate different illumination conditions and capture groundtruth via Opti-track. We first apply them to two shorter sequences in which the camera follows an either circular (Circle) or purely translational trajectory (Str). Next, we perform a test on a much longer sequence with a more complex motion (Long2). While the first three sequences are recorded under good illumination conditions, we conclude with another sequence with varying lighting conditions by toggling external illumination while the dataset is recorded (HDR). ORB-SLAM proves to be fragile when applied to our indoor sequences. The images have low resolution and the proximity of the structure as well as fast vehicle rotations furthermore induce large frame-to-frame disparities, ultimately causing ORB-SLAM to break in such forward-exploration scenarios. We therefore assess the performance by a quantitative comparison of relative pose errors between ETAM and 1pt. Results for all sequences are summarized in Table 1. It shows the root-mean-square or median of all deviations between estimated and groundtruth short-term relative rotation and translation displacements. Note that—to minimize the impact of unobservable scale—the error of the relative translation is evaluated by considering only the direction. Furthermore, errors are assessed per time, as it is clear that larger intervals may lead to more drift. We, therefore, employ the unit for both rotational and translational errors. The best performance is always highlighted in bold.

Table 1

Accuracy on different sequences. Unit: .

method	Circular motion
method	Rmse(R)	Median(R)	Rmse(t)	Median(t)
1pt	2.4526	2.3330	0.5427	0.0296
ETAM	1.3275	0.6443	0.2826	0.0322
method	Purely translational motion
method	Rmse(R)	Median(R)	Rmse(t)	Median(t)
1pt	0.6997	0.5300	0.5179	0.0369
ETAM	0.9769	0.6334	0.4637	0.0235
method	Long trajectory
method	Rmse(R)	Median(R)	Rmse(t)	Median(t)
1pt	1.8516	1.5829	0.1675	0.1718
ETAM	1.6901	1.3417	0.1631	0.1703
method	Challenging illumination conditions
method	Rmse(R)	Median(R)	Rmse(t)	Median(t)
1pt	-	-	-	-
ETAM	1.6042	0.9093	0.0686	0.0084

It can be easily observed that ETAM outperforms 1pt on most datasets, and it is able to continuously track entire sequences with high accuracy even as illumination conditions become more challenging. In contrast, regular camera-based visual odometry with 1-point RANSAC fails due to poor contrast or motion blur in dark or varying illumination settings (cf. Figure 7). Due to the forward-facing arrangement, the purely translational displacement, on the other hand, triggers much fewer events than trajectories with rotational displacements, hence the slightly inferior performance for this type of motion. Figure 6b visualizes top views of complete trajectories for both algorithms and groundtruth (denoted gt). The left figure is from the sequence Long2, and the right one is from the sequence HDR. Our event-based method can work robustly in all challenging conditions. We kindly refer the reader to our supplemental video file for further qualitative results of our method.

Figure 7

Challenging illumination conditions. Regular frames suffer from poor contrast (left) when lights are off or motion blur (right) when lights are on, which is caused by inappropriate exposure time under varying illumination conditions. Events, in turn, preserve the visual information of the structure.

5.5. Computational Efficiency

All experiments are conducted on an Intel Core i7 2.4 GHz CPU. The total cumulative processing time for each sequence is summarized in Table 2. It remains below the actual length of each dataset, thus indicating real-time capability.

Table 2

Processing time in seconds (s) for the proposed method.

	Circle	Str	Long1	Long2	HDR
Dataset length	14.0 s	10.4 s	43.3 s	40.4 s	17.9 s
Processing time	8.6 s	8.5 s	35.6 s	24.9 s	13.8 s

5.6. Reconstruction Result

Figure 8 and Figure 9 finally visualize reconstruction results of the indoor scene (cf. video in Supplementary Material).Figure 8b,c show a side perspective and a birds-eye view of the final result. The colored semi-dense points represent the reconstructed structure, while the sparse white points in the center denote the discretized trajectory. As can be clearly observed, our method produces a visually reasonable reconstruction similar to what one would obtain using a sparse or semi-dense method on regular images. Further visualisations of re-projected point clouds overlayed onto real images are visualized in Figure 9a–f. The depth of points is indicated by the color, which reaches from red for closer points to blue for far-away points. Note that we clean up isolated noisy points by applying a radius filter. However, no additional depth fusion strategy is applied.

Figure 8

Reconstruction of an indoor scene. (a) shows a real image of the environment. (b,c) are different perspectives onto the reconstructed structure.

Figure 9

(a–f) are back-projections of the marked structure parts shown in Figure 8 overlaid onto the corresponding images captured under those poses. Warmer colors indicate closer points, while colder colors indicate larger depth.

6. Conclusions

Our main novelty consists of a single, joint objective that optimizes smooth motion directly from events without the need for a prior derivation of 3D structure. This is achieved by constructing a volumetric ray density field, in which we then maximize contrast as a function of smooth motion parameters. As a result, the approach can bootstrap spatial motion in arbitrarily structured environments. The formulation is tested on the important application of ground vehicle motion estimation, and potential advantages under high dynamic motion or challenging illumination conditions are verified. While this is a highly promising result, our next step consists of extending the operation to more dynamic, full 3D motion, which we believe is possible if using the additional input of an IMU.

4 in total

1. Event-Based, 6-DOF Camera Tracking from Photometric Depth Maps.

Authors: Guillermo Gallego; Jon E A Lund; Elias Mueggler; Henri Rebecq; Tobi Delbruck; Davide Scaramuzza
Journal: IEEE Trans Pattern Anal Mach Intell Date: 2017-11-03 Impact factor: 6.226

2. Event-based visual flow.

Authors: Ryad Benosman; Charles Clercq; Xavier Lagorce; Sio-Hoi Ieng; Chiara Bartolozzi
Journal: IEEE Trans Neural Netw Learn Syst Date: 2014-02 Impact factor: 10.451

Review 3. Event-Based Vision: A Survey.

Authors: Guillermo Gallego; Tobi Delbruck; Garrick Orchard; Chiara Bartolozzi; Brian Taba; Andrea Censi; Stefan Leutenegger; Andrew J Davison; Jorg Conradt; Kostas Daniilidis; Davide Scaramuzza
Journal: IEEE Trans Pattern Anal Mach Intell Date: 2021-12-07 Impact factor: 6.226

4 in total