Xiaochen Qiu, Hai Zhang, Wenxing Fu, Chenxu Zhao, Yanqiong Jin.
Abstract
The research field of visual-inertial odometry has entered a mature stage in recent years. However, non-negligible problems remain. Users still have to trade off high accuracy against low computational cost. In addition, notation confusion exists in quaternion descriptions of rotation; although not fatal, it may cause unnecessary difficulty for researchers. In this paper, we develop a visual-inertial odometry that balances precision and computation. The proposed algorithm is a filter-based solution built on the framework of the well-known multi-state constraint Kalman filter. To dispel the notation confusion, we derive the error-state transition equation from scratch, using the more intuitive Hamilton notation for quaternions. We further derive a fully linear closed-form formulation that is readily implemented. Because the filter-based back-end is vulnerable to feature-matching outliers, a descriptor-assisted optical flow tracking front-end was developed to cope with the issue. This modification requires only negligible additional computation. In addition, an initialization procedure is implemented that automatically selects static data to initialize the filter state. The proposed methods were evaluated on a public, real-world dataset and compared with state-of-the-art solutions. The experimental results show that the proposed solution is comparable in precision to the state of the art and more computationally efficient.
Keywords: closed-form state transition equation; computation saving; quaternion notation; real-time motion tracking; robust feature tracking; visual inertial odometry
Year: 2019 PMID: 31027218 PMCID: PMC6515200 DOI: 10.3390/s19081941
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Rotation of frame A into frame B.
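As a concrete illustration of the Hamilton convention mentioned in the abstract, the following is a minimal sketch that assumes quaternions stored as `[w, x, y, z]`, the multiplication rule `ij = k`, and an active rotation of a vector via `q ⊗ v ⊗ q*`. The function names are hypothetical, and how this maps onto frames A and B depends on the paper's own definitions, which are not reproduced here.

```python
import numpy as np

def quat_mul(p, q):
    """Hamilton product p ⊗ q for quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])

def quat_conj(q):
    """Conjugate of a quaternion; equals the inverse for unit quaternions."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_from_axis_angle(axis, angle):
    """Unit quaternion for a right-handed rotation of `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis))

def rotate(q, v):
    """Actively rotate the 3-vector v by the unit quaternion q: q ⊗ [0, v] ⊗ q*."""
    vq = np.concatenate(([0.0], np.asarray(v, dtype=float)))
    return quat_mul(quat_mul(q, vq), quat_conj(q))[1:]

# A 90 degree rotation about z carries the x axis onto the y axis.
q_z90 = quat_from_axis_angle([0.0, 0.0, 1.0], np.pi / 2.0)
x_rotated = rotate(q_z90, [1.0, 0.0, 0.0])
```

Under the other common convention (often called JPL), the product of `i` and `j` has the opposite sign, which is one source of the notation confusion the abstract refers to.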
Figure 2. Statistical distribution of ORB descriptor [32] distances for coarsely matched, strictly matched, randomly constructed, and unmatched Shi-Tomasi feature pairs. The X axis represents descriptor distance and ranges from 0 to 255. The range of the Y axis is determined by the number of feature pairs in each experiment. (a) Coarsely matched features. (b) Randomly constructed features. (c) Strictly matched features. (d) Unmatched features.
Figure 3. Descriptor distances of unmatched and matched feature pairs. The difference is statistically significant, so a heuristic algorithm can be used to pick out outliers.
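The idea behind such a heuristic can be sketched as follows: compute the Hamming distance between the binary descriptors at the two ends of each optical-flow track and discard tracks whose distance is too large. This is only an illustrative sketch, not the paper's procedure; the function names and the threshold `max_distance=60` are assumptions, and descriptors are assumed to be 32-byte `uint8` arrays as produced by typical ORB implementations.

```python
import numpy as np

def hamming_distance(d1, d2):
    """Number of differing bits between two binary descriptors (uint8 arrays)."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

def filter_tracks(desc_prev, desc_curr, max_distance=60):
    """Flag tracked feature pairs whose descriptors are similar enough to keep.

    desc_prev, desc_curr: sequences of uint8 descriptors, one per tracked pair.
    max_distance: hypothetical threshold; in practice it would be tuned from
    distance statistics such as those summarized in the figures above.
    Returns (keep_mask, distances).
    """
    distances = np.array(
        [hamming_distance(a, b) for a, b in zip(desc_prev, desc_curr)]
    )
    return distances <= max_distance, distances

# Toy example: one pair differing in 3 bits, one pair differing in every bit.
base = np.zeros(32, dtype=np.uint8)
near = base.copy()
near[0] = 0b00000111
far = np.full(32, 255, dtype=np.uint8)
keep, dists = filter_tracks([base, base], [near, far])
```

Because only one descriptor comparison per already-tracked pair is needed, a check of this kind stays cheap compared with full descriptor matching.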
Figure 4. Flow chart of extended Kalman filter (EKF)-based visual-inertial odometry (VIO) implementation. Red sections highlight novelties proposed in this paper. The term “IMU” stands for inertial measurement unit, and the term “MSCKF” stands for multi-state constraint Kalman filter.
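The abstract mentions an initialization procedure that automatically selects static data to initialize the filter state. The paper's exact criterion is not given in this record, so the sketch below uses a common stand-in: slide a window over the IMU samples, accept the first window whose gyroscope and accelerometer standard deviations fall below thresholds, and average that window. All names, the window length, and the thresholds are hypothetical.

```python
import numpy as np

def find_static_window(gyro, accel, window=200,
                       gyro_std_max=0.01, accel_std_max=0.05):
    """Return the start index of the first window that looks stationary, or None.

    gyro, accel: (N, 3) arrays of angular rate (rad/s) and specific force (m/s^2).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    for start in range(len(gyro) - window + 1):
        g = gyro[start:start + window]
        a = accel[start:start + window]
        if g.std(axis=0).max() < gyro_std_max and a.std(axis=0).max() < accel_std_max:
            return start
    return None

def initial_estimates(gyro, accel, start, window=200):
    """Average a static window to get a gyro bias and a mean specific force.

    While the sensor is at rest, the mean gyro reading approximates the gyro
    bias, and the mean accelerometer reading points opposite to gravity in the
    sensor frame, which constrains initial roll and pitch.
    """
    gyro_bias = gyro[start:start + window].mean(axis=0)
    mean_specific_force = accel[start:start + window].mean(axis=0)
    return gyro_bias, mean_specific_force

# Synthetic data: 300 samples of motion followed by 400 samples at rest.
rng = np.random.default_rng(0)
true_bias = np.array([0.01, -0.02, 0.005])
gravity_reaction = np.array([0.0, 0.0, 9.81])
gyro = np.vstack([rng.normal(0.0, 1.0, (300, 3)),
                  true_bias + rng.normal(0.0, 1e-3, (400, 3))])
accel = np.vstack([gravity_reaction + rng.normal(0.0, 2.0, (300, 3)),
                   gravity_reaction + rng.normal(0.0, 1e-2, (400, 3))])
start = find_static_window(gyro, accel)
gyro_bias, mean_force = initial_estimates(gyro, accel, start)
```

A variance test like this avoids asking the user to hold the device still at a fixed time, at the cost of two thresholds that depend on the IMU's noise level.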
Figure 5. Results for the 4 EuRoC sequences classified as “difficult”. Estimated trajectories are aligned with the ground truth by a 6-DOF transformation (without scale). (a) MH_04_difficult. (b) MH_05_difficult. (c) V1_03_difficult. (d) V2_03_difficult.
Figure 6. Boxplot summary of experimental results in terms of translation root-mean-square errors (RMSEs) of estimated trajectories. With ORB descriptor assistance the estimation is generally more precise, as reflected in the lower position and narrower height of the corresponding box for most sequences.
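A translation RMSE of this kind is usually computed after rigidly aligning the estimated trajectory to the ground truth, as the Figure 5 caption describes (6-DOF, no scale). The following is a minimal sketch of that computation using the Kabsch (SVD-based) alignment; it assumes the two trajectories are already time-associated `(N, 3)` position arrays, and the function names are hypothetical rather than taken from the paper's evaluation code.

```python
import numpy as np

def align_rigid(est, gt):
    """Find rotation R and translation t minimizing sum ||R @ est_i + t - gt_i||^2."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_g - R @ mu_e
    return R, t

def translation_rmse(est, gt):
    """RMSE of position error after 6-DOF (rotation + translation) alignment."""
    R, t = align_rigid(est, gt)
    aligned = est @ R.T + t
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))

# Toy check: a rigidly transformed copy of the ground truth has near-zero RMSE.
rng = np.random.default_rng(0)
gt = rng.normal(size=(100, 3))
angle = 0.3
R0 = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0,            0.0,           1.0]])
est = gt @ R0.T + np.array([1.0, -2.0, 0.5])
rmse_clean = translation_rmse(est, gt)
rmse_noisy = translation_rmse(est + rng.normal(scale=0.05, size=est.shape), gt)
```

Omitting the scale factor is appropriate when the estimator observes metric scale, as a visual-inertial system does; a monocular vision-only estimate would need a 7-DOF similarity alignment instead.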
Mean and standard deviation of RMSEs in Figure 6. For each sequence, the one with an obviously better performance is highlighted.
| Method | MH_01 mean | MH_01 std | MH_02 mean | MH_02 std | MH_03 mean | MH_03 std | MH_04 mean | MH_04 std | MH_05 mean | MH_05 std | V1_01 mean | V1_01 std | V1_02 mean | V1_02 std | V1_03 mean | V1_03 std | V2_01 mean | V2_01 std | V2_02 mean | V2_02 std | V2_03 mean | V2_03 std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pure optical flow | 0.309 | 0.076 | 0.297 | 0.065 | 0.381 | 0.050 | 0.435 | 0.071 | 0.393 | 0.051 | 0.108 | 0.026 | 0.082 | 0.012 | 0.130 | 0.018 | 0.162 | 0.057 | 0.137 | 0.019 | 0.248 | 0.047 |
| ORB assisted | … | … | … | … | … | … | … | … | 0.391 | 0.046 | … | … | 0.082 | 0.010 | 0.131 | 0.017 | … | … | 0.134 | 0.019 | … | … |
Mean of the processing time (ms) of the proposed ORB descriptor-assisted outlier elimination procedure for every image.
| Sequence | MH_01 | MH_02 | MH_03 | MH_04 | MH_05 | V1_01 | V1_02 | V1_03 | V2_01 | V2_02 | V2_03 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Processing time (ms) | 1.3942 | 1.6480 | 1.3373 | 1.3983 | 1.0870 | 1.3297 | 1.0410 | 0.9574 | 1.2506 | 1.0525 | 0.7465 |
Comparison of the proposed algorithm and MSCKF-MONO on the EuRoC dataset. Values are the mean positioning RMSEs (m) over 10 runs of each algorithm.
| Algorithm | MH_01 | MH_02 | MH_03 | MH_04 | MH_05 | V1_01 | V1_02 | V1_03 | V2_01 | V2_02 | V2_03 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MSCKF-MONO | 1.015 | 0.534 | 0.427 | 2.102 | 0.968 | 0.169 | 0.275 | 1.551 | 0.281 | 0.341 | × |
| Proposed | 0.299 | 0.280 | 0.342 | 0.350 | 0.384 | 0.096 | 0.078 | 0.132 | 0.121 | 0.137 | 0.224 |
Results of the proposed and state-of-the-art VIOs on the EuRoC dataset. Each algorithm was run ten times on each sequence; values are the mean positioning RMSEs (m).
| Algorithm | MH_01 | MH_02 | MH_03 | MH_04 | MH_05 | V1_01 | V1_02 | V1_03 | V2_01 | V2_02 | V2_03 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| VINS-MONO | … | … | … | … | … | 0.090 | 0.110 | 0.188 | … | 0.163 | 0.305 |
| ROVIO | 0.250 | 0.653 | 0.449 | 1.007 | 1.448 | 0.159 | 0.198 | 0.172 | 0.299 | 0.642 | … |
| OKVIS | 0.376 | 0.378 | 0.277 | 0.323 | 0.451 | … | 0.157 | 0.224 | 0.132 | 0.185 | 0.305 |
| Proposed | 0.289 | 0.258 | 0.331 | 0.394 | 0.423 | 0.117 | … | … | 0.097 | … | 0.211 |
Average processing time (ms) and rate (Hz) of visual front-end and EKF/optimization back-end of our implementation and the state-of-the-art using the EuRoC dataset.
| Method | Stage | MH_01 time (ms) | MH_01 rate (Hz) | MH_02 time (ms) | MH_02 rate (Hz) | MH_03 time (ms) | MH_03 rate (Hz) | MH_04 time (ms) | MH_04 rate (Hz) | MH_05 time (ms) | MH_05 rate (Hz) | V1_01 time (ms) | V1_01 rate (Hz) | V1_02 time (ms) | V1_02 rate (Hz) | V1_03 time (ms) | V1_03 rate (Hz) | V2_01 time (ms) | V2_01 rate (Hz) | V2_02 time (ms) | V2_02 rate (Hz) | V2_03 time (ms) | V2_03 rate (Hz) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VINS-MONO | front-end | 18.0 | 55 | 18.3 | 55 | 18.6 | 54 | 19.3 | 52 | 21.3 | 47 | 20.2 | 49 | 21.4 | 47 | 23.2 | 43 | 22.3 | 45 | 23.8 | 42 | 30.6 | 33 |
| VINS-MONO | back-end | 50.2 | 20 | 50.9 | 20 | 50.1 | 20 | 50.1 | 20 | 53.0 | 19 | 53.1 | 19 | 45.9 | 22 | 37.9 | 26 | 54.4 | 18 | 48.3 | 21 | 33.4 | 30 |
| ROVIO | front-end | 2.0 | 505 | 1.9 | 526 | 2.0 | 497 | 2.1 | 476 | 2.0 | 490 | 1.9 | 538 | 2.0 | 508 | 2.1 | 481 | 2.0 | 503 | 2.0 | 510 | 2.0 | 478 |
| ROVIO | back-end | 15.9 | 63 | 15.9 | 63 | 15.9 | 63 | 15.9 | 63 | 15.7 | 63 | 15.9 | 63 | 15.9 | 63 | 15.9 | 63 | 15.9 | 63 | 15.9 | 63 | 15.9 | 63 |
| OKVIS | front-end | 46.7 | 21 | 45.3 | 22 | 47.4 | 21 | 40.9 | 24 | 41.4 | 24 | 38.5 | 26 | 38.8 | 26 | 31.3 | 32 | 38.8 | 26 | 37.3 | 27 | 31.4 | 32 |
| OKVIS | back-end | 39.8 | 25 | 39.4 | 25 | 39.9 | 25 | 32.1 | 31 | 33.1 | 30 | 30.6 | 33 | 25.5 | 39 | 19.2 | 52 | 29.6 | 34 | 27.9 | 36 | 18.0 | 56 |
| Proposed | front-end | 16.2 | 62 | 16.5 | 61 | 15.9 | 63 | 16.1 | 62 | 15.7 | 64 | 15.7 | 64 | 15.3 | 65 | 16.4 | 61 | 15.8 | 63 | 15.9 | 63 | 17.3 | 58 |
| Proposed | back-end | 5.5 | 182 | 5.9 | 169 | 6.1 | 164 | 5.5 | 181 | 6.0 | 166 | 5.7 | 174 | 5.4 | 185 | 4.9 | 203 | 5.7 | 176 | 5.6 | 178 | 4.6 | 218 |