Zhaofeng Niu, Yuichiro Fujimoto, Masayuki Kanbara, Taishi Sawabe, Hirokazu Kato.
Abstract
Truncated signed distance function (TSDF) fusion is one of the key operations in the 3D reconstruction process. However, existing TSDF fusion methods usually suffer from inevitable sensor noise. In this paper, we propose a new TSDF fusion network, named DFusion, to minimize the influence of the two most common sensor noises, i.e., depth noise and pose noise. To the best of our knowledge, this is the first depth fusion method that resolves both depth noise and pose noise. DFusion consists of a fusion module, which fuses depth maps together and generates a TSDF volume, followed by a denoising module, which takes the TSDF volume as input and removes both depth noise and pose noise. To exploit the 3D structural information of the TSDF volume, 3D convolutional layers are used in the encoder and decoder of the denoising module. In addition, a specially designed loss function is adopted to improve the fusion performance in object and surface regions. Experiments are conducted on a synthetic dataset as well as a real-scene dataset, and the results show that our method outperforms existing methods.
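For context, the classic TSDF fusion that DFusion is compared against (the "TSDF Fusion" baseline in the result tables) is a per-voxel weighted running average of truncated signed distances, in the style of Curless and Levoy / KinectFusion. A minimal sketch of that baseline update, not of the paper's learned fusion module (function and parameter names are illustrative):

```python
import numpy as np

def tsdf_update(tsdf, weight, new_sdf, new_weight, trunc=0.05):
    """Classic weighted running-average TSDF fusion step.

    tsdf, weight : current TSDF volume and per-voxel accumulated weights
    new_sdf      : signed distances computed from the latest depth map
    new_weight   : per-voxel observation weights (0 where unobserved)
    trunc        : truncation distance (meters), an assumed default
    """
    d = np.clip(new_sdf, -trunc, trunc)        # truncate the signed distances
    w = weight + new_weight                    # accumulate observation weights
    # Weighted average of old and new values; keep old value where w == 0.
    fused = np.where(
        w > 0,
        (tsdf * weight + d * new_weight) / np.maximum(w, 1e-9),
        tsdf,
    )
    return fused, w
```

Each incoming depth map only shifts a voxel's value in proportion to its weight, which is exactly why uncorrelated depth noise is partially averaged out by this baseline, while pose noise (which misaligns whole depth maps) is not.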
Keywords: TSDF; depth fusion; sensor noises
Year: 2022 PMID: 35214532 PMCID: PMC8879644 DOI: 10.3390/s22041631
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Illustration of the sensor noises. (a) Sensor without noises. (b) Depth noises. (c) Sensor pose noises.
Figure 2. DFusion can minimize the influence of both types of noises.
Figure 3. The DFusion model.
Figure 4. The focus regions of the loss functions (green masks for the focus regions). (a) Illustration of the example scene, which contains one object. (b) The scene loss. (c) The object loss. (d) The surface loss.
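Figure 4 distinguishes the scene, object, and surface losses by the region each one focuses on. The exact formulation is not given in this record, but one plausible sketch is a masked L1 error over the TSDF volume, where each loss restricts the mean to a different region of the ground truth; the masks and thresholds below are assumptions for illustration only:

```python
import numpy as np

def masked_l1(pred, gt, mask):
    """Mean absolute TSDF error restricted to a boolean region mask."""
    m = mask.astype(bool)
    return float(np.abs(pred[m] - gt[m]).mean()) if m.any() else 0.0

def region_losses(pred, gt, trunc=0.05, surf_band=0.01):
    """Hypothetical scene / object / surface losses over a TSDF volume.

    scene : the whole volume
    obj   : voxels inside or near the object (gt below truncation)
    surf  : a thin band around the zero level set (the surface)
    """
    scene = np.ones_like(gt, dtype=bool)
    obj = gt < trunc
    surf = np.abs(gt) < surf_band
    return (masked_l1(pred, gt, scene),
            masked_l1(pred, gt, obj),
            masked_l1(pred, gt, surf))
```

The design intent matches the abstract: weighting the object and surface regions more heavily pushes the network to be accurate where reconstruction quality is actually judged, rather than in empty space.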
Comparison results on ShapeNet (with only depth noise).
| Methods | MSE | MAD | ACC | IoU |
|---|---|---|---|---|
| DeepSDF [ | 412.0 | 0.049 | 68.11 | 0.541 |
| OccupancyNetworks [ | 47.5 | 0.016 | 86.38 | 0.509 |
| TSDF Fusion [ | 10.9 | 0.008 | 88.07 | 0.659 |
| RoutedFusion [ | 5.4 | 0.005 | 95.29 | 0.816 |
| DFusion (Ours) | | | | |
Comparison results on ShapeNet (with depth noise and pose noise).
| Methods | MSE | MAD | ACC | IoU |
|---|---|---|---|---|
| DeepSDF [ | 420.3 | 0.052 | 66.90 | 0.476 |
| OccupancyNetworks [ | 108.6 | 0.037 | 77.34 | 0.453 |
| TSDF Fusion [ | 43.4 | 0.020 | 80.45 | 0.582 |
| RoutedFusion [ | 20.8 | 0.017 | 88.19 | 0.729 |
| DFusion (Ours) | | | | |
Figure 5. Fusion results on the ShapeNet dataset with depth noise added.
Figure 6. Fusion results on the ShapeNet dataset with pose noise added.
Quantitative results (MAD) on the CoRBS dataset.
| Methods | Human | Desk | Cabinet | Car |
|---|---|---|---|---|
| KinectFusion [ | 0.015 | 0.005 | 0.009 | 0.009 |
| ICP + RoutedFusion [ | 0.014 | 0.005 | 0.008 | 0.009 |
| ICP + DFusion (Ours) | | | | |
Figure 7. Fusion results on the CoRBS dataset. The ICP algorithm [36] is used to obtain the sensor trajectory for RoutedFusion and DFusion.
Variants of the proposed method (with depth noise and pose noise).
| Methods | MSE | MAD | ACC | IoU |
|---|---|---|---|---|
| Without object loss | 8.3 | 0.007 | 92.11 | 0.744 |
| Without surface loss | 7.5 | | 91.83 | 0.769 |
| Without object & surface loss | 16.3 | 0.015 | 90.87 | 0.740 |
| Original | | | | |