Elena Camuffo, Daniele Mari, Simone Milani.
Abstract
Recent advancements in self-driving cars, robotics, and remote sensing have widened the range of applications for 3D Point Cloud (PC) data. This data format poses several new issues concerning noise levels, sparsity, and required storage space; as a result, many recent works address PC problems using Deep Learning (DL) solutions, thanks to their capability to automatically extract features and achieve high performance. This evolution has also changed the structure of processing chains and posed new problems to both academic and industrial researchers. The aim of this paper is to provide a comprehensive overview of the latest state-of-the-art DL approaches for the most crucial PC processing operations, i.e., semantic scene understanding, compression, and completion. With respect to the existing reviews, the work proposes a new taxonomical classification of the approaches, taking into account the characteristics of the acquisition setup, the peculiarities of the acquired PC data, the presence of side information (depending on the adopted dataset), the data formatting, and the characteristics of the DL architectures. This organization allows one to better comprehend the final performance comparisons on common test sets and sheds light on future research trends.
Keywords: completion; compression; deep learning; point cloud; scene understanding; semantic segmentation
Year: 2022 PMID: 35214254 PMCID: PMC8963024 DOI: 10.3390/s22041357
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Schematic representation of the paper organization.
Structure of the paper and related references.
| Sections | Related References |
|---|---|
| (1) Introduction | [ |
| (2) Point Clouds as Data Structures | |
| (2.1) Point Cloud Data | [ |
| (2.2) Acquisition Systems | [ |
| (2.3) Other Data Structures | [ |
| (3) Datasets | |
| (3.1) ShapeNet | [ |
| (3.2) ModelNet | [ |
| (3.3) MPEG | [ |
| (3.4) 8i Voxelized Full Bodies | [ |
| (3.5) Stanford 3D Indoor Scene Dataset | [ |
| (3.6) KITTI | [ |
| (3.7) SemanticKITTI | [ |
| (3.8) SynthCity | [ |
| (3.9) Other Recent LiDAR Datasets for Automotive Applications | [ |
| (4) General Purpose Deep Learning Techniques | |
| (4.1) Architectures | [ |
| (4.2) Losses | [ |
| (5) Semantic Scene Understanding | |
| (5.1) Disambiguation | |
| (5.2) Discretization-Based Models | [ |
| (5.3) Projection-Based Models | [ |
| (5.4) Point Clouds-Based Models | [ |
| (5.5) Graph-Based Methods | [ |
| (5.6) Transformer-Based Methods | [ |
| (5.7) Performance Comparison between Different Approaches | |
| (6) Compression | [ |
| (6.1) Point-Set Autoencoders | [ |
| (6.2) Convolutional Autoencoders | [ |
| (7) Point Cloud Completion | [ |
| (7.1) Point Completion Network | [ |
| (7.2) Point Fractal Network | [ |
| (7.3) 3D Point Capsules Networks | [ |
| (7.4) GRNet | [ |
| (7.5) Other Strategies | [ |
| (8) Conclusions | [ |
Figure 2. The Stanford Bunny [38] model in different three-dimensional representations. (a) Point Cloud, (b) Voxels, (c) Octree, (d) Mesh, (e) Depth.
Figure 3. Samples from some of the presented datasets. (a) ShapeNet, (b) MPEG, (c) 8iVFB, (d) S3DIS, (e) SemanticKITTI, (f) SynthCity.
Figure 4. PointNet [49] model.
Figure 5. Taxonomy of the main methods for PC Semantic Scene Understanding.
Figure 6. SegCloud [54] pipeline.
Figure 7. RandLA-Net [60] basic module.
Figure 8. Different types of point convolution [32].
Figure 9. PointCNN [76] X-Conv operator.
Figure 10. Cylinder3D [77] space partition.
Figure 11. PSTNet [78] sequence encoding.
Figure 12. Graph-based networks principle [32].
Figure 13. Point cloud transformer network architecture [83].
Comparison of some of the main approaches for Point Cloud Semantic Segmentation in terms of mIoU percentage, on SemanticKITTI [35], S3DIS [44], and Semantic3D [33] datasets.
| Method | Year | SemanticKITTI | S3DIS | Semantic3D | Category |
|---|---|---|---|---|---|
| PointNet [ | 2017 | 14.6 | - | - | PC (MLP) |
| PointNet++ [ | 2017 | 20.1 | - | - | PC (MLP) |
| SegCloud [ | 2017 | - | - | 61.3 | Disc (D) |
| SnapNet [ | 2018 | - | - | 59.1 | Proj (MV) |
| SqueezeSeg [ | 2018 | 29.5 | - | - | Proj (Sph) |
| SPGraph [ | 2018 | 17.4 | 62.1 | 76.2 | Graph |
| SPLATNet [ | 2018 | 18.4 | - | - | Disc (S) |
| SqueezeSegV2 [ | 2019 | 39.7 | - | - | Proj (Sph) |
| LatticeNet [ | 2019 | 52.9 | - | - | Disc (S) |
| KPConv [ | 2019 | - | 70.6 | 74.6 | PC (C-Conv) |
| RangeNet++ [ | 2019 | 52.2 | - | - | Proj (Sph) |
| RandLA-Net [ | 2019 | 53.9 | 70.0 | | PC (MLP) |
| PolarNet [ | 2020 | 57.2 | - | - | Proj (Cyl) |
| Cylinder3D [ | 2020 | 68.9 | - | - | Proj (Cyl) + PC (D-Conv) |
| PTC [ | 2021 | - | | - | Transformer |
| (AF) | 2021 | | - | - | Disc (D) + PC (C-Conv) |
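The scores in the comparison above are mean Intersection over Union (mIoU) percentages. As a minimal sketch of how such a score can be computed from per-point labels, the NumPy snippet below builds a confusion matrix and averages the per-class IoU. The function name `mean_iou`, the toy labels, and the choice to skip classes that appear in neither the labels nor the predictions are assumptions made for illustration; the official evaluation scripts of the individual benchmarks may treat ignored or absent classes differently.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Return (mIoU, per-class IoU) for integer per-point label arrays.

    Hypothetical helper for illustration, not a benchmark's official metric code.
    """
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        y_true * num_classes + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    tp = np.diag(conf).astype(float)   # correctly labelled points per class
    fp = conf.sum(axis=0) - tp         # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp         # points of the class that were missed
    union = tp + fp + fn
    # Assumption: classes absent from both labels and predictions are skipped.
    iou = np.where(union > 0, tp / np.maximum(union, 1), np.nan)
    return float(np.nanmean(iou)), iou


# Toy example with six points and three classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
miou, per_class = mean_iou(labels, preds, num_classes=3)
print(f"mIoU = {100 * miou:.1f}%")  # per-class IoU is 1/3, 2/3, 1/2, so mIoU = 50.0%
```

Averaging over classes rather than over points keeps rare classes from being swamped by frequent ones, which matters for outdoor LiDAR scenes where road and vegetation points dominate.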
Figure 14. Graphical comparison of the main approaches for Point Cloud Semantic Segmentation in terms of mIoU percentage, on SemanticKITTI [35], S3DIS [44], and Semantic3D [33] datasets.
Figure 15. Examples of PCs and their reconstruction procedure.
Figure 16. Scheme of the architecture proposed in [106].
Figure 17. Architecture proposed in [107].
Figure 18. Architecture proposed in [108].
Figure 19. Bjøntegaard metrics for Wang et al. [107], Milani [106], and Guarda et al. [110] against TMC13.
Figure 20. Point Completion Network [128] architecture.
Figure 21. Point Fractal Network [129] architecture.
Figure 22. 3D Point Capsule Network [130] architecture.