António Silva1, Duarte Fernandes1, Rafael Névoa1, João Monteiro1, Paulo Novais1, Pedro Girão2, Tiago Afonso2, Pedro Melo-Pinto1,3.
Abstract
Research on deep learning for object detection in LiDAR data has expanded rapidly in recent years, with notable gains in both precision and inference speed. These gains have been enabled by powerful GPU servers, which can train networks in reasonable time and whose parallel architecture supports high-performance, real-time inference. In autonomous driving, however, such resources are constrained by space, power, and inference-time limits, and onboard devices are less powerful than the machines used for training. This paper investigates a deep learning-based method running on edge devices for onboard real-time inference that is power-efficient and has low space requirements. A methodology is proposed for deploying models designed for high-end GPUs on edge devices for onboard inference. It consists of a two-fold flow: studying the implications of model hyperparameters for meeting application requirements, and compressing the network to meet the board's resource limitations. A hybrid FPGA-CPU board is proposed as an effective onboard inference solution, based on a comparison of its performance on the KITTI dataset with that of a PC. The achieved accuracy is comparable to that of the PC-based deep learning method, with the added advantage that the board is better suited to real-time inference under power and space constraints.
Keywords: 3D object detection; LiDAR scanners; autonomous driving; deep learning methods; onboard inference; quantisation methods
Year: 2021 PMID: 34883937 PMCID: PMC8659874 DOI: 10.3390/s21237933
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Traditional architecture of hybrid FPGA-CPU based inference solutions [1].
Figure 2. Methodology used for the deployment of the object detection model in the hardware device.
Figure 3. Object Detection Network pipeline.
Detection Head filters and upsample filters configurations.
| | Lighter | Intermediate | Baseline | Higher |
|---|---|---|---|---|
| Number | ( | ( | ( | ( |
| | 32 | 32 | 64 | 128 |
| | 32 | 64 | 128 | 128 |
| | 64 | 128 | 256 | 256 |
| | 64 | 64 | 128 | 256 |
| | 64 | 64 | 128 | 256 |
| | 64 | 64 | 128 | 256 |
Number of sampling instances (SI) per class.
| | Car | Pedestrian | Cyclist |
|---|---|---|---|
| | 15 | 6 | 8 |
| | 15 | 15 | 15 |
| | 15 | 25 | 25 |
The different point cloud range configurations used in fine-tuning.
| | | | | | | |
|---|---|---|---|---|---|---|
| | 0 | 69.12 | −39.68 | 39.68 | −3 | 1 |
| | 0 | 70 | −40 | 40 | −2.5 | 1 |
| | 0 | 52.8 | −32 | 32 | −3 | 1 |
| | 0 | 47.36 | −19.84 | 19.84 | −2.5 | 0.5 |
Pillar size configurations used in fine-tuning.
| | | |
|---|---|---|
| | 0.16 | 0.16 |
| | 0.25 | 0.25 |
| | 0.05 | 0.05 |
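As a rough illustration of how a point cloud range and a pillar size interact, the sketch below computes the size of the bird's-eye-view pillar grid. It assumes that the range columns are ordered x-min, x-max, y-min, y-max, z-min, z-max and that the two pillar-size columns are the x and y extents in metres; the column headers did not survive in the tables, so this ordering is an assumption. The function name `bev_grid_shape` is hypothetical, and pairing the first range row with the first pillar-size row is done only for illustration.

```python
def bev_grid_shape(x_min, x_max, y_min, y_max, pillar_x, pillar_y):
    """Number of pillars along x and y for a given range and pillar size.

    Assumes the range and the pillar size are given in the same unit (metres).
    Rounding is used so that values such as 69.12 / 0.16 do not fall just
    below an integer because of floating-point error.
    """
    nx = round((x_max - x_min) / pillar_x)
    ny = round((y_max - y_min) / pillar_y)
    return nx, ny


# First row of the range table with the first row of the pillar-size table
# (an illustrative pairing, not one stated in the source).
grid = bev_grid_shape(0.0, 69.12, -39.68, 39.68, 0.16, 0.16)
print(grid)  # (432, 496)
```

Under these assumptions, a smaller pillar size or a wider range enlarges the grid that the detection backbone has to process, which is one way these hyperparameters can affect inference time.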
Total number of Pillars used in fine-tuning.
| | Total Number of Pillars | Max Number of Points Per Pillar |
|---|---|---|
| | 12 K | 100 |
| | 30 K | 5 |
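The two rows trade the number of pillars against the number of points kept per pillar. The sketch below estimates the size of a dense pillar tensor for each row. It reads "12 K" and "30 K" as 12,000 and 30,000, and it assumes 9 features per point and 32-bit floats; neither value is stated in this table, so both are illustrative assumptions. The function name `pillar_tensor_bytes` is hypothetical.

```python
def pillar_tensor_bytes(num_pillars, points_per_pillar,
                        features=9, bytes_per_value=4):
    """Size in bytes of a dense (pillars, points, features) tensor.

    features=9 and bytes_per_value=4 (float32) are assumptions made
    for this sketch, not values taken from the table.
    """
    return num_pillars * points_per_pillar * features * bytes_per_value


for pillars, points in [(12_000, 100), (30_000, 5)]:
    megabytes = pillar_tensor_bytes(pillars, points) / 1e6
    print(f"{pillars} pillars x {points} points: {megabytes:.1f} MB")
```

Under these assumptions the second row needs a much smaller input tensor despite using more pillars, because the cap on points per pillar dominates the product.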
The set of experiments conducted and respective network configurations.
| Experiments | Detection Head Config. | No. Output Classes | No. Epochs |
|---|---|---|---|
| 1 | | 3 | 160 |
| 2 | | 3 | 160 |
| 3 | | 3 | 160 |
| 4 | | 3 | 160 |
| 5 | | 3 | 160 |
| 6 | | 3 | 160 |
| 7 | | 3 | 300 |
| 8 | | 3 | 160 |
| 9 | | 3 | 160 |
| 10 | | 3 | 300 |
| 11 | | 3 | 300 |
| PointPillars | | 1 | 160 |
| PointPillars | | 2 | 160 |
Results in validation set for BEV detection metric.
| Model/Experiment | Car (Easy) | Car (Mod.) | Car (Hard) | Cyclist (Easy) | Cyclist (Mod.) | Cyclist (Hard) | Pedestrian (Easy) | Pedestrian (Mod.) | Pedestrian (Hard) |
|---|---|---|---|---|---|---|---|---|---|
| Experiment 1 | 90.15 | 82.78 | 82.7 | 78.41 | 63.86 | 59.53 | 57.18 | 52.08 | 47.07 |
| Experiment 2 | 89.8 | 87.3 | 84.95 | 82.98 | 67.62 | 63.48 | 65.11 | 59.94 | 54.97 |
| Experiment 3 | 89.71 | 87.27 | 85.11 | 82.28 | 64.81 | 60.54 | 61.01 | 55.28 | 50.13 |
| Experiment 4 | 89.46 | 86.70 | 85.26 | 85.09 | 68.08 | 63.93 | 63.11 | 58.05 | 53.88 |
| Experiment 5 | 89.23 | 86.52 | 84.49 | 69.13 | 53.47 | 49.57 | 65.29 | 58.85 | 53.05 |
| Experiment 6 | 89.09 | 86.22 | 82.29 | 83.01 | 68.06 | 64.46 | 63.69 | 57.00 | 52.59 |
| Experiment 7 | 89.94 | 87.26 | 85.56 | 82.85 | 66.85 | 62.60 | 62.93 | 57.13 | 53.11 |
| Experiment 8 | 90.02 | 87.23 | 83.22 | 82.63 | 66.85 | 62.51 | 62.24 | 56.78 | 52.76 |
| Experiment 9 | 89.80 | 76.69 | 68.30 | 78.70 | 59.36 | 58.16 | 66.75 | 59.63 | 52.55 |
| Experiment 10 | 89.93 | 87.18 | 84.2 | 85.85 | 67.15 | 63.88 | 62.74 | 57.12 | 52.08 |
| Experiment 11 | 90.02 | 87.65 | 85.83 | 83.25 | 66.85 | 62.25 | 59.83 | 54.37 | 50.30 |
| PointPillars | 89.74 | 86.05 | 81.65 | 82.47 | 62.79 | 59.52 | 68.23 | 63.58 | 59.83 |
Results in validation set for 3D Bounding Box detection metric.
| Model/Experiment | Car (Easy) | Car (Mod.) | Car (Hard) | Cyclist (Easy) | Cyclist (Mod.) | Cyclist (Hard) | Pedestrian (Easy) | Pedestrian (Mod.) | Pedestrian (Hard) |
|---|---|---|---|---|---|---|---|---|---|
| Experiment 1 | 83.06 | 71.26 | 67.38 | 77.31 | 60.07 | 57.11 | 50.03 | 44.86 | 40.07 |
| Experiment 2 | 85.62 | 76.41 | 71.98 | 79.54 | 64.28 | 60.87 | 57.04 | 52.73 | 47.88 |
| Experiment 3 | 84.83 | 75.42 | 70.60 | 80.89 | 62.88 | 59.10 | 53.13 | 48.03 | 43.35 |
| Experiment 4 | 84.47 | 76.51 | 73.28 | 83.36 | 64.68 | 61.41 | 54.81 | 49.71 | 45.6 |
| Experiment 5 | 76.37 | 65.94 | 65.06 | 47.47 | 26.09 | 24.68 | 51.35 | 46.39 | 41.26 |
| Experiment 6 | 80.25 | 73.66 | 70.13 | 80.34 | 63.69 | 60.25 | 56.28 | 50.96 | 46.36 |
| Experiment 7 | 81.89 | 75.65 | 71.06 | 81.71 | 62.45 | 59.71 | 55.11 | 49.22 | 45.28 |
| Experiment 8 | 81.35 | 74.88 | 69.28 | 79.08 | 62.43 | 59.45 | 51.05 | 46.45 | 42.94 |
| Experiment 9 | 84.33 | 66.49 | 59.73 | 76.76 | 57.58 | 53.83 | 51.13 | 48.43 | 43.05 |
| Experiment 10 | 87.12 | 77.04 | 74.33 | 84.47 | 63.86 | 61.73 | 55.65 | 50.42 | 45.81 |
| Experiment 11 | 85.22 | 75.49 | 70.64 | 80.55 | 63.07 | 59.31 | 52.70 | 47.19 | 42.72 |
| PointPillars | 83.58 | 74.15 | 68.76 | 80.61 | 60.95 | 56.94 | 62.3 | 57.53 | 52.51 |
Results in validation set for AOS detection metric.
| Model/Experiment | Car (Easy) | Car (Mod.) | Car (Hard) | Cyclist (Easy) | Cyclist (Mod.) | Cyclist (Hard) | Pedestrian (Easy) | Pedestrian (Mod.) | Pedestrian (Hard) |
|---|---|---|---|---|---|---|---|---|---|
| Experiment 1 | 90.25 | 86.14 | 80.96 | 75.28 | 61.32 | 57.87 | 30.8 | 28.55 | 26.54 |
| Experiment 2 | 90.59 | 88.44 | 86.62 | 84.57 | 70.33 | 68.1 | 48.42 | 46.03 | 42.65 |
| Experiment 3 | 90.29 | 87.91 | 85.87 | 79.29 | 62.86 | 59.64 | 33.07 | 30.9 | 28.94 |
| Experiment 4 | 90.37 | 88.38 | 87.2 | 86.03 | 68.26 | 64.9 | 46.84 | 44.31 | 41.82 |
| Experiment 5 | 89.72 | 81.09 | 80.61 | 51.93 | 29.7 | 28.15 | 35.95 | 34.01 | 31.46 |
| Experiment 6 | 90.17 | 87.90 | 85.99 | 83.78 | 71.32 | 68.74 | 47.44 | 43.68 | 40.86 |
| Experiment 7 | 90.38 | 87.81 | 86.23 | 82.21 | 65.31 | 62.38 | 54.07 | 50.97 | 47.98 |
| Experiment 8 | 90.51 | 88.31 | 86.17 | 83.6 | 68.12 | 64.55 | 42.18 | 39.42 | 36.87 |
| Experiment 9 | 90.16 | 79.03 | 68.88 | 78.17 | 58.55 | 57.83 | 40.34 | 37.9 | 36.42 |
| Experiment 10 | 90.39 | 88.64 | 86.9 | 85.85 | 66.78 | 64.53 | 54.31 | 50.84 | 47.7 |
| Experiment 11 | 90.46 | 88.20 | 85.99 | 83.37 | 67.19 | 63.76 | 46.85 | 44.32 | 41.66 |
| PointPillars | 90.51 | 88.2 | 86.01 | 82.23 | 62.48 | 59.28 | 34.27 | 33.2 | 31.95 |
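AOS (average orientation similarity) is the KITTI metric that weights each detection by how well its predicted heading matches the ground truth. The sketch below shows only the per-detection similarity term, (1 + cos Δθ) / 2, as defined by the KITTI benchmark; the averaging over recall levels that produces the final score is omitted. The function name `orientation_similarity` is hypothetical.

```python
import math


def orientation_similarity(pred_yaw, gt_yaw):
    """Per-detection orientation term used by AOS, in the range [0, 1].

    Angles are in radians. A perfect heading gives 1.0 and a heading
    that is off by 180 degrees gives 0.0.
    """
    return (1.0 + math.cos(pred_yaw - gt_yaw)) / 2.0


print(orientation_similarity(0.0, 0.0))          # same heading
print(orientation_similarity(math.pi / 2, 0.0))  # perpendicular heading
print(orientation_similarity(math.pi, 0.0))      # opposite heading
```

Because a detection with a flipped heading contributes nothing, a class whose front and back are hard to tell apart in a point cloud can score a much lower AOS than its BEV or 3D detection scores would suggest.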
Inference time metric benchmark results.
| Model/Experiment | Total (ms) ≈ | Speed (Hz) ≈ |
|---|---|---|
| Experiment 1 | 75.543 | 13.237 |
| Experiment 2 | 94.488 | 10.583 |
| Experiment 3 | 83.162 | 12.025 |
| Experiment 4 | 127.666 | 7.833 |
| Experiment 5 | 88.31 | 11.324 |
| Experiment 6 | 82.063 | 12.186 |
| Experiment 7 | 89.905 | 11.123 |
| Experiment 8 | 33.453 | 29.893 |
| Experiment 9 | 499.908 | 2.000 |
| Experiment 10 | 104.1 | 9.606 |
| Experiment 11 | 79.687 | 12.549 |
| PointPillars | 23.294 | 42.929 |
| PointPillars | 27.48 | 36.390 |
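The speed column follows from the total time as 1000 divided by the latency in milliseconds. The sketch below reproduces that conversion for three rows and compares the result with a frame-rate target. The 10 Hz target is an assumption made for illustration (a typical LiDAR frame rate), not a requirement stated in the table, and the function name `latency_ms_to_hz` is hypothetical.

```python
def latency_ms_to_hz(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Assumed real-time target for this sketch; not taken from the source.
TARGET_RATE_HZ = 10.0

timings_ms = {
    "Experiment 1": 75.543,
    "Experiment 8": 33.453,
    "Experiment 9": 499.908,
}

for name, latency in timings_ms.items():
    rate = latency_ms_to_hz(latency)
    print(f"{name}: {rate:.3f} Hz, meets target: {rate >= TARGET_RATE_HZ}")
```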
Figure 4. Hardware and Software implementation flow for inference.
Figure 5. Hardware Design of DPU connected to the PS side.
Figure 6. Model adaptation and optimisation based on post-training quantisation for on-board inference.
Figure 7. RPN configurations structure overview.
The set of network configurations used.
| Config. | Det. Head Config. | No. Out. Class. | No. Epochs |
|---|---|---|---|
| 1 | | 3 | 300 |
| 2 | | 3 | 300 |
| 3 | | 3 | 300 |
Inference time benchmark results, given in Hz, for floating-point and quantised models running on different machines.
| Configuration | Floating-Point (Server) | PTQ (Server) | PTQ (Edge Device) | QAT (Edge Device) |
|---|---|---|---|---|
| 1 | 21.2 | 23.5 | 9.5 | 9.4 |
| 2 | 23.2 | 26.1 | 16.6 | 16.7 |
| 3 | 28.2 | 26.0 | 18.7 | 18.7 |
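The PTQ columns refer to post-training quantisation, in which a trained floating-point model is converted to low-precision integers without retraining. The sketch below illustrates the basic idea with a generic symmetric 8-bit scheme in plain Python. It is not the quantisation toolchain used for the edge device, and the function names and sample weights are hypothetical.

```python
def quantise_int8(values):
    """Symmetric per-tensor quantisation of floats to signed 8-bit integers.

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantise(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


weights = [0.02, -0.51, 0.33, 1.27, -1.0]  # made-up example values
codes, scale = quantise_int8(weights)
restored = dequantise(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The rounding error per value is bounded by half the scale. Quantisation-aware training (QAT) simulates this rounding during training so that the weights can adapt to it, whereas post-training quantisation applies it only after training has finished.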
Results of the floating-point and quantised models on KITTI BEV detection for classes Car (IoU 0.70), Cyclist (IoU 0.50), and Pedestrian (IoU 0.50).
| Config. | Version | Car (Easy) | Car (Mod.) | Car (Hard) | Cyclist (Easy) | Cyclist (Mod.) | Cyclist (Hard) | Pedestrian (Easy) | Pedestrian (Mod.) | Pedestrian (Hard) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Float P. | 89.64 | 87.30 | 79.42 | 78.44 | 60.72 | 55.31 | 52.98 | 50.75 | 44.74 |
| | PTQ | 89.44 | 86.62 | 79.04 | 73.07 | 53.81 | 51.99 | 49.12 | 43.46 | 42.19 |
| | QAT | 89.76 | 87.50 | 79.48 | 81.32 | 63.02 | 56.93 | 56.93 | 50.91 | 44.70 |
| 2 | Float P. | 89.75 | 87.22 | 79.35 | 72.49 | 53.75 | 52.14 | 59.98 | 53.75 | 52.14 |
| | PTQ | 88.82 | 85.51 | 77.93 | 67.14 | 52.27 | 47.66 | 48.91 | 46.61 | 41.02 |
| | QAT | 89.72 | 86.93 | 79.07 | 70.93 | 58.08 | 52.65 | 51.28 | 45.36 | 43.39 |
| | | 89.72 | 86.93 | 79.15 | 73.97 | 61.00 | 56.38 | 54.22 | 49.16 | 46.34 |
| 3 | Float P. | 89.85 | 86.94 | 79.24 | 71.13 | 51.80 | 49.85 | 51.45 | 45.13 | 38.74 |
| | PTQ | 89.18 | 79.1 | 78.24 | 69.65 | 49.87 | 44.72 | 50.27 | 43.79 | 43.26 |
| | QAT | 89.85 | 86.94 | 79.24 | 70.24 | 52.13 | 47.61 | 50.43 | 44.89 | 42.49 |
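One way to read this table is to look at how much accuracy each quantised version loses relative to its floating-point counterpart. The sketch below does this for the Car class at moderate difficulty, using the BEV values copied from the table; the helper name `accuracy_drops` is hypothetical, and a negative drop means the quantised model scored higher than the floating-point one.

```python
# Car, moderate difficulty, BEV: (floating point, PTQ, QAT) per configuration.
car_mod_bev = {
    1: (87.30, 86.62, 87.50),
    2: (87.22, 85.51, 86.93),
    3: (86.94, 79.1, 86.94),
}


def accuracy_drops(float_ap, ptq_ap, qat_ap):
    """Return (PTQ drop, QAT drop) relative to the floating-point score."""
    return round(float_ap - ptq_ap, 2), round(float_ap - qat_ap, 2)


for config, scores in car_mod_bev.items():
    ptq_drop, qat_drop = accuracy_drops(*scores)
    print(f"Config {config}: PTQ drop {ptq_drop}, QAT drop {qat_drop}")
```

For these three entries the QAT drop is smaller than the PTQ drop in every configuration.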
Results of the floating-point and quantised models on KITTI 3D BBOX detection.
| Config. | Version | Car (Easy) | Car (Mod.) | Car (Hard) | Cyclist (Easy) | Cyclist (Mod.) | Cyclist (Hard) | Pedestrian (Easy) | Pedestrian (Mod.) | Pedestrian (Hard) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Float P. | 84.3 | 74.82 | 67.74 | 73.04 | 60.06 | 54.66 | 47.51 | 42.07 | 36.22 |
| | PTQ | 82.11 | 66.72 | 65.31 | 68.43 | 49.3 | 45.22 | 41.32 | 38.9 | 33.7 |
| | QAT | 85.09 | 75.46 | 68.19 | 72.22 | 55.53 | 54.52 | 48.86 | 46.91 | 41.05 |
| 2 | Float P. | 83.68 | 68.24 | 66.46 | 71.28 | 52.26 | 50.68 | 46.26 | 40.34 | 34.90 |
| | PTQ | 73.97 | 64.27 | 62.31 | 63.55 | 46.8 | 45.13 | 43.67 | 38.04 | 33.65 |
| | QAT | 78.57 | 68.14 | 66.23 | 69.18 | 56.41 | 51.29 | 46.32 | 40.89 | 35.09 |
| | | 83.39 | 73.10 | 66.28 | 71.62 | 56.72 | 54.19 | 45.78 | 43.40 | 37.75 |
| 3 | Float P. | 77.44 | 67.36 | 65.78 | 74.87 | 53.00 | 51.25 | 44.61 | 39.36 | 34.26 |
| | PTQ | 72.56 | 62.31 | 55.49 | 64.83 | 47.94 | 43.05 | 37.29 | 32.92 | 32.3 |
| | QAT | 77.44 | 67.36 | 65.78 | 74.87 | 53 | 51.25 | 44.61 | 39.36 | 34.26 |
Results of the floating-point and quantised models on KITTI AOS detection.
| Config. | Version | Car (Easy) | Car (Mod.) | Car (Hard) | Cyclist (Easy) | Cyclist (Mod.) | Cyclist (Hard) | Pedestrian (Easy) | Pedestrian (Mod.) | Pedestrian (Hard) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Float P. | 90.68 | 88.29 | 79.85 | 76.5 | 60.21 | 54.7 | 32.63 | 29.08 | 28.81 |
| | PTQ | 89.97 | 86.86 | 78.76 | 68.1 | 51.47 | 50.15 | 28.1 | 26.05 | 26.17 |
| | QAT | 90.59 | 88.51 | 79.95 | 78.52 | 61.37 | 55.31 | 34.18 | 34.06 | 30.45 |
| 2 | Float P. | 90.48 | 87.70 | 79.50 | 68.44 | 56.08 | 50.78 | 27.35 | 25.26 | 22.51 |
| | PTQ | 89.23 | 85.7 | 77.5 | 66.67 | 53.08 | 48.78 | 24.87 | 22.12 | 21.2 |
| | QAT | 90.49 | 87.41 | 79.35 | 70.49 | 55.63 | 54.2 | 30.16 | 28 | 27.47 |
| | | 90.50 | 87.35 | 86.05 | 69.90 | 58.29 | 54.26 | 31.45 | 29.95 | 29.17 |
| 3 | Float P. | 90.42 | 87.52 | 79.38 | 71.85 | 54.27 | 50.7 | 31.2 | 28.21 | 26.23 |
| | PTQ | 90.24 | 86.71 | 78.51 | 65.51 | 48.83 | 46.83 | 21.45 | 21.09 | 17.47 |
| | QAT | 90.42 | 87.46 | 79.4 | 72.95 | 55.27 | 50.7 | 29.8 | 27.06 | 26.55 |
Figure 8. Locations of the ground truths of the KITTI evaluation dataset and of the models' predictions for all evaluation point clouds, shown from a BEV perspective in the camera coordinate frame.
Figure 9. Example inference results on a KITTI frame for the configuration 2 model: PTQ (left, point cloud top view) and Optimised QAT (right image).