| Literature DB >> 35214569 |
Esraa Khatab, Ahmed Onsy, Ahmed Abouelfarag.
Abstract
One of the primary tasks undertaken by autonomous vehicles (AVs) is object detection, which comes ahead of object tracking, trajectory estimation, and collision avoidance. Vulnerable road objects (e.g., pedestrians, cyclists, etc.) pose a greater challenge to the reliability of object detection due to their continuously changing behavior. The majority of commercially available AVs, and research into them, depends on employing expensive sensors; however, this hinders the development of further research on AV operations. In this paper, therefore, we focus on the use of a lower-cost single-beam LiDAR in addition to a monocular camera to achieve multiple 3D vulnerable object detection in real driving scenarios, all the while maintaining real-time performance. This research also addresses problems faced during object detection, such as the complex interaction between objects where occlusion and truncation occur, and the dynamic changes in the perspective and scale of bounding boxes. The video-processing module is built upon a deep-learning detector (YOLOv3), while the LiDAR measurements are pre-processed and grouped into clusters. The output of the proposed system is object classification and localization: bounding boxes accompanied by a third depth dimension acquired by the LiDAR. Real-time tests show that the system can efficiently detect the 3D location of vulnerable objects in real driving scenarios.
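The fusion step described in the abstract — projecting a 2D bounding box onto the single-beam LiDAR's angular range and reading off a depth — can be sketched as follows. The image width, camera field of view, and the median-based depth pick are assumptions for illustration; the paper's exact calibration and parameters are not stated here.

```python
# Sketch of matching a 2D detection to single-beam LiDAR returns.
# IMG_WIDTH and H_FOV_DEG are hypothetical camera parameters, not values
# from the paper; the actual conversion depends on camera calibration.
IMG_WIDTH = 1280          # image width in pixels (assumed)
H_FOV_DEG = 90.0          # camera horizontal field of view in degrees (assumed)

def pixel_to_angle(u):
    """Map a pixel column u to a horizontal angle (deg) from the optical axis."""
    return (u - IMG_WIDTH / 2) * (H_FOV_DEG / IMG_WIDTH)

def depth_for_box(box, scan):
    """Assign a depth to a 2D bounding box from single-beam LiDAR points.

    box  : (x_min, x_max) pixel columns of the YOLOv3 detection
    scan : iterable of (angle_deg, range_m) LiDAR returns
    Returns the median range of returns whose angle falls inside the box,
    or None when no beam intersects the detection.
    """
    a_min, a_max = pixel_to_angle(box[0]), pixel_to_angle(box[1])
    ranges = sorted(r for a, r in scan if a_min <= a <= a_max)
    if not ranges:
        return None
    return ranges[len(ranges) // 2]   # median is robust to stray edge returns
```

Using the median of in-box returns (rather than the mean) is one simple way to keep background beams that clip the box edges from skewing the object's depth.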
Keywords: 2D LiDAR; autonomous driving; multiple object detection; sensor fusion
Year: 2022 PMID: 35214569 PMCID: PMC8874666 DOI: 10.3390/s22041663
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Three-dimensional object detection networks and frameworks.
| Paper | Modality | Limitation |
|---|---|---|
| Multi-task multi-sensor fusion for 3D object detection | RGB + 3D point cloud | Expensive 3D LiDAR |
| Frustum PointNets for 3D object detection from RGB-D data | RGB-D | 0.12 s per frame |
| PointFusion: deep sensor fusion for 3D bounding box estimation | RGB + 3D point cloud | 1.3 s per frame |
| RoarNet: a robust 3D object detection based on region approximation refinement | RGB + 3D point cloud | Expensive 3D LiDAR |
| A frustum-based probabilistic framework for 3D object detection by fusion of LiDAR and camera data | RGB + 3D point cloud | Only detects static objects |
| SEG-VoxelNet for 3D vehicle detection from RGB and LiDAR data | RGB + 3D point cloud | Only detects vehicles |
| MVX-Net: multimodal VoxelNet for 3D object detection | RGB + 3D point cloud | Not real-time |
| 3D-CVF: generating joint camera and LiDAR features using cross-view spatial feature fusion for 3D object detection | RGB + 3D point cloud | NVIDIA GTX 1080 Ti; 75 ms per frame (13.3 FPS) |
| PI-RCNN: an efficient multi-sensor 3D object detector with point-based attentive cont-conv fusion module | RGB + 3D point cloud | Not real-time |
| Image guidance-based 3D vehicle detection in traffic scene | RGB + 3D point cloud | Only vehicles, 4 FPS |
| EPNet: enhancing point features with image semantics for 3D object detection | RGB + 3D point cloud | Not real-time |
Figure 1 Hokuyo UTM-30LX.
Figure 2 Hokuyo UTM-30LX angular range.
Figure 3 Conversion of pixel values into real-world angular coordinates.
Figure 4 Block diagram for the system.
Figure 5 Different overlapping scenarios. (a) Object ‘x’ is fully in front of object ‘y’; (b) Object ‘x’ is partially in front of object ‘y’; (c) Object ‘x’ is partially behind object ‘y’.
Figure 6 Flowchart for calculating the correct depth measurements for detected objects during overlapping objects.
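For the overlapping cases of Figures 5 and 6, one way to separate two objects that share an angular span is to break the scan into clusters at range discontinuities and assign the nearer cluster to the front object. The jump threshold and the mean-per-cluster depth are illustrative assumptions, not the paper's stated procedure.

```python
def split_clusters(ranges, jump=0.5):
    """Group consecutive LiDAR ranges into clusters, breaking where the
    range jumps by more than `jump` metres (threshold is an assumption)."""
    clusters, current = [], [ranges[0]]
    for r in ranges[1:]:
        if abs(r - current[-1]) > jump:
            clusters.append(current)
            current = []
        current.append(r)
    clusters.append(current)
    return clusters

def depths_for_overlap(ranges, jump=0.5):
    """For two overlapping detections, return (near, far) depths: the
    mean range of the nearest and farthest clusters in the shared span."""
    clusters = split_clusters(ranges, jump)
    means = sorted(sum(c) / len(c) for c in clusters)
    return means[0], means[-1]
```

With a scan like `[5.0, 5.1, 5.0, 8.0, 8.2, 8.1]`, the 3 m jump splits the span into two clusters, so the occluding object gets the ~5 m depth and the occluded one the ~8 m depth.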
Mean average precision (mAP) of testing YOLOv3 on the KITTI dataset.
| Benchmark | Easy | Moderate | Hard |
|---|---|---|---|
| Car | 56% | 36.23% | 29.55% |
| Pedestrian | 29.98% | 22.84% | 22.21% |
| Cyclist | 9.09% | 9.09% | 9.09% |
Figure 7 (a) A sample of LiDAR measurements on a rough surface pre-smoothing. (b) The same LiDAR measurements post-smoothing.
Figure 8 A sample of LiDAR measurements when scanning two overlapping objects.
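The pre-processing smoothing shown in Figure 7 could be a simple sliding-window median over the scan, which suppresses single-sample spikes from rough surfaces while preserving genuine range steps. The window size here is an assumption; the record does not specify the filter actually used.

```python
def smooth_scan(ranges, k=1):
    """Median-smooth a 1D LiDAR scan with a window of 2k+1 samples
    (window size is an assumed parameter, shrunk at the scan edges)."""
    out = []
    for i in range(len(ranges)):
        window = sorted(ranges[max(0, i - k): i + k + 1])
        out.append(window[len(window) // 2])
    return out
```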