Pedro J Navarro, Carlos Fernández, Raúl Borraz, Diego Alonso.
Abstract
This article describes an automated sensor-based system for detecting pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on processing the information generated by a Velodyne HDL-64E LIDAR sensor. The point cloud generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in each cube. The work presents an exhaustive analysis of the performance of three machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms were trained with 1931 samples. The final performance of the method, measured in a real traffic scene containing 16 pedestrians and 469 non-pedestrian samples, shows a sensitivity of 81.2%, an accuracy of 96.2% and a specificity of 96.8%.
Keywords: 3D LIDAR sensor; machine vision and machine learning; pedestrian detection
Year: 2016 PMID: 28025565 PMCID: PMC5298591 DOI: 10.3390/s17010018
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Main components of the CIC architecture.
Figure 2. Range diagram of the sensors mounted on the CIC.
Figure 3. Sensor system on board the CIC: (a) car roof; (b) front.
Figure 4. Actuation system: (a) steering wheel and (b) brake pedal.
Figure 5. Control system: (a) aluminium structure; (b) cRIO 9082; (c) embedded PC; (d) IMU.
Figure 6. Pedestrian sample captured with software tools over a frame from the 3D LIDAR.
Figure 7. Normalized XY, XZ and YZ axonometric projections of two pedestrian samples: (a–c) pedestrian one; (d–f) pedestrian two.
By applying different coefficients to the normalized data while keeping the aforementioned proportions, several binary images can be generated. However, these images are composed only of dots (the points where a laser beam hit an object); see Figure 8a–f.
Figure 8. (a–f) Binary images generated from the XY, XZ and YZ projections of two pedestrian samples; (g–l) results of the pre-processing of the binary images.
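The step from projected points to dot-only binary images can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `projections_to_binary`, the image size, and the choice of a single common scale factor (used here to preserve the proportions between axes) are assumptions made for illustration, and the pre-processing shown in Figure 8g–l is not reproduced.

```python
import numpy as np

def projections_to_binary(points, size=64):
    """Rasterise the XY, XZ and YZ projections of an (N, 3) point array.

    Hypothetical helper: each point becomes a single white pixel, so the
    result is a dot image like the ones described for Figure 8a-f.
    """
    pts = np.asarray(points, dtype=float)
    mins = pts.min(axis=0)
    extent = pts.max(axis=0) - mins
    # Assumption: one common scale factor keeps the proportions between axes.
    scale = (size - 1) / max(extent.max(), 1e-9)
    idx = np.round((pts - mins) * scale).astype(int)

    images = {}
    for name, (a, b) in {"XY": (0, 1), "XZ": (0, 2), "YZ": (1, 2)}.items():
        img = np.zeros((size, size), dtype=np.uint8)
        # Row index comes from the second axis, column index from the first.
        img[idx[:, b], idx[:, a]] = 1
        images[name] = img
    return images

if __name__ == "__main__":
    cube = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 2.0], [0.5, 0.5, 1.0]])
    for name, img in projections_to_binary(cube, size=65).items():
        print(name, img.shape, int(img.sum()))
```

Because every laser return maps to one pixel, sparse samples yield disconnected dots, which is why some pre-processing is needed before shape features can be measured.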
Feature vector composition.
| Shape Features | Invariant Moments | Statistical Features |
|---|---|---|
| f1, f2, f3: Areas of XY, XZ, YZ projections | f22, f23, f24: Hu moment 1 over XY, XZ, YZ projections | f43, f44: Means of distances and reflectivity |
| f4, f5, f6: Perimeters of XY, XZ, YZ projections | f25, f26, f27: Hu moment 2 over XY, XZ, YZ projections | f45, f46: Standard deviations of distances and reflectivity |
| f7, f8, f9: Solidity of XY, XZ, YZ projections | f28, f29, f30: Hu moment 3 over XY, XZ, YZ projections | f47, f48: Kurtosis of distances and reflectivity |
| f10, f11, f12: Equivalent diameters of XY, XZ, YZ projections | f31, f32, f33: Hu moment 4 over XY, XZ, YZ projections | f49, f50: Skewness of distances and reflectivity |
| f13, f14, f15: Eccentricity of XY, XZ, YZ projections | f34, f35, f36: Hu moment 5 over XY, XZ, YZ projections | |
| f16, f17, f18: Major axis length of XY, XZ, YZ projections | f37, f38, f39: Hu moment 6 over XY, XZ, YZ projections | |
| f19, f20, f21: Minor axis length of XY, XZ, YZ projections | f40, f41, f42: Hu moment 7 over XY, XZ, YZ projections | |
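As an illustration of how a few entries of this feature vector could be computed, the sketch below derives the eight statistical features (f43 to f50) and, for a single binary projection, the area and the first Hu moment. The function names are hypothetical. The use of population (not sample) moments and of non-excess kurtosis is an assumption, since the table does not state which definitions were used.

```python
import numpy as np

def _central_moment(x, k):
    """k-th central moment of a 1-D array (population form)."""
    return np.mean((x - x.mean()) ** k)

def statistical_features(distances, reflectivity):
    """Return [f43..f50]: mean, std, kurtosis, skewness of both signals.

    Assumption: kurtosis is m4 / m2**2 (a normal distribution gives 3) and
    skewness is m3 / m2**1.5.
    """
    d = np.asarray(distances, dtype=float)
    r = np.asarray(reflectivity, dtype=float)
    stats = (
        lambda x: x.mean(),
        lambda x: np.sqrt(_central_moment(x, 2)),
        lambda x: _central_moment(x, 4) / _central_moment(x, 2) ** 2,
        lambda x: _central_moment(x, 3) / _central_moment(x, 2) ** 1.5,
    )
    feats = []
    for stat in stats:
        feats.extend([float(stat(d)), float(stat(r))])
    return feats

def area_and_hu1(binary_image):
    """Area and first Hu moment (eta20 + eta02) of a binary image."""
    rows, cols = np.nonzero(binary_image)
    area = rows.size
    mu20 = np.sum((cols - cols.mean()) ** 2)
    mu02 = np.sum((rows - rows.mean()) ** 2)
    # Normalised central moments of order 2 divide by area**2.
    return area, float((mu20 + mu02) / area ** 2)

if __name__ == "__main__":
    print(statistical_features([1, 2, 3, 4, 5], [0, 0, 0, 4]))
    print(area_and_hu1(np.ones((2, 2), dtype=np.uint8)))
```

The remaining shape features (perimeter, solidity, eccentricity, axis lengths) and Hu moments 2 to 7 follow the same pattern but are omitted to keep the example short.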
kNN, NBC and SVM configuration parameters.
| Configuration | kNN | NBC | SVM |
|---|---|---|---|
| Method | Euclidean, Mahalanobis | Gauss (2), KSF (3) | Linear (4), quadratic (5) |
| Data normalisation | Yes (1) | No | Yes (1) |
| Metrics | LOOCV, ROC | LOOCV, ROC | LOOCV, ROC |
| Classes | 2 | 2 | 2 |
(1) Normalised based on ; (2) Kernel Smoothing Function: $G(x) = (2\pi)^{-0.5} \exp(-0.5\,x^{2})$; (3) Kernel Smoothing Function: ; (4) Kernel Smoothing Function: $G(x_1, x_2) = x_1' x_2$; (5) Kernel Smoothing Function: $G(x_1, x_2) = (1 + x_1' x_2)^{2}$.
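The three kernel expressions that survive in the footnotes can be written down directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes that footnote (2) is the standard normal density, so that its normalising constant is $(2\pi)^{-0.5}$. The expressions for footnotes (1) and (3) are not available in the text and are therefore not implemented.

```python
import numpy as np

def gaussian_kernel(x):
    """Footnote (2), assumed to be the standard normal density."""
    x = np.asarray(x, dtype=float)
    return (2.0 * np.pi) ** -0.5 * np.exp(-0.5 * x ** 2)

def linear_kernel(x1, x2):
    """Footnote (4): inner product of the two feature vectors."""
    return float(np.dot(x1, x2))

def quadratic_kernel(x1, x2):
    """Footnote (5): second-order polynomial kernel."""
    return (1.0 + float(np.dot(x1, x2))) ** 2

if __name__ == "__main__":
    a, b = np.array([1.0, 2.0]), np.array([3.0, 4.0])
    print(gaussian_kernel(0.0), linear_kernel(a, b), quadratic_kernel(a, b))
```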
Figure 9. Three scene frames: (a) Intelligent Vehicles and Computer Vision Lab; (b) underground parking; (c) real traffic scene. The colour of the points in the images represents the reflectivity value reported by the 3D LIDAR.
LOOCV error and AUC for kNN, NBC, SVM.
| MLA | kNN | NBC | SVM | ||||
|---|---|---|---|---|---|---|---|
| 0.0653 | 0.0673 | 0.1361 | 0.6769 | ||||
| 0.9916 | 0.9931 | 0.9317 | 0.9764 | ||||
| 0.9764 | 0.9727 | 0.9304 | 0.2122 | 0.9758 | 1.0000 | ||
| 0.9205 | 0.9169 | 0.9891 | 0.9855 | 0.8194 | 1.0000 | ||
| 0.9865 | 0.985907 | 0.9980 | 0.9887 | 0.9699 | 1.0000 | ||
| 0.9684 | 0.964785 | 0.9388 | 0.3231 | 0.9533 | 1.0000 | ||
| 0.9793 | 0.9630 | 0.3494 | 0.9728 | ||||
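For readers unfamiliar with leave-one-out cross-validation (LOOCV), the following sketch shows how an LOOCV error could be computed for a kNN classifier with Euclidean distance. It is a simplified stand-in, not the configuration used in the paper: the function name, the default `k=1`, the absence of data normalisation, and the toy data are all assumptions.

```python
import numpy as np

def loocv_error_knn(X, y, k=1):
    """Fraction of samples misclassified when each one is held out in turn.

    Labels must be non-negative integers (for example 0 = non-pedestrian,
    1 = pedestrian).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # An infinite self-distance removes the held-out sample from its own
    # neighbour list, which is what "leave one out" means here.
    np.fill_diagonal(dist, np.inf)

    errors = 0
    for i in range(len(X)):
        neighbours = np.argsort(dist[i])[:k]
        predicted = np.bincount(y[neighbours]).argmax()  # majority vote
        errors += int(predicted != y[i])
    return errors / len(X)

if __name__ == "__main__":
    X = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
    y = [0, 0, 0, 1, 1, 1]
    print(loocv_error_knn(X, y, k=1))
```

With N samples this procedure trains and tests N times, which is affordable for a training set of 1931 samples and uses all but one sample for training in every round.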
Figure 10. ROC results: (a) kNN classifier with Euclidean distance, NBC with KSF, and SVM classifier with linear functions; (b) kNN classifier with Mahalanobis distance, NBC with Gauss kernel, and SVM classifier with quadratic polynomial function.
TP, FP, TN and FN computed over seven frames of scenario 3.
| Frame | Samples | Pedestrians in the Sample | kNN–Euclidean TP | kNN–Euclidean FP | kNN–Euclidean TN | kNN–Euclidean FN | SVM–Linear TP | SVM–Linear FP | SVM–Linear TN | SVM–Linear FN | SVM–Quadratic TP | SVM–Quadratic FP | SVM–Quadratic TN | SVM–Quadratic FN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 100 | 1 | 1 | 7 | 92 | 0 | 1 | 2 | 97 | 0 | 1 | 14 | 85 | 0 |
| 2 | 53 | 1 | 1 | 5 | 47 | 0 | 1 | 2 | 50 | 0 | 0 | 3 | 49 | 1 |
| 3 | 47 | 1 | 1 | 4 | 42 | 0 | 1 | 1 | 45 | 0 | 1 | 10 | 36 | 0 |
| 4 | 58 | 3 | 1 | 7 | 48 | 2 | 1 | 3 | 52 | 2 | 2 | 6 | 49 | 1 |
| 5 | 79 | 4 | 3 | 5 | 70 | 1 | 3 | 3 | 72 | 1 | 4 | 9 | 66 | 0 |
| 6 | 45 | 4 | 4 | 5 | 36 | 0 | 4 | 1 | 40 | 0 | 4 | 2 | 39 | 0 |
| 7 | 103 | 2 | 2 | 9 | 92 | 0 | 2 | 3 | 98 | 0 | 2 | 6 | 95 | 0 |
| Sum | 485 | 16 | 13 | 42 | 427 | 3 | 13 | 15 | 454 | 3 | 14 | 50 | 419 | 2 |
Scenario metrics.
| Metric | kNN–Euclidean | SVM–Linear | SVM–Quadratic |
|---|---|---|---|
| Sensitivity | 0.8125 | 0.8125 | 0.8750 |
| Specificity | 0.9104 | 0.9680 | 0.8934 |
| Precision | 0.2364 | 0.4643 | 0.2188 |
| Accuracy | 0.9072 | 0.9629 | 0.8928 |
| Fscore | 0.3662 | 0.5909 | 0.3500 |
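The scenario metrics follow from the summed TP, FP, TN and FN counts reported for the seven frames of scenario 3. The sketch below applies the standard definitions to those sums; the function name `scenario_metrics` is introduced here for illustration only.

```python
def scenario_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)          # recall on pedestrians
    specificity = tn / (tn + fp)          # recall on non-pedestrians
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fscore = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "fscore": fscore,
    }

# (TP, FP, TN, FN) taken from the "Sum" row of the per-frame table.
SUMS = {
    "kNN-Euclidean": (13, 42, 427, 3),
    "SVM-Linear": (13, 15, 454, 3),
    "SVM-Quadratic": (14, 50, 419, 2),
}

if __name__ == "__main__":
    for name, counts in SUMS.items():
        metrics = scenario_metrics(*counts)
        print(name, {k: round(v, 4) for k, v in metrics.items()})
```

The low precision values despite high accuracy reflect the class imbalance of the scene: 16 pedestrians against 469 non-pedestrian samples, so even a small false-positive rate produces more false alarms than true detections.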
Figure 11. Pedestrian detection algorithm in a real traffic area.
Comparison of proposed method with other authors.
| Author/Year [Ref.] | MLA | Graphical Metric | Numeric Metric | Performance |
|---|---|---|---|---|
| Proposed 2016 | Linear SVM | ROC curve | LOOCV, AUC, sensitivity, specificity, precision, accuracy, Fscore | 0.0528, |
| Premebida 2014 [ | Deformable Part-based Model (DPM) | Precision-Recall curve of the areas of the pedestrian correctly identified. | Authors report a mean | 0.3950 |
| Spinello 2010 [ | Multiple AdaBoost classifiers | Precision-Recall curve, Equal Error Rates (EER) | Authors report a mean | 0.7760 |
| Navarro-Serment 2010 [ | Two SVMs in cascade | Precision-Recall and ROC curves | AUC estimate | 0.8500 |
| Ogawa 2011 [ | Interacting Multiple Model filter | Recognition rate | Authors report a mean | 0.8000 |
| Kidono 2011 [ | SVM | ROC curve | AUC estimate | 0.9000 |