Ruikang Liu, Qing Han, Weidong Min, Linghua Zhou, Jianqiang Xu.
Abstract
Vehicle Logo Recognition (VLR) is an important part of vehicle behavior analysis and can provide supplementary information for vehicle identification, which is an essential research topic in robotic systems. However, inaccurate extraction of vehicle logo candidate regions reduces the accuracy of logo recognition. Additionally, existing methods have low recognition rates for most small vehicle logos and perform poorly in complicated environments. To solve these problems, this paper proposes a VLR method based on enhanced matching, constrained region extraction and the SSFPD network. A constrained region extraction method based on segmentation of the car head and car tail is proposed to accurately extract the candidate region of the logo. An enhanced matching method is proposed to improve the detection performance for small objects; it augments each training image by copy-pasting small objects many times in the unconstrained region. A single deep neural network based on a reduced ResNeXt model and Feature Pyramid Networks is proposed, named the Single Shot Feature Pyramid Detector (SSFPD). The SSFPD uses the reduced ResNeXt to improve the classification performance of the network and to retain more detailed information for small-sized vehicle logo detection. Additionally, it uses the Feature Pyramid Networks module to bring in more semantic context information and build several high-level semantic feature maps, which effectively improves recognition performance. Extensive evaluations were carried out on self-collected and public vehicle logo datasets. The proposed method achieved 93.79% accuracy on the Common Vehicle Logos Dataset and 99.52% accuracy on another public dataset, outperforming the existing methods.
Keywords: SSFPD; constrained region; enhanced matching; robotic systems; vehicle logo recognition
Year: 2019 PMID: 31635231 PMCID: PMC6832326 DOI: 10.3390/s19204528
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Framework of the proposed method for vehicle logo recognition.
Results of different methods for car head and car tail detection.
| Methods | Car Head | Car Tail |
|---|---|---|
| SSD | 95.4% | 92.7% |
| YOLO | 96.1% | 93.5% |
| Faster R-CNN | 98.3% | 96.2% |
Figure 2. Our network for constrained region detection, including VGG16, a region proposal network and the Fast R-CNN detector.
Figure 3. Anchors of different scales matching the ground truth objects: (a) without the “copy-pasting” strategy; (b) with the “copy-pasting” strategy.
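The effect illustrated in Figure 3 can be sketched with a small counting experiment: the more ground-truth boxes an image contains, the more anchors reach the matching threshold. The grid layout, stride, anchor size and the IoU threshold of 0.5 below are illustrative assumptions, not settings taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grid_anchors(image_size, stride, anchor_size):
    """Square anchors of a single scale centred on a regular grid (assumed layout)."""
    half = anchor_size / 2.0
    anchors = []
    for cy in range(stride // 2, image_size, stride):
        for cx in range(stride // 2, image_size, stride):
            anchors.append((cx - half, cy - half, cx + half, cy + half))
    return anchors


def count_matched_anchors(anchors, gt_boxes, threshold=0.5):
    """Count anchors whose IoU with at least one ground-truth box reaches the threshold."""
    return sum(
        1 for anchor in anchors
        if any(iou(anchor, gt) >= threshold for gt in gt_boxes)
    )


# Hypothetical 300x300 image with one small 24x24 logo, then with three pasted copies.
anchors = grid_anchors(image_size=300, stride=16, anchor_size=32)
single_logo = [(92, 92, 116, 116)]
with_copies = single_logo + [(28, 28, 52, 52), (188, 188, 212, 212), (252, 92, 276, 116)]
print(count_matched_anchors(anchors, single_logo))
print(count_matched_anchors(anchors, with_copies))
```

In this toy setting each extra copy of the small logo contributes its own matched anchor, which is the intuition behind giving small objects more positive training samples.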
Figure 4. Process of constrained region segmentation and the copy-pasting strategy: (A) and (B) are original vehicle samples; (a) and (d) show the detected constrained regions; (b) and (e) are the cropped constrained regions; (c) and (f) show the images processed with the “copy-pasting” strategy.
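A minimal sketch of a copy-pasting augmentation of this kind is given below. It is not the authors' implementation: the image is a plain list of pixel rows, the function name `copy_paste_logo` is hypothetical, and pasting at random non-overlapping positions inside the given image is an assumption about how the copies are placed.

```python
import random


def boxes_overlap(a, b):
    """True if two (x1, y1, x2, y2) boxes with exclusive ends intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def copy_paste_logo(image, logo_box, copies, rng, max_tries=50):
    """Paste the logo patch `copies` times at random non-overlapping positions.

    image    -- list of rows, each row a list of pixel values
    logo_box -- (x1, y1, x2, y2) of the original logo, ends exclusive
    rng      -- a random.Random instance, so results are reproducible
    Returns the augmented image and all logo boxes (original first).
    """
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = logo_box
    w, h = x2 - x1, y2 - y1
    patch = [row[x1:x2] for row in image[y1:y2]]
    out = [row[:] for row in image]  # copy so the input image is left untouched
    boxes = [logo_box]
    for _ in range(copies):
        for _ in range(max_tries):
            nx = rng.randint(0, width - w)
            ny = rng.randint(0, height - h)
            candidate = (nx, ny, nx + w, ny + h)
            if not any(boxes_overlap(candidate, b) for b in boxes):
                for dy in range(h):
                    out[ny + dy][nx:nx + w] = patch[dy]
                boxes.append(candidate)
                break  # this copy is placed; move on to the next one
    return out, boxes


# Hypothetical 40x40 single-channel image with a 4x4 logo of value 9.
image = [[0] * 40 for _ in range(40)]
for y in range(2, 6):
    for x in range(2, 6):
        image[y][x] = 9
augmented, boxes = copy_paste_logo(image, (2, 2, 6, 6), copies=3, rng=random.Random(0))
```

Every pasted copy gets its own bounding box, so the annotation list grows together with the image content.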
Figure 5. The framework of the proposed SSFPD network.
Figure 6. Different building blocks: (a) a block of ResNet; (b) a block of ResNeXt with a cardinality of 32; (c) a block equivalent to (b).
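One way to see why forms (b) and (c) can be equivalent is to count convolution weights: 32 parallel narrow branches and a single grouped convolution with 32 groups hold the same number of parameters. The channel widths below (256 input channels, a 64-wide ResNet bottleneck, 32 branches of width 4) are the commonly cited ResNeXt template and are assumptions here; the reduced ResNeXt used in the paper may differ. Biases and batch normalization are ignored.

```python
def conv_params(in_ch, out_ch, kernel, groups=1):
    """Weight count of a 2D convolution (no bias); groups split the input channels."""
    return kernel * kernel * (in_ch // groups) * out_ch


def resnet_bottleneck_params(channels=256, width=64):
    """Form (a): 1x1 reduce, 3x3, 1x1 expand."""
    return (conv_params(channels, width, 1)
            + conv_params(width, width, 3)
            + conv_params(width, channels, 1))


def resnext_branches_params(channels=256, cardinality=32, branch_width=4):
    """Form (b): `cardinality` parallel branches, each a tiny bottleneck."""
    per_branch = (conv_params(channels, branch_width, 1)
                  + conv_params(branch_width, branch_width, 3)
                  + conv_params(branch_width, channels, 1))
    return cardinality * per_branch


def resnext_grouped_params(channels=256, cardinality=32, branch_width=4):
    """Form (c): one wide bottleneck whose 3x3 convolution is grouped."""
    width = cardinality * branch_width
    return (conv_params(channels, width, 1)
            + conv_params(width, width, 3, groups=cardinality)
            + conv_params(width, channels, 1))


print(resnet_bottleneck_params())
print(resnext_branches_params())
print(resnext_grouped_params())
```

Under these assumed widths the two ResNeXt forms have identical weight counts, and both stay close to the plain ResNet bottleneck.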
The architecture of ResNeXt-101.
| Layers | Output |
|---|---|
| pool | |
| resx1_eleswise to resx7_eleswise | |
| resx8_eleswise to resx30_eleswise | |
| resx30_eleswise to resx33_eleswise | |
Figure 7. A building block of FPN.
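The sketch below shows the merge step of a standard FPN building block: a 1x1 lateral convolution on the finer feature map, 2x nearest-neighbour upsampling of the coarser map, and element-wise addition. It assumes the block in Figure 7 follows this standard design, omits the 3x3 smoothing convolution that usually follows, and uses nested lists indexed as `[channel][row][column]` instead of a tensor library.

```python
def conv1x1(feature, weights):
    """1x1 convolution: feature is [C_in][H][W], weights is [C_out][C_in]."""
    height, width = len(feature[0]), len(feature[0][0])
    return [
        [
            [sum(w * feature[c][y][x] for c, w in enumerate(row)) for x in range(width)]
            for y in range(height)
        ]
        for row in weights
    ]


def upsample2x(feature):
    """Nearest-neighbour upsampling that doubles height and width."""
    return [
        [[row[x // 2] for x in range(2 * len(row))] for row in channel for _ in range(2)]
        for channel in feature
    ]


def fpn_merge(coarse, fine, lateral_weights):
    """Add the upsampled coarse map to the laterally projected fine map.

    The projected fine map must have the same channel count as `coarse`
    and exactly twice its spatial size.
    """
    lateral = conv1x1(fine, lateral_weights)
    upsampled = upsample2x(coarse)
    return [
        [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(upsampled, lateral)
    ]


# Hypothetical toy inputs: a 1-channel 1x1 coarse map and a 2-channel 2x2 fine map.
coarse = [[[1.0]]]
fine = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
merged = fpn_merge(coarse, fine, lateral_weights=[[1.0, 0.5]])
```

The merged map keeps the fine map's resolution while carrying information from the coarser, more semantic level.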
Figure 8. Samples of 13 logo classes.
Test image sets in various complex environments.
| Test Set | Conditions | Number of Images |
|---|---|---|
| CVLD_weather | Fog, Snow and Rain | 920 |
| CVLD_night | Night | 665 |
| CVLD_tilt | Tilt | 750 |
Figure 9. Some examples from the CVLD dataset: (a) fog; (b) snow; (c) rain; (d) day; (e) night; (f) tilt.
The resolution of each feature map used for prediction.
| Layer | Resolution |
|---|---|
| FM 1 | |
| FM 2 | |
| FM 3 | |
| FM 4 | |
| Resx30_elewise_relu/Conv3_2 | |
| Pool 2 | |
Performance comparison of different designs for SSFPD.
| SSFPD Architecture | Final Architecture | Without Copy-Pasting Strategy | Using Resx33_elewise, Without Resx7_elewise and Resx30_elewise | Without FPN |
|---|---|---|---|---|
| mAP | 93.79% | 91.69% | 89.96% | 86.47% |
Accuracy comparison on the CVLD dataset.
| Methods | Network | mAP | Testing Time | Memory | Input Resolution |
|---|---|---|---|---|---|
| SSD | VGG 16 | 79.2% | 23 ms | 110.7 M | |
| SSD | ResNeXt-101 | 85.7% | 45 ms | 133.1 M | |
| Faster R-CNN | VGG 16 | 81.9% | 30 ms | 217.9 M | |
| Faster R-CNN | ResNeXt-101 | 86.3% | 56 ms | 346.3 M | |
| YOLO v3 | DarkNet-53 | 82.7% | 20 ms | 226.6 M | |
| YOLO v3 | ResNeXt-101 | 89.8% | 49 ms | 346.3 M | |
| Pre-training CNN | --- | 88.9% | 21 ms | 88.6 M | |
| MTCNN | --- | 90.4% | 34 ms | 101.5 M | |
| Proposed method | ResNeXt-101 | 91.7% | 52 ms | 169.1 M | |
Accuracy comparison on public dataset.
| Methods | mAP | Testing Time |
|---|---|---|
| MFM | 94% | 1020 ms |
| M-SIFT | 94.6% | 816 ms |
| MTCNN | 98.76% | 35 ms |
| Pre-training CNN | 99.07% | 12 ms |
| Proposed method (SSFPD) | 99.26% | 52 ms |
| Proposed method (SSFPD + enhanced matching) | 99.52% | 108 ms |
Accuracy comparison under complex conditions.
| Testing Set | M-SIFT Accuracy | Pre-training CNN Accuracy | Proposed Method Accuracy |
|---|---|---|---|
| CVLD_weather | 74.8% | 77.4% | 80.6% |
| CVLD_night | 77.6% | 82.1% | 84.0% |
| CVLD_tilt | 79.9% | 83.7% | 86.5% |
Performance on different testing sets.
| Testing Set | Real Result | Predicted Positive | Predicted Negative | Recall | Precision | Accuracy |
|---|---|---|---|---|---|---|
| CVLD_weather | True | 764 | 18 | 83.04% | 95.98% | 80.61% |
| CVLD_weather | False | 32 | 156 | | | |
| CVLD_night | True | 579 | 12 | 87.06% | 95.70% | 84.06% |
| CVLD_night | False | 26 | 86 | | | |
| CVLD_tilt | True | 663 | 15 | 88.4% | 97.36% | 86.59% |
| CVLD_tilt | False | 18 | 87 | | | |
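The recall, precision and accuracy columns can be recomputed from the four counts of each test set. The sketch below assumes that the "True" row holds true positives and true negatives and the "False" row holds false positives and false negatives; this reading is inferred from the numbers, since it reproduces the reported percentages to within rounding, and is not stated in the table itself.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (recall, precision, accuracy) as percentages."""
    recall = 100.0 * tp / (tp + fn)
    precision = 100.0 * tp / (tp + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return recall, precision, accuracy


# Counts copied from the table; the TP/TN/FP/FN assignment is the assumption above.
COUNTS = {
    "CVLD_weather": {"tp": 764, "tn": 18, "fp": 32, "fn": 156},
    "CVLD_night": {"tp": 579, "tn": 12, "fp": 26, "fn": 86},
    "CVLD_tilt": {"tp": 663, "tn": 15, "fp": 18, "fn": 87},
}

for name, counts in COUNTS.items():
    recall, precision, accuracy = detection_metrics(**counts)
    print(f"{name}: recall={recall:.2f}% precision={precision:.2f}% accuracy={accuracy:.2f}%")
```

Under this reading, the sum of true positives and false negatives equals the number of images in each test set (920, 665 and 750).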
Figure 10. Successful examples under complicated environments.