Juntong Yun1,2, Du Jiang1,3,4, Ying Liu2,4, Ying Sun1,3,4, Bo Tao1,3,4, Jianyi Kong2,3,4, Jinrong Tian1,2, Xiliang Tong2,4, Manman Xu1,2,3, Zifan Fang5.
Abstract
The continuous development of deep learning improves target detection technology day by day. Current research focuses on improving detection accuracy, which results in target detection models that are too large. The parameter count and detection speed of a target detection model are critical for the practical application of target detection technology in embedded systems. This article proposes a real-time target detection method based on a lightweight convolutional neural network that reduces the number of model parameters and improves detection speed. A depthwise separable residual module is constructed by combining depthwise separable convolution with the non-bottleneck residual module, and this module, together with depthwise separable convolution structures, replaces the VGG backbone of the SSD network for feature extraction, reducing the parameter count and improving detection speed. At the same time, pairs of 1 × 3 and 3 × 1 convolution kernels replace the standard 3 × 3 convolutions used to obtain the multiple detection feature maps of SSD, and the real-time target detection model based on a lightweight convolutional neural network is established by fusing the information of these detection feature maps. Comparative experiments on a self-built target detection dataset of complex scenes verify the effectiveness and superiority of the proposed method. The model is tested on video to verify its real-time performance and is deployed on the Android platform to verify its scalability.
Keywords: Deep learning; MobileNets-SSD; depthwise separable convolution; residual module; target detection
Year: 2022 PMID: 36051585 PMCID: PMC9426345 DOI: 10.3389/fbioe.2022.861286
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
FIGURE 1SSD network structure.
FIGURE 2MobileNet-SSD network structure.
FIGURE 3Standard convolution and depthwise separable convolution. (A) Standard convolution. (B) Depthwise separable convolution.
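The factorization in Figure 3 can be sketched in plain NumPy (a minimal illustration, not the paper's code): a standard convolution is replaced by a per-channel depthwise convolution followed by a 1 × 1 pointwise convolution that mixes channels.

```python
import numpy as np

def depthwise_separable_conv(x, dw_k, pw_k):
    """Depthwise separable convolution, stride 1, 'valid' padding.

    x:    input feature map, shape (H, W, C_in)
    dw_k: one k x k depthwise filter per input channel, shape (k, k, C_in)
    pw_k: 1 x 1 pointwise mixing weights, shape (C_in, C_out)
    """
    H, W, C = x.shape
    k = dw_k.shape[0]
    Ho, Wo = H - k + 1, W - k + 1
    # Depthwise step: filter each input channel independently.
    dw = np.empty((Ho, Wo, C))
    for c in range(C):
        for i in range(Ho):
            for j in range(Wo):
                dw[i, j, c] = np.sum(x[i:i + k, j:j + k, c] * dw_k[:, :, c])
    # Pointwise step: a 1 x 1 convolution mixes channels at each position.
    return dw @ pw_k

# All-ones toy input: each depthwise output is the 3 x 3 window sum (9),
# and the pointwise step sums the 2 channels, giving 18 everywhere.
x = np.ones((5, 5, 2))
out = depthwise_separable_conv(x, np.ones((3, 3, 2)), np.ones((2, 4)))
print(out.shape)  # (3, 3, 4)
```

In a real model the loops would be a framework call (e.g. Keras `SeparableConv2D`); the point here is only the two-step structure of the operation.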
FIGURE 4Residual learning.
FIGURE 5Two types of residual modules. (A) No-bottleneck residual module. (B) Bottleneck residual module.
FIGURE 6The depthwise separable residual module structure.
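A hedged sketch of the module in Figure 6, assuming the layout suggested by the text (two stacked depthwise separable 3 × 3 convolutions on the residual branch, added to the identity shortcut; the paper's exact placement of batch normalization is not reproduced):

```python
import numpy as np

def ds_conv_same(x, dw_k, pw_k):
    """Depthwise separable conv, stride 1, zero-padded 'same' output size."""
    H, W, C = x.shape
    k = dw_k.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    dw = np.empty((H, W, C))
    for c in range(C):
        for i in range(H):
            for j in range(W):
                dw[i, j, c] = np.sum(xp[i:i + k, j:j + k, c] * dw_k[:, :, c])
    return dw @ pw_k  # 1 x 1 pointwise mixing

def ds_residual_block(x, dw1, pw1, dw2, pw2):
    """Non-bottleneck residual block built from two depthwise separable
    3 x 3 convolutions: y = relu(x + F(x)), with a ReLU between the convs."""
    f = np.maximum(ds_conv_same(x, dw1, pw1), 0.0)  # first DS conv + ReLU
    f = ds_conv_same(f, dw2, pw2)                   # second DS conv
    return np.maximum(x + f, 0.0)                   # shortcut add + ReLU

# With all-zero weights the residual branch vanishes and the block
# reduces to the identity shortcut (for non-negative inputs).
x = np.random.rand(6, 6, 4)
y = ds_residual_block(x, np.zeros((3, 3, 4)), np.zeros((4, 4)),
                      np.zeros((3, 3, 4)), np.zeros((4, 4)))
```

Because the branch uses 'same' padding and stride 1, the shortcut addition needs no projection, matching the equal input/output channel counts in the DS-Res rows of the architecture table below.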
Parameter counts of the different residual modules.
| In_Out_C | Bt (K) | Non-Bt (K) | DS-Bt (K) | DS-non-Bt (K) |
|---|---|---|---|---|
| 64 | 4.35 | 36.86 | 2.77 | 4.67 |
| 256 | 69.63 | 589.82 | 35.65 | 67.84 |
The structure of a real-time target detection algorithm based on a lightweight convolutional neural network.
| Network layer | Output size | Kernel size, channels | Stride |
|---|---|---|---|
| Input | 300 × 300 × 3 | — | — |
| Conv1 | 150 × 150 × 32 | 3 × 3,32 | 2 |
| DW1 | 150 × 150 × 64 | 3 × 3,64 | 1 |
| DS-Res2 | 150 × 150 × 64 | 3 × 3,64 3 × 3,64 | 1 |
| DW3 | 75 × 75 × 128 | 1 × 1,128 | 2 |
| DW4 | 75 × 75 × 128 | 1 × 1,128 | 1 |
| DS-Res5 | 75 × 75 × 128 | 3 × 3,128 3 × 3,128 | 1 |
| DW6 | 38 × 38 × 256 | 1 × 1,256 | 2 |
| DW7 | 38 × 38 × 256 | 1 × 1,256 | 1 |
| DS-Res8 | 38 × 38 × 256 | 3 × 3,256 3 × 3,256 | 1 |
| DW9 | 19 × 19 × 512 | 1 × 1,512 | 2 |
| DW (10–14) | 19 × 19 × 512 | (1 × 1,256)×5 | 1 |
| DS-Res15 | 19 × 19 × 512 | 3 × 3,512 3 × 3,512 | 1 |
| DW16 | 10 × 10 × 1024 | 1 × 1,1024 | 2 |
| DW17 | 10 × 10 × 1024 | 1 × 1,1024 | 1 |
| Conv2 | 10 × 10 × 256 | 1 × 1,256 | 1 |
| Alter Conv1 | 5 × 5 × 256 | 3 × 3,256 | 2 |
| Conv3 | 5 × 5 × 128 | 1 × 1,128 | 1 |
| Alter Conv2 | 3 × 3 × 256 | 3 × 3,256 | 2 |
| Conv4 | 3 × 3 × 128 | 1 × 1,128 | 1 |
| Alter Conv3 | 2 × 2 × 256 | 3 × 3,256 | 2 |
| Conv5 | 2 × 2 × 64 | 1 × 1,64 | 1 |
| Alter Conv4 | 1 × 1 × 128 | 3 × 3,128 | 2 |
Parameters related to the experimental environment.
| Category name | Parameter |
|---|---|
| Operating system | Windows 10 |
| CPU | AMD Ryzen 7 |
| GPU | NVIDIA GeForce RTX 2070 |
| CUDA/cuDNN | 10.0/7.6.5 |
| Python | 3.6 |
| TensorFlow/Keras | 1.13.2/2.1.5 |
| OpenCV | 4.5.1 |
FIGURE 7Color images of different angles, backgrounds, and lighting.
FIGURE 8Training of trial target detection model based on a lightweight convolutional neural network. (A) Training set loss. (B) Validation set loss.
FIGURE 9Comparison of detection accuracy between SSD and lightweight target detection algorithms for various classes. (A) SSD. (B) Improved MobileNet-SSD. (C) MobileNet-SSD. (D) Tiny-YOLOv3.
Performance comparison between SSD and lightweight target detection algorithms.
| Algorithm | mAP (%) | FPS | Model size (MB) | Training time (min) |
|---|---|---|---|---|
| SSD | 87.13 | 26 | 93.2 | 37 |
| Improved MobileNet-SSD | 87.33 | 47 | 27.3 | 12.6 |
| Tiny-YOLOv3 | 66.57 | 52 | 33.2 | 15.3 |
| MobileNet-SSD | 67.02 | 62 | 26.8 | 11.4 |
FIGURE 10Comparison of detection effects between SSD and the lightweight target detection model. (A) SSD. (B) Improved MobileNet-SSD. (C) MobileNet-SSD. (D) Tiny-YOLOv3.
FIGURE 11Detection effect of real-time detection model on video.
FIGURE 12Deployment of real-time detection model on the Android platform.