Byung-Gil Han, Joon-Goo Lee, Kil-Taek Lim, Doo-Hyun Choi.
Abstract
As convolutional neural network (CNN)-based object detection is applied more widely, research on light-weight CNN models that can run in real time on edge-computing devices is also increasing. This paper proposes scalable convolutional blocks with which the CNN of a You Only Look Once (YOLO) detector can be designed easily: simply by exchanging the proposed blocks, the network can be balanced in processing speed and accuracy for target edge-computing devices of differing performance. The maximum number of kernels per convolutional layer was determined through simple but intuitive speed comparison tests on the three edge-computing devices under consideration. The scalable convolutional blocks were designed within this limit on the number of kernels so that objects can be detected in real time on these devices. Three scalable and fast YOLO detectors (SF-YOLO), designed using the proposed blocks, were compared in processing speed and accuracy with several conventional light-weight YOLO detectors on the edge-computing devices. Compared with YOLOv3-tiny, SF-YOLO was 2 times faster at the same accuracy, and its processing speed was 48% higher than that of YOLOv3-tiny-PRN, a model designed to improve processing speed. Even the large SF-YOLO model, which focuses on accuracy, achieved 10% faster processing than the YOLOv4-tiny model, with a better accuracy of 40.4% mAP@0.5 on the MS COCO dataset.
Keywords: SF-YOLO; edge-computing; light-weight; mobile device; object detector; scalable
Year: 2020 PMID: 33260957 PMCID: PMC7729998 DOI: 10.3390/s20236779
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Processing speed comparison by the number of kernels on edge-computing devices.
Figure 2. Block diagram of YOLOv3-tiny architecture.
Description of YOLOv3-tiny architecture.
| No. | Layer Type | Filters | Size/Stride | Output |
|---|---|---|---|---|
| - | input image | - | - | 416 × 416 × 3 |
| 0 | conv. | 16 | 3 × 3/1 | 416 × 416 × 16 |
| 1 | maxpool | - | 2 × 2/2 | 208 × 208 × 16 |
| 2 | conv. | 32 | 3 × 3/1 | 208 × 208 × 32 |
| 3 | maxpool | - | 2 × 2/2 | 104 × 104 × 32 |
| 4 | conv. | 64 | 3 × 3/1 | 104 × 104 × 64 |
| 5 | maxpool | - | 2 × 2/2 | 52 × 52 × 64 |
| 6 | conv. | 128 | 3 × 3/1 | 52 × 52 × 128 |
| 7 | maxpool | - | 2 × 2/2 | 26 × 26 × 128 |
| 8 | conv. | 256 | 3 × 3/1 | 26 × 26 × 256 |
| 9 | maxpool | - | 2 × 2/2 | 13 × 13 × 256 |
| 10 | conv. | 512 | 3 × 3/1 | 13 × 13 × 512 |
| 11 | maxpool | - | 2 × 2/1 | 13 × 13 × 512 |
| 12 | conv. | 1024 | 3 × 3/1 | 13 × 13 × 1024 |
| 13 | conv. | 256 | 1 × 1/1 | 13 × 13 × 256 |
| 14 | conv. | 512 | 3 × 3/1 | 13 × 13 × 512 |
| 15 | conv. | 255 | 1 × 1/1 | 13 × 13 × 255 |
| 16 | YOLO | - | - | - |
| 17 | route 13 | - | - | 13 × 13 × 256 |
| 18 | conv. | 128 | 1 × 1/1 | 13 × 13 × 128 |
| 19 | up-sample | - | - | 26 × 26 × 128 |
| 20 | route 19, 8 | - | - | 26 × 26 × 384 |
| 21 | conv. | 256 | 3 × 3/1 | 26 × 26 × 256 |
| 22 | conv. | 255 | 1 × 1/1 | 26 × 26 × 255 |
| 23 | YOLO | - | - | - |
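
The output column of the YOLOv3-tiny table can be reproduced with simple shape arithmetic. The sketch below is illustrative only: the helper names (`conv`, `maxpool`, `upsample`, `route`, `trace_yolov3_tiny`) are hypothetical, and it assumes padded convolutions and pooling, so that only the stride changes the spatial size, and a 2× up-sampling layer. The 255 filters of the detection layers follow from the usual YOLOv3 head layout on MS COCO: 3 anchors × (80 classes + 4 box offsets + 1 objectness score).

```python
import math

NUM_ANCHORS = 3   # anchor boxes per detection scale in a YOLOv3-style head
NUM_CLASSES = 80  # MS COCO object categories
HEAD_FILTERS = NUM_ANCHORS * (NUM_CLASSES + 5)  # 4 box offsets + 1 objectness

def conv(shape, filters, stride=1):
    """Padded convolution: only the stride changes the spatial size."""
    h, w, _ = shape
    return (math.ceil(h / stride), math.ceil(w / stride), filters)

def maxpool(shape, stride):
    """2 x 2 max pooling, padded so that stride 1 keeps the size (layer 11)."""
    h, w, c = shape
    return (math.ceil(h / stride), math.ceil(w / stride), c)

def upsample(shape, factor=2):
    """Nearest-neighbour style up-sampling: spatial size grows, channels stay."""
    h, w, c = shape
    return (h * factor, w * factor, c)

def route(*shapes):
    """Concatenate feature maps of equal spatial size along the channel axis."""
    h, w, _ = shapes[0]
    assert all(s[:2] == (h, w) for s in shapes), "spatial sizes must match"
    return (h, w, sum(s[2] for s in shapes))

def trace_yolov3_tiny(input_shape=(416, 416, 3)):
    """Return {layer number: output shape} for the layers listed in the table."""
    out = {}
    x = input_shape
    n = 0
    backbone = [16, 32, 64, 128, 256, 512]
    for i, filters in enumerate(backbone):
        x = conv(x, filters)
        out[n] = x
        n += 1
        # The last pooling layer (layer 11) uses stride 1 and keeps 13 x 13.
        x = maxpool(x, stride=1 if i == len(backbone) - 1 else 2)
        out[n] = x
        n += 1
    for filters in (1024, 256, 512, HEAD_FILTERS):  # layers 12-15
        x = conv(x, filters)
        out[n] = x
        n += 1
    # Layer 16 is the first YOLO layer; the second branch starts from layer 13.
    out[17] = route(out[13])
    out[18] = conv(out[17], 128)
    out[19] = upsample(out[18])
    out[20] = route(out[19], out[8])   # 128 + 256 = 384 channels
    out[21] = conv(out[20], 256)
    out[22] = conv(out[21], HEAD_FILTERS)
    return out

shapes = trace_yolov3_tiny()
for layer in sorted(shapes):
    print(layer, shapes[layer])
```

The route at layer 20 is where the two scales meet: the up-sampled 26 × 26 × 128 map is concatenated with the 26 × 26 × 256 output of layer 8, which gives the 384 channels listed in the table.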
Figure 3. Scalable convolutional blocks.
Figure 4. Block diagram of SF-YOLO (medium) architecture.
Description of SF-YOLO (medium) architecture.
| No. | Layer Type | Filters | Size/Stride | Output |
|---|---|---|---|---|
| - | input image | - | - | 416 × 416 × 3 |
| 0 | conv. | 32 | 3 × 3/2 | 208 × 208 × 32 |
| 1 | conv. | 32 | 3 × 3/2 | 104 × 104 × 32 |
| 2 | conv. | 32 | 3 × 3/2 | 52 × 52 × 32 |
| 3–13 | Dense-Res. Block | 32 + 32, 64 | - | 52 × 52 × 64 |
| 14 | maxpool | - | 2 × 2/2 | 26 × 26 × 64 |
| 15–25 | Dense-Res. Block | 64 + 64, 128 | - | 26 × 26 × 128 |
| 26 | maxpool | - | 2 × 2/2 | 13 × 13 × 128 |
| 27–37 | Dense-Res. Block | 128 + 128, 256 | - | 13 × 13 × 256 |
| 38–44 | Recursive Block | 128, 64 | - | 52 × 52 × 64 |
| 45–48 | Residual Block | 64 | - | 52 × 52 × 64 |
| 49 | maxpool | - | 2 × 2/2 | 26 × 26 × 64 |
| 50–60 | Dense-Res. Block | 64 + 64, 128 | - | 26 × 26 × 128 |
| 61 | maxpool | - | 2 × 2/2 | 13 × 13 × 128 |
| 62–72 | Dense-Res. Block | 128 + 128, 256 | - | 13 × 13 × 256 |
| 73 | conv. | 255 | 1 × 1/1 | 13 × 13 × 255 |
| 74 | YOLO | - | - | - |
| 75 | route 60 | - | - | 26 × 26 × 128 |
| 76 | conv. | 255 | 1 × 1/1 | 26 × 26 × 255 |
| 77 | YOLO | - | - | - |
Figure 5. Expansion of SF-YOLO by replacing the scalable convolutional blocks.
Processing speed comparison with state-of-the-art light-weight YOLO detectors on the edge-computing devices, using the COCO test-dev2017 dataset.
| Model | GFLOPs | mAP@0.5 (%) | NANO (FPS) | TX2 (FPS) | NX (FPS) | RTX (FPS) | i3 (FPS) | i5 (FPS) | i7 (FPS) | i9 (FPS) |
|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv3-tiny | 5.57 | 33.1 | 17 | 45 | 49 | 166 | 20 | 25 | 26 | 74 |
| YOLOv3-tiny-prn | 3.47 | 33.1 | 23 | 58 | 55 | 166 | 27 | 31 | 33 | 94 |
| YOLOv4-tiny | 6.91 | 40.2 | 18 | 44 | 46 | 165 | 16 | 22 | 23 | 64 |
| SF-YOLO (small) | 2.59 | 33.1 | 34 | 77 | 77 | 170 | 32 | 39 | 41 | 118 |
| SF-YOLO (medium) | 3.69 | 37.7 | 26 | 62 | 70 | 168 | 26 | 33 | 33 | 96 |
| SF-YOLO (large) | 5.02 | 40.4 | 20 | 49 | 58 | 165 | 21 | 27 | 28 | 79 |
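
The speed-ups quoted in the abstract can be checked against the frame rates in this table. The sketch below assumes that those figures correspond to the NANO column, which the numbers suggest but the text does not state; the dictionary keys and the helper names `speedup` and `percent_faster` are hypothetical.

```python
# Frames per second on the NANO device, copied from the table above.
nano_fps = {
    "YOLOv3-tiny": 17,
    "YOLOv3-tiny-prn": 23,
    "YOLOv4-tiny": 18,
    "SF-YOLO small": 34,
    "SF-YOLO medium": 26,
    "SF-YOLO large": 20,
}

def speedup(model, baseline, fps=nano_fps):
    """Ratio of frame rates: values above 1 mean `model` is faster."""
    return fps[model] / fps[baseline]

def percent_faster(model, baseline, fps=nano_fps):
    """Relative gain in frame rate over the baseline, in percent."""
    return 100.0 * (speedup(model, baseline, fps) - 1.0)

pairs = [
    ("SF-YOLO small", "YOLOv3-tiny"),
    ("SF-YOLO small", "YOLOv3-tiny-prn"),
    ("SF-YOLO large", "YOLOv4-tiny"),
]
for model, baseline in pairs:
    print(f"{model} vs {baseline}: "
          f"{speedup(model, baseline):.2f}x, "
          f"{percent_faster(model, baseline):.1f}% faster")
```

Under this reading, the small model doubles the YOLOv3-tiny frame rate (34 vs. 17 FPS) and is about 48% faster than YOLOv3-tiny-prn (34 vs. 23 FPS), while the large model is about 11% faster than YOLOv4-tiny (20 vs. 18 FPS), in line with the roughly 10% quoted in the abstract.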
Comparison of GPU profile information.
| Model | GFLOPs/s (NANO) | GFLOPs/s (TX2) | GFLOPs/s (NX) | Memory (MB, NANO) | Memory (MB, TX2) | Memory (MB, NX) | Power (mW, NANO) | Power (mW, TX2) | Power (mW, NX) |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv3-tiny | 94.69 | 250.65 | 272.93 | 966 | 1044 | 1265 | 4217 | 6970 | 9517 |
| YOLOv3-tiny-prn | 79.81 | 201.26 | 190.85 | 921 | 1012 | 1221 | 3759 | 6537 | 9590 |
| YOLOv4-tiny | 124.38 | 304.04 | 317.86 | 1018 | 1053 | 1291 | 3867 | 7151 | 9696 |
| SF-YOLO (small) | 88.06 | 199.43 | 199.43 | 899 | 930 | 1184 | 3520 | 6011 | 7363 |
| SF-YOLO (medium) | 95.94 | 228.78 | 258.3 | 926 | 960 | 1210 | 3737 | 6513 | 8221 |
| SF-YOLO (large) | 100.4 | 245.98 | 291.16 | 953 | 988 | 1244 | 3815 | 6825 | 8587 |
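
The GFLOPs/s columns appear to be the per-frame GFLOPs of each model multiplied by its measured frame rate on the same device. This relationship is inferred from the numbers in the two tables and is not stated in the text; the sketch below simply checks it, using hypothetical names (`per_frame_gflops`, `fps`, `reported`, `throughput`).

```python
# Per-frame cost (GFLOPs) from the speed comparison table.
per_frame_gflops = {
    "YOLOv3-tiny": 5.57, "YOLOv3-tiny-prn": 3.47, "YOLOv4-tiny": 6.91,
    "small": 2.59, "medium": 3.69, "large": 5.02,
}

# Frames per second on (NANO, TX2, NX) from the speed comparison table.
fps = {
    "YOLOv3-tiny": (17, 45, 49), "YOLOv3-tiny-prn": (23, 58, 55),
    "YOLOv4-tiny": (18, 44, 46),
    "small": (34, 77, 77), "medium": (26, 62, 70), "large": (20, 49, 58),
}

# GFLOPs/s on (NANO, TX2, NX) as reported in the GPU profile table.
reported = {
    "YOLOv3-tiny": (94.69, 250.65, 272.93),
    "YOLOv3-tiny-prn": (79.81, 201.26, 190.85),
    "YOLOv4-tiny": (124.38, 304.04, 317.86),
    "small": (88.06, 199.43, 199.43),
    "medium": (95.94, 228.78, 258.3),
    "large": (100.4, 245.98, 291.16),
}

def throughput(model):
    """Assumed relation: GFLOPs/s = GFLOPs per frame x frames per second."""
    return tuple(per_frame_gflops[model] * f for f in fps[model])

# Largest gap between the assumed relation and the reported values.
max_error = max(
    abs(computed - given)
    for model in reported
    for computed, given in zip(throughput(model), reported[model])
)
print(f"largest deviation from reported GFLOPs/s: {max_error:.4f}")
```

Read this way, the column measures how much arithmetic a device sustains per second while running a model, so a lighter model can show a lower GFLOPs/s figure and still deliver a higher frame rate.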
Comparison between various conditions of SF-YOLO.
| Model | GFLOPs | mAP@0.5 (%) | NANO (FPS) | TX2 (FPS) | NX (FPS) | RTX (FPS) | i3 (FPS) | i5 (FPS) | i7 (FPS) | i9 (FPS) |
|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | |
| small | 1.53 | 29.5 | 47 | 82 | 84 | 220 | 56 | 66 | 67 | 183 |
| medium | 2.19 | 33.6 | 35 | 71 | 84 | 218 | 43 | 54 | 55 | 147 |
| large | 2.97 | 35.8 | 27 | 54 | 83 | 217 | 35 | 44 | 46 | 121 |
| | | | | | | | | | | |
| small | 5.52 | 34.6 | 17 | 41 | 44 | 154 | 15 | 19 | 19 | 60 |
| medium | 7.89 | 40.6 | 13 | 31 | 36 | 153 | 12 | 16 | 16 | 48 |
| large | 10.73 | 43.6 | 10 | 25 | 30 | 152 | 10 | 13 | 13 | 40 |
| | | | | | | | | | | |
| small | 1.58 | 31.3 | 43 | 70 | 84 | 215 | 24 | 27 | 27 | 102 |
| medium | 2.24 | 35.4 | 33 | 60 | 84 | 217 | 22 | 25 | 25 | 88 |
| large | 3.02 | 37.6 | 26 | 49 | 82 | 212 | 20 | 23 | 23 | 78 |
| | | | | | | | | | | |
| small | 2.67 | 34.8 | 31 | 62 | 76 | 168 | 15 | 16 | 16 | 63 |
| medium | 3.78 | 39.3 | 24 | 51 | 65 | 165 | 13 | 15 | 15 | 56 |
| large | 5.11 | 41.3 | 19 | 42 | 54 | 164 | 12 | 14 | 14 | 47 |
| | | | | | | | | | | |
| small | 5.71 | 36.0 | 15 | 31 | 40 | 152 | 7 | 8 | 8 | 30 |
| medium | 8.08 | 41.2 | 12 | 28 | 33 | 152 | 6 | 7 | 7 | 27 |
| large | 10.92 | 44.2 | 9 | 21 | 28 | 151 | 5 | 6 | 6 | 24 |