Longlong Liao, Kenli Li, Canqun Yang, Jie Liu.
Abstract
As measurement rates grow, most Compressive Sensing (CS) methods incur higher overheads for transmitting and storing CS measurements, while reconstruction quality degrades appreciably as measurement rates decrease. To address these problems in real scenarios such as large-scale distributed surveillance systems, we propose a low-cost image CS approach for object detection called MRCS. It predicts key objects using the proposed MYOLO3 detector, and then samples the regions of the key objects and the other regions at different measurement rates to reduce the size of the sampled CS measurements. It also stores and transmits half-precision CS measurements to further reduce the required transmission bandwidth and storage space. Comprehensive evaluations demonstrate that MYOLO3 is a smaller and improved object detector for resource-limited hardware devices such as surveillance cameras and aerial drones. They also suggest that MRCS significantly reduces the required transmission bandwidth and storage space by reducing the size of CS measurements, e.g., the mean Compression Ratio (mCR) reaches 1.43–22.92 on the VOC-pbc dataset. Notably, MRCS further reduces the size of CS measurements by using half-precision representations, which halves the required transmission bandwidth and storage space compared to counterparts represented with single-precision floats. Moreover, using half-precision CS measurements and multiple measurement rates substantially enhances the usability of object detection on reconstructed images compared to using a single low measurement rate.
Keywords: compression ratio; compressive sensing; half-precision float; multiple measurement rates; object detection
Year: 2019 PMID: 31060279 PMCID: PMC6539613 DOI: 10.3390/s19092079
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Illustration of the proposed MRCS. The main task of MRCS is CS sampling of natural images with multiple Measurement Rates (MRs), e.g., 0.25 and 0.01. denotes CS measurements are represented with half-precision floats.
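The sampling scheme in Figure 1 can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: it assumes block-based sampling with a seeded random Gaussian measurement matrix, an 8 × 8 block size, and pixel values normalized to [0, 1]. The helper names (`measurement_matrix`, `sample_block`, `pack_half`, `pack_single`) are hypothetical. It samples one key-object block at MR 0.25 and one background block at MR 0.01, then serializes the measurements as half-precision floats.

```python
import math
import random
import struct


def measurement_matrix(m, n, seed=0):
    """Build an m x n random Gaussian matrix; the seed lets a decoder rebuild it."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def sample_block(block, rate, seed=0):
    """Compress a flattened pixel block to about rate * len(block) measurements."""
    n = len(block)
    m = max(1, math.ceil(rate * n))  # keep at least one measurement per block
    phi = measurement_matrix(m, n, seed)
    return [sum(p * x for p, x in zip(row, block)) for row in phi]


def pack_half(values):
    """Serialize values as IEEE 754 half-precision floats (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def pack_single(values):
    """Serialize values as IEEE 754 single-precision floats (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


# Hypothetical 8 x 8 blocks with pixel values normalized to [0, 1] (assumption).
key_block = [((i * 37) % 256) / 255.0 for i in range(64)]
background_block = [((i * 11) % 256) / 255.0 for i in range(64)]

key_meas = sample_block(key_block, rate=0.25)         # 16 measurements
bg_meas = sample_block(background_block, rate=0.01)   # 1 measurement

payload_half = pack_half(key_meas + bg_meas)
payload_single = pack_single(key_meas + bg_meas)
print(len(payload_half), len(payload_single))  # 34 68
```

Under these assumptions the half-precision payload occupies exactly half the bytes of the single-precision payload, which is the saving the abstract describes.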
The architecture of MYOLO3. Each line describes a sequence of one or more identical layers, repeated n times. All layers in the same line have the same number of output channels. s denotes the stride used by each layer. t denotes the expansion factor used to resize input feature maps described by their height (H), width (W), and number of input channels. The output-channel entries of the detection convolutions (layers 63, 78, and 93) depend on the number of detected object classes.
| Layers | Operation Type | Input | Filter Size | s | t | Output Channels | n |
|---|---|---|---|---|---|---|---|
| 1 | Convolution | 416 × 416 × 3 | 3 × 3 | 2 | - | 32 | 1 |
| 2–4 | Bottleneck residual block | 208 × 208 × 32 | - | 1 | 1 | 16 | 1 |
| 5–7 | Bottleneck residual block | 208 × 208 × 16 | - | 2 | 6 | 24 | 1 |
| 8–10 | Bottleneck residual block | 104 × 104 × 24 | - | 1 | 6 | 24 | 1 |
| 11–13 | Bottleneck residual block | 104 × 104 × 24 | - | 2 | 6 | 32 | 1 |
| 14–19 | Bottleneck residual block | 52 × 52 × 32 | - | 1 | 6 | 32 | 2 |
| 20–22 | Bottleneck residual block | 52 × 52 × 32 | - | 2 | 6 | 64 | 1 |
| 23–31 | Bottleneck residual block | 26 × 26 × 64 | - | 1 | 6 | 64 | 3 |
| 32–34 | Bottleneck residual block | 26 × 26 × 64 | - | 1 | 6 | 96 | 1 |
| 35–40 | Bottleneck residual block | 26 × 26 × 96 | - | 1 | 6 | 96 | 2 |
| 41–43 | Bottleneck residual block | 26 × 26 × 96 | - | 2 | 6 | 160 | 1 |
| 44–49 | Bottleneck residual block | 13 × 13 × 160 | - | 1 | 6 | 160 | 2 |
| 50–52 | Bottleneck residual block | 13 × 13 × 160 | - | 1 | 6 | 320 | 1 |
| 53 | Convolution | 13 × 13 × 320 | 1 × 1 | 1 | - | 1280 | 1 |
| 54 | Convolution | 13 × 13 × 1280 | 1 × 1 | 1 | - | 512 | 1 |
| 55–62 | Depthwise separable convolution | 13 × 13 × 512 | - | 1 | - | 512 | 4 |
| 63 | Convolution | 13 × 13 × 512 | 1 × 1 | 1 | - | 75 | 1 |
| 64 | YOLO | - | - | - | - | - | 1 |
| 65 | Route | 60 | - | - | - | - | 1 |
| 66 | Convolution | 13 × 13 × 512 | 1 × 1 | 1 | - | 256 | 1 |
| 67 | Nearest neighbor upsampling | - | - | 2 | - | - | 1 |
| 68 | Route | 67, 40 | - | - | - | - | 1 |
| 69 | Convolution | 26 × 26 × 352 | 1 × 1 | 1 | - | 256 | 1 |
| 70–77 | Depthwise separable convolution | 26 × 26 × 256 | - | 1 | - | 256 | 4 |
| 78 | Convolution | 26 × 26 × 256 | 1 × 1 | 1 | - | 75 | 1 |
| 79 | YOLO | - | - | - | - | - | 1 |
| 80 | Route | 75 | - | - | - | - | 1 |
| 81 | Convolution | 26 × 26 × 256 | 1 × 1 | 1 | - | 128 | 1 |
| 82 | Nearest neighbor upsampling | - | - | 2 | - | - | 1 |
| 83 | Route | 82, 19 | - | - | - | - | 1 |
| 84 | Convolution | 52 × 52 × 160 | 1 × 1 | 1 | - | 128 | 1 |
| 85–92 | Depthwise separable convolution | 52 × 52 × 128 | - | 1 | - | 128 | 4 |
| 93 | Convolution | 52 × 52 × 128 | 1 × 1 | 1 | - | 75 | 1 |
| 94 | YOLO | - | - | - | - | - | 1 |
Figure 2. Structure of the bottleneck convolution blocks, which transform input channels into output channels with an expansion factor t. Left: a block whose depthwise convolution has stride = 1. Right: a block whose depthwise convolution has stride = 2.
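To make the roles of the expansion factor t and the stride concrete, the shape bookkeeping of one bottleneck block can be sketched as follows. This assumes the MobileNetV2-style layout (pointwise expansion, depthwise convolution, pointwise projection), which the caption and the architecture table suggest but which is not spelled out here. It also assumes padding that divides the spatial size by the stride, and a shortcut only when the stride is 1 and the channel counts match. The function name and dictionary keys are hypothetical.

```python
import math


def bottleneck_shapes(h, w, c_in, t, c_out, stride):
    """Trace (height, width, channels) through one assumed bottleneck block."""
    expanded_channels = t * c_in                # pointwise expansion by factor t
    h_out = math.ceil(h / stride)               # depthwise conv applies the stride
    w_out = math.ceil(w / stride)
    return {
        "input": (h, w, c_in),
        "expanded": (h, w, expanded_channels),
        "depthwise": (h_out, w_out, expanded_channels),
        "projected": (h_out, w_out, c_out),     # pointwise projection to c_out
        "shortcut": stride == 1 and c_in == c_out,  # assumption, as in MobileNetV2
    }


# Table row with input 208 x 208 x 16, s = 2, t = 6, and 24 output channels.
down = bottleneck_shapes(208, 208, 16, t=6, c_out=24, stride=2)
print(down["expanded"], down["projected"])  # (208, 208, 96) (104, 104, 24)

# Table row with input 104 x 104 x 24, s = 1, t = 6, and 24 output channels.
same = bottleneck_shapes(104, 104, 24, t=6, c_out=24, stride=1)
print(same["projected"], same["shortcut"])  # (104, 104, 24) True
```

The projected shape of the first call matches the input size listed for the following row of the architecture table (104 × 104 × 24).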
Figure 3. Illustration of the proposed depthwise feature pyramid network. ⨁ denotes a route layer.
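The channel counts around the route layers and detection outputs can be checked with simple arithmetic. The sketch below assumes that a route layer concatenates its inputs along the channel axis and that each YOLO head predicts 3 anchors per scale, each with 4 box offsets, 1 objectness score, and one score per class, as in YOLOv3. The 20 classes are those of PASCAL VOC. Function names are hypothetical.

```python
def route_channels(*branch_channels):
    """Channels after concatenating feature maps along the channel axis."""
    return sum(branch_channels)


def detection_channels(num_classes, anchors_per_scale=3):
    """Output channels of a YOLO head: per anchor, 4 box offsets + 1 objectness + classes."""
    return anchors_per_scale * (num_classes + 5)


# Route 67, 40: upsampled 256-channel map joined with the 96-channel output of layer 40.
print(route_channels(256, 96))   # 352, the input depth listed for layer 69
# Route 82, 19: upsampled 128-channel map joined with the 32-channel output of layer 19.
print(route_channels(128, 32))   # 160, the input depth listed for layer 84
# Detection convolutions (layers 63, 78, 93) for the 20 PASCAL VOC classes.
print(detection_channels(20))    # 75
```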
Performance comparison of real-time object detectors on PASCAL VOC 2007. M denotes million; mAP0.5 denotes the accuracy obtained on the original PASCAL VOC 2007 test dataset at an IoU threshold of 0.5.
| Model | Input Size | Computation Cost | Model Size | mAP0.5 |
|---|---|---|---|---|
| Tiny-YOLOv2 | 416 × 416 | 3490 M | 15.86 M | 57.1 |
| Tiny-YOLOv3 | 416 × 416 | 2742 M | 8.72 M | 58.4 |
| MobileNet+SSD | 300 × 300 | 1150 M | 5.77 M | 68.0 |
| PeleeNet | 304 × 304 | 1210 M | 5.43 M | 70.9 |
| MYOLO3 | 416 × 416 | 1978 M | 4.80 M | 74.0 |
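
The IoU criterion behind mAP0.5 can be illustrated with a short sketch. It assumes axis-aligned boxes in continuous `(x_min, y_min, x_max, y_max)` coordinates and treats a prediction as a match when its IoU with a ground-truth box exceeds the threshold. The function names are hypothetical, and the full mAP computation (ranking by confidence, precision-recall integration) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the predicted box overlaps the ground truth by more than the threshold."""
    return iou(pred_box, gt_box) > threshold


print(iou((0, 0, 4, 4), (0, 0, 4, 3)))       # 0.75
print(is_match((0, 0, 4, 4), (0, 0, 4, 3)))  # True
print(is_match((0, 0, 2, 2), (1, 1, 3, 3)))  # False (IoU is 1/7)
```
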
Performance of MRCS for CS sampling and reconstruction with multiple MRs. Bwidth denotes the network bandwidth required to transmit in real time the CS measurements generated when MRCS samples 25 images per second. AP0.5 denotes the accuracy obtained for a given class on the VOC-revise dataset at an IoU threshold of 0.5.
| MRs | Single-precision mCR | Single-precision Bwidth | Single-precision mPSNR | Single-precision AP0.5 Person (%) | Single-precision AP0.5 Bicycle (%) | Single-precision AP0.5 Car (%) | Half-precision mCR | Half-precision Bwidth | Half-precision mPSNR | Half-precision AP0.5 Person (%) | Half-precision AP0.5 Bicycle (%) | Half-precision AP0.5 Car (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.25/0.25 | 0.89 | 14.21 | 26.11 | 74.5 | 74.6 | 77.3 | 1.78 | 7.11 | 26.11 | 74.5 | 74.6 | 77.3 |
| 0.10/0.10 | 2.23 | 5.69 | 23.20 | 63.1 | 58.4 | 60.8 | 4.45 | 2.85 | 23.20 | 63.1 | 58.1 | 60.9 |
| 0.04/0.04 | 5.64 | 2.26 | 20.84 | 40.9 | 24.6 | 33.8 | 11.28 | 1.12 | 20.84 | 41.0 | 24.8 | 33.7 |
| 0.01/0.01 | 24.26 | 0.53 | 18.14 | 6.9 | 4.6 | 4.9 | 48.53 | 0.26 | 18.14 | 6.9 | 4.6 | 5.0 |
| 0.25/0.10 | 1.43 | 9.38 | 24.61 | 72.3 | 73.4 | 74.0 | 2.87 | 4.69 | 24.61 | 72.3 | 73.3 | 73.9 |
| 0.25/0.04 | 2.12 | 7.41 | 23.19 | 69.5 | 70.5 | 71.2 | 4.24 | 3.71 | 23.19 | 69.5 | 71.1 | 71.1 |
| 0.25/0.01 | 3.23 | 6.43 | 21.22 | 65.8 | 67.6 | 70.4 | 6.45 | 3.22 | 21.22 | 65.7 | 70.4 | 70.4 |
| 0.10/0.04 | 3.61 | 3.74 | 22.04 | 60.4 | 55.0 | 57.4 | 7.21 | 1.87 | 22.04 | 60.6 | 57.4 | 57.4 |
| 0.10/0.01 | 6.35 | 2.76 | 20.37 | 58.5 | 55.3 | 60.1 | 12.67 | 1.38 | 20.38 | 58.5 | 60.2 | 60.2 |
| 0.04/0.01 | 11.51 | 1.27 | 19.45 | 40.0 | 24.4 | 37.1 | 22.92 | 0.64 | 19.45 | 40.0 | 37.1 | 37.1 |
| 1.00/0.25 | 0.89 | 14.22 | 31.12 | 79.1 | 81.6 | 82.5 | 1.29 | 10.18 | 31.12 | 79.2 | 81.6 | 82.4 |
| 1.00/0.10 | 1.44 | 9.38 | 28.07 | 76.5 | 80.0 | 79.8 | 1.94 | 7.76 | 28.07 | 76.5 | 79.9 | 79.8 |
| 1.00/0.04 | 2.12 | 7.42 | 25.68 | 73.6 | 76.9 | 77.6 | 2.68 | 6.78 | 25.68 | 73.6 | 76.9 | 77.6 |
| 1.00/0.01 | 3.23 | 6.44 | 22.72 | 69.4 | 73.0 | 75.4 | 3.66 | 6.29 | 22.72 | 69.4 | 73.0 | 75.4 |
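
Two of the quantities reported above can be sketched briefly. The code below is an illustration under stated assumptions, not the authors' evaluation code: `psnr` uses the standard definition for 8-bit images (peak value 255), and `bytes_per_second` assumes the bandwidth is the number of measurements per image times the bytes per measurement times the frame rate. The count of 100,000 measurements per image is a made-up figure, and both function names are hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def bytes_per_second(measurements_per_image, bytes_per_value, fps=25):
    """Raw payload rate for streaming CS measurements at a fixed frame rate."""
    return measurements_per_image * bytes_per_value * fps


# Every pixel off by 10 gray levels gives an MSE of 100.
print(round(psnr([100, 100], [110, 90]), 2))  # 28.13

# Hypothetical 100,000 measurements per image at 25 images per second.
single = bytes_per_second(100_000, bytes_per_value=4)  # single-precision floats
half = bytes_per_second(100_000, bytes_per_value=2)    # half-precision floats
print(single, half)  # 10000000 5000000
```

With the number of measurements fixed, switching from 4-byte to 2-byte values halves the payload rate. That is consistent with the Bwidth columns in the rows where both MRs are below 1.00.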