| Literature DB >> 33806329 |
Pankaj Bhowmik1, Md Jubaer Hossain Pantho1, Christophe Bobda1.
Abstract
Cameras are widely adopted for high image quality with the rapid advancement of complementary metal-oxide-semiconductor (CMOS) image sensors while offloading vision applications' computation to the cloud. It raises concern for time-critical applications such as autonomous driving, surveillance, and defense systems since moving pixels from the sensor's focal plane are expensive. This paper presents a hardware architecture for smart cameras that understands the salient regions from an image frame and then performs high-level inference computation for sensor-level information creation instead of transporting raw pixels. A visual attention-oriented computational strategy helps to filter a significant amount of redundant spatiotemporal data collected at the focal plane. A computationally expensive learning model is then applied to the interesting regions of the image. The hierarchical processing in the pixels' data path demonstrates a bottom-up architecture with massive parallelism and gives high throughput by exploiting the large bandwidth available at the image source. We prototype the model in field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) for integrating with a pixel-parallel image sensor. The experiment results show that our approach achieves significant speedup while in certain conditions exhibits up to 45% more energy efficiency with the attention-oriented processing. Although there is an area overhead for inheriting attention-oriented processing, the achieved performance based on energy consumption, latency, and memory utilization overcomes that limitation.Entities:
Keywords: ASIC; CNN; FPGA; computation at sensor; computer vision; image relevance
Year: 2021 PMID: 33806329 PMCID: PMC7961745 DOI: 10.3390/s21051757
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576