Lukas Mennel, Dmitry K Polyushkin, Dohyun Kwak, Thomas Mueller.
Abstract
As conventional frame-based cameras suffer from high energy consumption and latency, several new types of image sensors have been devised, with some of them exploiting the sparsity of natural images in some transform domain. Instead of sampling the full image, those devices capture only the coefficients of the most relevant spatial frequencies. The number of samples can be even sparser if a signal only needs to be classified rather than being fully reconstructed. Based on the corresponding mathematical framework, we developed an image sensor that can be trained to classify optically projected images by reading out the few most relevant pixels. The device is based on a two-dimensional array of metal-semiconductor-metal photodetectors with individually tunable photoresponsivity values. We demonstrate its use for the classification of handwritten digits with an accuracy comparable to that achieved by readout of the full image, but with lower delay and energy consumption.
Year: 2022 PMID: 35383216 PMCID: PMC8983698 DOI: 10.1038/s41598-022-09594-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Theoretical background and operation principle. (a) Schematic illustration of the setup. An optical image is projected onto the face of the image sensor, whose photoresponsivity values vary from pixel to pixel. (b) A binary linear classifier assigns an image to one of two possible classes, I or II, depending on whether the inner product of the image and the photoresponsivity vector exceeds some threshold. In our implementation, the inner product is realized by summing up the photocurrents produced by all detector elements. (c) Photoresponsivities for a sensor that has been trained as a linear SVM for the classification of zeros and ones from the MNIST dataset. Almost all pixels exhibit non-zero photoresponsivity values. (d) Natural images have low-dimensional structure. This allows a sparse photoresponsivity vector to be constructed for classification. (e) Results for the same binary classification task as in c. Comparable performance is achieved with 99.2% of the detector elements having zero responsivity.
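The classification rule described in panel (b) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the function names, the 4 × 4 image size, the responsivity values, and the zero threshold are all hypothetical. Each pixel contributes a photocurrent equal to its responsivity times the incident light intensity, the currents are summed, and the sum is compared with a threshold. Pixels with zero responsivity contribute nothing, which is why a sparse responsivity vector needs only a few pixels to be read out.

```python
# Illustrative sketch only: names, sizes, and values are hypothetical
# and are not taken from the paper.

def sensor_output(image, responsivity):
    """Summed photocurrent: inner product of image and responsivity vector."""
    if len(image) != len(responsivity):
        raise ValueError("image and responsivity must have the same length")
    # Pixels with zero responsivity add no current, so they are skipped.
    return sum(r * p for p, r in zip(image, responsivity) if r != 0.0)


def classify(image, responsivity, threshold):
    """Assign class 'I' if the summed output exceeds the threshold, else 'II'."""
    return "I" if sensor_output(image, responsivity) > threshold else "II"


# Toy 4 x 4 sensor (flattened to 16 pixels) with only two active pixels.
# Signed responsivities are an assumption made for this toy example.
responsivity = [0.0] * 16
responsivity[5] = 1.0
responsivity[10] = -1.0

# Two toy images: light falls on pixel 5 or on pixel 10.
image_a = [0.0] * 16
image_a[5] = 0.5
image_b = [0.0] * 16
image_b[10] = 0.5

label_a = classify(image_a, responsivity, threshold=0.0)
label_b = classify(image_b, responsivity, threshold=0.0)
```

In this toy case only 2 of 16 pixels carry a non-zero responsivity, so the decision depends on just those two readouts.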
Figure 2. Image sensor architecture and characterization. (a) Microscope image of the sensor, with schematic illustrations of the external row/column decoders and integrating output (left). Scale bar, 200 µm. The chip size is 2.75 mm². A detailed view of one of the MSM photodetectors is presented in the inset, and a schematic illustration is shown on the right. Each of the detector elements is 90 × 90 µm² in size. Details regarding the electrical measurement setup can be found in Supplementary Figure S5. (b) Bias-voltage-dependent device currents for all 196 detectors with (red lines) and without (blue lines) optical illumination (~160 W/m²). The detectors are operated in the range ±5 V to ±10 V.
Figure 3. Image sensor operation and performance evaluation. (a) Relevant pixel locations (bottom) and applied bias voltages (top) for the binary classification task discussed in the main text. (b) Temporal evolution of the sensor output for more than 2000 samples from the dataset. Red (blue) lines show cases in which a “0” (“1”) has been projected onto the sensor. The black lines show two representative examples with corresponding MNIST digits. (c) Experimental confusion matrix. A classification accuracy of 98.3% is achieved. (d) Histogram of sensor output as determined from the measurements in b. The dashed line indicates the decision threshold.