Energy Scaling Advantages of Resistive Memory Crossbar Based Computation and Its Application to Sparse Coding
Sapan Agarwal, Tu-Thach Quach, Ojas Parekh, Alexander H. Hsia, Erik P. DeBenedictis, Conrad D. James, Matthew J. Marinella, James B. Aimone.
Abstract
The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
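The abstract's claim that the two crossbar kernels form the basis of sparse coding can be made concrete with a short sketch. Below is a minimal ISTA-style solver in numpy, a close relative of the locally competitive algorithm this class of hardware targets; the dictionary D, step size eta, and threshold lam are illustrative choices, not values from the paper. Both matrix products in the loop are exactly the parallel reads a crossbar would perform in a single step.

```python
import numpy as np

def soft_threshold(a, lam):
    """Soft-thresholding (shrinkage) operator used by many sparse solvers."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_code(D, x, lam=0.1, eta=0.1, n_iters=200):
    """ISTA-style sparse coding of x against dictionary D.

    D @ a and D.T @ residual are the vector-matrix multiplies a
    resistive crossbar performs as single-step parallel reads.
    """
    a = np.zeros(D.shape[1])
    for _ in range(n_iters):
        residual = x - D @ a                        # reconstruction error
        a = soft_threshold(a + eta * (D.T @ residual), lam)
    return a

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)                      # unit-norm dictionary atoms
x = D[:, :3] @ np.array([1.0, -0.5, 0.25])          # sparsely generated input
a = sparse_code(D, x)
print("active coefficients:", np.count_nonzero(np.abs(a) > 1e-3))
```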
Keywords: energy; memristor; neuromorphic computing; resistive memory; sparse coding
Year: 2016 PMID: 26778946 PMCID: PMC4701906 DOI: 10.3389/fnins.2015.00484
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
Figure 1. Analog resistive memories can be used to reduce the energy of a vector-matrix multiply. The conductance of each resistive memory represents a weight. Analog input values are represented by the input voltages or input pulse lengths, and outputs are represented by current values. This allows all the read, multiplication, and sum operations to occur in a single step. A conventional architecture must perform these operations sequentially for each weight, resulting in higher energy and delay.
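A behavioral sketch (not a circuit simulation) of the caption's point, assuming ideal devices: the crossbar sums the column currents I = Gᵀ·v in one step via Kirchhoff's current law, while a conventional architecture must read, multiply, and accumulate one weight at a time. The names G and v_in are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.uniform(0.0, 1.0, size=(4, 4))   # N x N crossbar conductances (weights)
v_in = rng.uniform(0.0, 1.0, size=4)     # analog input voltages on the rows

# Crossbar: every column current is produced simultaneously in one parallel read.
i_out_parallel = G.T @ v_in

# Conventional architecture: O(N^2) sequential read/multiply/accumulate steps.
i_out_serial = np.zeros(4)
for j in range(4):
    for i in range(4):
        i_out_serial[j] += G[i, j] * v_in[i]

assert np.allclose(i_out_parallel, i_out_serial)
```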
Table 1. Energy scaling for different precision requirements.
Figure 2. A typical SRAM array. Each row/wordline must be accessed sequentially.
Figure 3. A parallel write is illustrated. Weight W_ij is updated by x_i × y_j. In order to achieve a multiplicative effect, the x_i are encoded in time while the y_j are encoded in the height of a voltage pulse. The resistive memory will only train when x_i is non-zero; the height of y_j determines the strength of training.
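A behavioral sketch of the parallel write under the same idealized assumptions: the pulse durations and pulse heights are abstracted into the numbers x and y, and a hypothetical learning rate eta absorbs the device response, so the whole-array update is the rank-1 outer product described in the caption.

```python
import numpy as np

def parallel_write(W, x, y, eta=0.01):
    """Rank-1 update dW[i][j] = eta * x[i] * y[j], applied to the whole
    crossbar in one step. Rows where x[i] == 0 see no pulse overlap,
    so those devices do not train, mirroring the caption."""
    return W + eta * np.outer(x, y)

W = np.zeros((3, 3))
x = np.array([1.0, 0.0, 2.0])   # encoded as pulse durations in hardware
y = np.array([0.5, -1.0, 0.0])  # encoded as pulse heights in hardware
W = parallel_write(W, x, y)
print(W)                        # row 1 stays zero because x[1] == 0
```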
Figure 4. A system would consist of individual crossbar-based cores that communicate with each other through a communications bus and routing system.
Figure 5. If the data are not the same shape as the array, the input data will come from a single router, and the output data will need to go to a single computation unit. At best, the extra wire length to reach the input/output units plus the row/column wire length will be O[max(N, M)].
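To make the core-and-router picture of Figures 4 and 5 concrete, here is a minimal sketch of tiling an arbitrary-shape vector-matrix multiply across fixed-size crossbar cores. The tile size and the in-software accumulation stand in for the router and downstream computation unit; they are assumptions for illustration, not the paper's design.

```python
import numpy as np

def tiled_vmm(W, v, tile=4):
    """Split a vector-matrix multiply of arbitrary shape across fixed
    tile x tile crossbar cores; a router would carry each core's
    partial column sums to an accumulating computation unit."""
    n_rows, n_cols = W.shape
    out = np.zeros(n_cols)
    for r in range(0, n_rows, tile):
        for c in range(0, n_cols, tile):
            core = W[r:r+tile, c:c+tile]           # one crossbar core
            out[c:c+tile] += core.T @ v[r:r+tile]  # partial parallel read
    return out

rng = np.random.default_rng(2)
W = rng.standard_normal((10, 6))   # deliberately not a multiple of the tile size
v = rng.standard_normal(10)
assert np.allclose(tiled_vmm(W, v), W.T @ v)
```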
Table 2. The energy scaling for all the operations is given (multiplication, multiplication/training, threshold, subtraction, sign function sgn(·), vector and matrix operations, and crossbar read and write).
We consider the finite precision case.