Hasan Erdem Yantır, Wenzhe Guo, Ahmed M Eltawil, Fadi J Kurdahi, Khaled Nabil Salama.
Abstract
Current computing architectures follow largely processor-centric design principles. However, the inevitable growth in the amount of data that applications must handle is pushing researchers toward novel processor architectures that are more data-centric. Following this principle, this study proposes an area-efficient Fast Fourier Transform (FFT) processor based on in-memory computing. The proposed architecture occupies the smallest footprint in its class, around 0.1 mm², together with acceptable power efficiency. According to the results, the processor exhibits the highest area efficiency (FFT/s/area) among the FFT processors in the current literature.
Keywords: associative processor; fast Fourier transform; in-memory computing; non-von Neumann architecture
Year: 2019 PMID: 31370261 PMCID: PMC6722736 DOI: 10.3390/mi10080509
Source DB: PubMed Journal: Micromachines (Basel) ISSN: 2072-666X Impact factor: 2.891
Computation types with respect to memory.
| Computation Type | Data Location | Functionality Location | Bandwidth Constraint |
|---|---|---|---|
| Traditional | Separate IC | Processor | Inter-chip Bus |
| Near-memory | Same IC | Processor | In-chip Bus |
| In-memory | Same IC | Memory | Memory Capacity |
Figure 1. Associative processor architecture.
LUTs for addition and subtraction.
| Input 1 | Input 2 | Input 3 | Addition: carry out | Addition: sum | Subtraction: borrow out | Subtraction: difference |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 | 1 | 1 | 1 |
| 0 | 1 | 0 | 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |
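The lookup tables above can be exercised with a small functional model of bit-serial, word-parallel arithmetic. This is a minimal sketch, not the processor's actual compare/write pass sequence: it assumes that the first input column is the carry (or borrow) and that the middle column is the operand being updated (the minuend in subtraction, which the table values do imply). The names `ADD_LUT`, `SUB_LUT`, and `bit_serial` are hypothetical.

```python
# Functional model of LUT-driven bit-serial arithmetic over many rows.
# Assumption: key = (carry_or_borrow_in, b_bit, a_bit),
#             value = (carry_or_borrow_out, result_bit).

ADD_LUT = {
    (0, 0, 0): (0, 0), (0, 0, 1): (0, 1), (0, 1, 0): (0, 1), (0, 1, 1): (1, 0),
    (1, 0, 0): (0, 1), (1, 0, 1): (1, 0), (1, 1, 0): (1, 0), (1, 1, 1): (1, 1),
}

# Computes b - a; the middle key entry is the minuend bit.
SUB_LUT = {
    (0, 0, 0): (0, 0), (0, 0, 1): (1, 1), (0, 1, 0): (0, 1), (0, 1, 1): (0, 0),
    (1, 0, 0): (1, 1), (1, 0, 1): (1, 0), (1, 1, 0): (0, 0), (1, 1, 1): (1, 1),
}


def bit_serial(lut, b_words, a_words, width):
    """Apply `lut` one bit position at a time (LSB first) to every row.

    Each row holds one (b, a) operand pair. Results wrap modulo 2**width.
    """
    rows = len(a_words)
    carry = [0] * rows   # one carry/borrow bit per row
    result = [0] * rows
    for bit in range(width):
        # In hardware all rows would be matched and written in parallel;
        # this inner loop only emulates that behaviour sequentially.
        for row in range(rows):
            key = (carry[row], (b_words[row] >> bit) & 1, (a_words[row] >> bit) & 1)
            carry[row], out = lut[key]
            result[row] |= out << bit
    return result


sums = bit_serial(ADD_LUT, [3, 100, 4095], [5, 27, 1], 12)
diffs = bit_serial(SUB_LUT, [5, 100, 0], [3, 27, 1], 12)
```

The cost of this scheme grows with the word length rather than with the number of rows, which is why the row count can be large without lengthening an operation.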
Figure 2. Simple butterfly operation.
Figure 3. 8-point traditional FFT.
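For readers without access to the figures, the butterfly and the 8-point transform built from it can be sketched in a few lines. This is an illustrative radix-2 decimation-in-time formulation in floating point; the exact butterfly form and data ordering drawn in Figures 2 and 3 may differ, and the names `butterfly` and `fft` are hypothetical.

```python
import cmath


def butterfly(a, b, w):
    """Radix-2 butterfly: combine two values using twiddle factor w."""
    t = w * b
    return a + t, a - t


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # transform of even-indexed samples
    odd = fft(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor W_n^k
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out


spectrum = fft([1, 0, 0, 0, 0, 0, 0, 0])   # 8-point transform of a unit impulse
```

An 8-point transform uses three stages of four butterflies each, so the butterfly is the only arithmetic kernel a processor needs to support.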
Figure 4. Pipelined in-memory FFT processor architecture.
Figure 5. The ultra-area-efficient FFT processor based on Singleton's FFT and feedback.
Figure 6. 8-point Singleton's FFT.
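The appeal of a Singleton-style FFT for a feedback architecture is that every stage uses the same wiring, so one stage of hardware can be reused. The sketch below is a constant-geometry radix-2 decimation-in-frequency FFT written in that spirit; it is an assumption that this matches the data flow of Figure 6, and the names `constant_geometry_fft`, `bit_reverse`, and `naive_dft` are hypothetical.

```python
import cmath


def bit_reverse(k, bits):
    """Reverse the lowest `bits` bits of k."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (k & 1)
        k >>= 1
    return r


def constant_geometry_fft(x):
    """Radix-2 FFT in which every stage reads (i, i + n/2) and writes (2i, 2i + 1)."""
    n = len(x)
    stages = n.bit_length() - 1
    if n < 1 or (1 << stages) != n:
        raise ValueError("length must be a power of two")
    data = [complex(v) for v in x]
    half = n // 2
    for s in range(stages):
        nxt = [0j] * n
        for i in range(half):
            a, b = data[i], data[i + half]
            # Only the twiddle exponent changes between stages:
            # the low s bits of i are cleared.
            w = cmath.exp(-2j * cmath.pi * ((i >> s) << s) / n)
            nxt[2 * i] = a + b
            nxt[2 * i + 1] = (a - b) * w
        data = nxt   # the output feeds back as the next stage's input
    # Results emerge in bit-reversed order; reorder to natural order.
    return [data[bit_reverse(k, stages)] for k in range(n)]


def naive_dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


samples = [1, 2, 3, 4, 0, -1, -2, -3]
result = constant_geometry_fft(samples)
```

Because the read and write pattern is identical in every stage, a single butterfly stage with a feedback path can compute the whole transform in log2(N) passes.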
Figure 7. Directed acyclic graph of a butterfly operation.
Figure 8. Dual-issue FFT on the AP.
Comparison of FFT Processors without normalization.
| Specification | AP (F) | AP (P) | [ | [ | [ | [ | [ |
|---|---|---|---|---|---|---|---|
| FFT Size (N) | 1024 | 1024 | 1024 | 256 | 2048 | 1024 | 4096 |
| Technology | 65 nm | 65 nm | 65 nm | 90 nm | 65 nm | 65 nm | 65 nm |
|  |  |  |  | 1 |  |  |  |
| Word-length | 12-bit | 12-bit | 16-bit | 10-bit | 12-bit | 32-bit * | 14-bit |
| Area |  |  |  |  |  |  |  |
| Power | 12 mW | 123 mW | 4.15 mW | 165 mW | 1.01 mW | 60.3 mW | 68.6 mW |
| Throughput/Area | 0.89 | 0.89 | 0.03 | 0.47 | 0.015 | 0.22 | 0.67 |
| FOM (FFT/Energy/Area) | 70.4 | 7.09 | 6.82 | 15.3 | 7.04 | 3.60 | 2.37 |
* The bitwidth of this architecture varies across the FFT stages; the maximum is 32 bits.
Figure 9. Area efficiencies of FFT processors.
Figure 10. Design space exploration for the area-optimized FFT processor.
Figure 11. Comparison of normalized Energy/FFT scaling with respect to FFT size.
Figure 12. Bitwidth vs. average PSNR and error rate of 1024-point FFT.
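A bitwidth-versus-PSNR curve of the kind named in Figure 12 can be approximated by quantizing a signal to a given word length and comparing its spectrum with a double-precision reference. The sketch below is only illustrative: the PSNR definition (peak spectral magnitude squared over mean squared error), the test signal, the 64-point size, and the choice to quantize only the input samples are all assumptions, and the names `quantize`, `psnr_db`, and `psnr_for_bits` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct DFT, adequate for a small illustrative transform size."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def quantize(value, bits):
    """Round to signed fixed point with bits-1 fractional bits, saturating to [-1, 1)."""
    scale = 1 << (bits - 1)
    q = max(-scale, min(scale - 1, round(value * scale)))
    return q / scale


def psnr_db(reference, approx):
    """PSNR in dB, assuming the peak is the largest reference magnitude."""
    mse = sum(abs(r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)
    peak = max(abs(r) for r in reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


N = 64   # small size chosen for speed, not the 1024 points of the figure
signal = [0.4 * math.sin(2 * math.pi * 3 * n / N)
          + 0.3 * math.cos(2 * math.pi * 7 * n / N + 0.5) for n in range(N)]
reference = dft(signal)


def psnr_for_bits(bits):
    """PSNR of the spectrum when the input is quantized to `bits` bits."""
    return psnr_db(reference, dft([quantize(s, bits) for s in signal]))


curve = {bits: psnr_for_bits(bits) for bits in (8, 10, 12, 14, 16)}
```

A fixed-point processor also rounds after every butterfly stage, so a measurement on real hardware would be expected to show lower PSNR than this input-only model at the same word length.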