Milena Andrighetti1, Giovanna Turvani1, Giulia Santoro1, Marco Vacca1, Andrea Marchesin1, Fabrizio Ottati1, Massimo Ruo Roch1, Mariagrazia Graziano2, Maurizio Zamboni1.
Abstract
Living in the information society means being surrounded by billions of electronic devices full of sensors that constantly acquire data. This enormous amount of data must be processed and classified. A commonly adopted solution is to send the data to server farms for remote processing. The drawback is a large battery drain caused by the high amount of information that must be exchanged. To mitigate this problem, data must be processed locally, near the sensor itself, but this requires large computational capabilities. While microprocessors, even mobile ones, nowadays have enough computational power, their performance is severely limited by the Memory Wall problem: memories are too slow, so microprocessors cannot fetch data from them fast enough. A solution is the Processing-In-Memory (PIM) approach, in which new memories are designed to process data internally, eliminating the Memory Wall problem. In this work we present an example of such a system, using the Bitmap Indexing algorithm as a case study. This algorithm is used to classify data coming from many sources in parallel. We propose a hardware accelerator designed around the Processing-In-Memory approach that is capable of implementing this algorithm and that can also be reconfigured to perform other tasks or to work as a standard memory. The architecture has been synthesized using CMOS technology. The results highlight not only that it is possible to process and classify huge amounts of data locally, but also that this can be achieved with very low power consumption.
Keywords: big data; bitmap indexing; internet of things; memory wall; processing in memory
Year: 2020 PMID: 32197308 PMCID: PMC7146182 DOI: 10.3390/s20061681
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. (A) Given a table, bitmap indexing transforms each column into as many bitmaps as there are possible key values for that column. (B) To answer a query, bitwise logic operations must be performed on the bitmaps. (C) Practical scheme of the query execution.
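The bitmap indexing idea described in the Figure 1 caption can be illustrated in software with a minimal sketch. This is not the hardware design from the paper: the table contents, the column names, and the helper functions `build_bitmap_index` and `rows_of` are hypothetical, and each bitmap is stored as a Python integer in which bit i corresponds to row i.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Return {key_value: bitmap}; bit i of a bitmap is set if row i holds key_value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """List the row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical table with two columns and five rows (illustrative data only).
color = ["red", "blue", "red", "green", "blue"]
size = ["S", "M", "M", "S", "M"]

# One bitmap per distinct key value of each column.
color_index = build_bitmap_index(color)
size_index = build_bitmap_index(size)

# Query: color == "red" AND size == "M" becomes a single bitwise AND.
red_and_m = color_index["red"] & size_index["M"]

# Query: color == "blue" OR color == "green" becomes a single bitwise OR.
blue_or_green = color_index["blue"] | color_index["green"]

print(rows_of(red_and_m))      # rows matching the AND query
print(rows_of(blue_or_green))  # rows matching the OR query
```

Because every query reduces to bitwise operations over whole bitmaps, the workload maps naturally onto logic placed next to the memory cells, which is the motivation for a PIM implementation.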
Figure 2. Column-oriented memory organization.
Figure 3. (A) Overview of the complete architecture. (B) Structure of the Bank-Breaker pair. (C) Detail of the Processing-In-Memory (PIM) cell.
Figure 4. (A) Composition of a complete query. (B) Preliminary stages.
Figure 5. (A) Expected waveform of a LIM single same-bank AND operation. (B) Waveform of a LIM single same-bank AND operation. (C) Expected waveform of PIM multiple operations. (D) Simulated waveform of a PIM multiple-bank operation.
Synthesis of the fundamental element.
| | Memory | Logic | Cell |
|---|---|---|---|
| Non-Combinational Area [mm²] | 9.31 | 2.12 | 11.43 |
| Combinational Area [mm²] | 5.32 | 15.43 | 20.75 |
| Total Area [mm²] | | | 32.18 |
| Delay [ns] | | | 0.45 |
Synthesis results for 45 nm and 28 nm CMOS technologies.
| Parameter | Value (45 nm) | Value (28 nm) |
|---|---|---|
| Total area [mm²] | 2.33 | 1.058 |
| | 153.4 | 574.7 |
| Total Power [mW] | 49.7 | 14.07 |
Figure 6. Relation between the number of segments in the counter and the resulting delay.
Clock cycles comparison for a single query execution.
| | | |
|---|---|---|
| | 5 | 9 |
| | 3 | 5 |
| | 1 | 3 |
| | 1 | 2 |