Tiantai Deng, Danny Crookes, Roger Woods, Fahad Siddiqui.
Abstract
Developing Field Programmable Gate Array (FPGA)-based applications is typically a slow and multi-skilled task. Research into tools that support application development has gradually raised the level of abstraction. This paper describes an approach which aims to further raise the level at which an application developer works when developing FPGA-based implementations of image and video processing applications. The starting concept is a system of streamed soft coprocessors. We present a set of soft coprocessors which implement some of the key abstractions of Image Algebra. Our soft coprocessors are designed for easy chaining, and allow users to describe their application as a dataflow graph. A prototype implementation of a development environment, called SCoPeS, is presented. An application can be modified even during execution without requiring re-synthesis. The paper concludes with performance and resource utilization results for different implementations of a sample algorithm. We conclude that the soft coprocessor approach has the potential to deliver better performance than the soft processor approach, and can improve programmability over dedicated HDL cores for domain-specific applications while achieving competitive real-time performance and utilization.
Keywords: FPGA; image algebra; image processing; soft coprocessor; soft processor
Year: 2022; PMID: 35200744; PMCID: PMC8880448; DOI: 10.3390/jimaging8020042
Source DB: PubMed; Journal: J Imaging; ISSN: 2313-433X
Figure 1. Qualitative representation of programmability vs. performance.
Figure 2. The core function for the Sobel operation when using a skeleton SCP.
Figure 3. The GUI for creating a new project configuration.
Figure 4. (a) Example of the textual description of a DFG for Otsu after an Open operation. (b) Example of the code generated by the TCG tool from the DFG in Figure 4a.
Figure 5. Data flow and buffering for the four different operation types (clockwise: Global, Neighborhood, Block and Point operations).
Figure 6. Stream-based parameter distribution.
Figure 7. From text-based DFG to hardware platform through Xilinx SDK.
Comparison between SCP (in Minimum Area mode) and IPPro in utilization and performance.
| Architecture | Operation | FFs | LUTs | BRAMs | DSPs | FPS |
|---|---|---|---|---|---|---|
| SCP | Point | 1659 | 2015 | 0 | 3 | 186 |
| SCP | Neighborhood Basic | 1104 | 1404 | 5 | 9 | 127 |
| SCP | Neighborhood Complex | 4963 | 7141 | 5 | 72 | 125 |
| SCP | Global | 622 | 998 | 0 | 0 | 189 |
| IPPro | Point (8 core) | 12,279 | 10,941 | 18.5 | 8 | 120 |
| IPPro | Neighborhood Basic (6 core) | 13,202 | 11,826 | 32.5 | 6 | 76 |
SCP utilization and performance (in Maximum Performance mode).
| SCPs | FFs | LUTs | BRAMs | DSPs | FPS |
|---|---|---|---|---|---|
| Point | 3346 | 2965 | 0 | 3 | 556 |
| Neighborhood Basic | 2309 | 1963 | 5 | 9 | 380 |
| Neighborhood Complex | 9862 | 12,368 | 5 | 72 | 374 |
| Global | 1432 | 1353 | 0 | 0 | 568 |
Ratios for SCP to IPPro for performance and utilization (>1 is worse).
| SCP mode | Operation | Implementation | Frequency | FPS | FFs | LUTs | BRAMs | DSPs |
|---|---|---|---|---|---|---|---|---|
| Minimum Area | Point | SCP | 150 MHz | 1 | 1 | 1 | 1 | 1 |
| Minimum Area | Point | IPPro (8 core) | 150 MHz | 1.5 | 7.4 | 5.4 | --- | 2.7 |
| Minimum Area | Neighborhood | SCP | 150 MHz | 1 | 1 | 1 | 1 | 1 |
| Minimum Area | Neighborhood | IPPro (6 core) | 150 MHz | 2.4 | 8.0 | 5.9 | --- | 2.0 |
| Maximum Performance | Point | SCP | 150 MHz | 1 | 1 | 1 | 1 | 1 |
| Maximum Performance | Point | IPPro (8 core) | 150 MHz | 4.6 | 3.7 | 3.7 | --- | 2.7 |
| Maximum Performance | Neighborhood | SCP | 150 MHz | 1 | 1 | 1 | 1 | 1 |
| Maximum Performance | Neighborhood | IPPro (6 core) | 150 MHz | 7.3 | 5.7 | 6.0 | --- | 0.7 |
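The Point ratios can be reproduced from the utilization tables in this section with a few divisions. The following is a minimal sketch, not code from the paper: the function name `ippro_vs_scp` and the dictionary layout are hypothetical, and the direction of each ratio is an assumption inferred from the numbers rather than stated in the source. Resource ratios are taken as IPPro divided by SCP, and the FPS ratio as SCP divided by IPPro, so that a value above 1 means IPPro is worse. Only the Point rows are checked here, and the results agree with the table to within rounding; on this reading, the first block of Point ratios corresponds to Minimum Area mode and the second to Maximum Performance mode.

```python
# Point-operation figures copied from the utilization tables in this section.
SCP_POINT_MIN_AREA = {"FFs": 1659, "LUTs": 2015, "BRAMs": 0, "DSPs": 3, "FPS": 186}
SCP_POINT_MAX_PERF = {"FFs": 3346, "LUTs": 2965, "BRAMs": 0, "DSPs": 3, "FPS": 556}
IPPRO_POINT_8_CORE = {"FFs": 12279, "LUTs": 10941, "BRAMs": 18.5, "DSPs": 8, "FPS": 120}


def ippro_vs_scp(scp, ippro):
    """Return ratios normalised to the SCP; values above 1 mean IPPro is worse.

    Assumption (inferred, not stated in the source): resources are
    IPPro / SCP, while performance is SCP FPS / IPPro FPS.
    """
    ratios = {"FPS": scp["FPS"] / ippro["FPS"]}
    for resource in ("FFs", "LUTs", "BRAMs", "DSPs"):
        if scp[resource] == 0:
            # The SCP uses none of this resource, so the ratio is undefined
            # (shown as "---" in the table).
            ratios[resource] = None
        else:
            ratios[resource] = ippro[resource] / scp[resource]
    return ratios


if __name__ == "__main__":
    for mode, scp in (("Minimum Area", SCP_POINT_MIN_AREA),
                      ("Maximum Performance", SCP_POINT_MAX_PERF)):
        ratios = ippro_vs_scp(scp, IPPRO_POINT_8_CORE)
        cells = ", ".join(
            f"{name}: {'---' if value is None else format(value, '.2f')}"
            for name, value in ratios.items()
        )
        print(f"{mode}: {cells}")
```

The BRAM entry stays undefined because the Point SCP uses no BRAMs, which is consistent with the `---` cells in the ratio table.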
Comparison between a generic and a function-specific SCP.
| SCP Type | FFs | LUTs | BRAMs | DSPs | FPS |
|---|---|---|---|---|---|
| Generic | 9862 | 12,368 | 5 | 72 | 125 |
| Function-specific | 932 | 1107 | 2 | 3 | 128 |