José M Cecilia1, Juan-Carlos Cano1, Juan Morales-García2, Antonio Llanes2, Baldomero Imbernón2.
Abstract
The Internet of Things (IoT) is becoming a new socioeconomic revolution in which data and immediacy are the main ingredients. IoT generates large datasets on a daily basis, but this data is currently considered "dark data", i.e., data that is generated but never analyzed. Analyzing this data efficiently is essential to creating the next generation of intelligent IoT applications that benefit society. Artificial Intelligence (AI) techniques are well suited to identifying hidden patterns and correlations in this data deluge. In particular, clustering algorithms are of the utmost importance for exploratory data analysis, where they identify sets (a.k.a. clusters) of similar objects. Clustering algorithms are computationally heavy workloads that must be executed on high-performance computing (HPC) clusters, especially when dealing with large datasets. Execution on HPC infrastructures is energy hungry and raises additional issues, such as high-latency communications and privacy. Edge computing, a paradigm that enables lightweight computations at the edge of the network, has recently been proposed to solve these issues. In this paper, we provide an in-depth analysis of emergent edge computing architectures that include low-power Graphics Processing Units (GPUs) to speed up these workloads. Our analysis includes performance and power consumption figures for NVIDIA's latest AGX Xavier, comparing the energy-performance ratio of this low-cost platform with that of a high-performance cloud-based counterpart. Three clustering algorithms (k-means, Fuzzy Minimals (FM), and Fuzzy C-Means (FCM)) are designed to execute optimally on edge and cloud platforms, showing a speed-up factor of up to 11× for the GPU code over its sequential counterpart on the edge platform and energy savings of up to 150% between the edge computing and HPC platforms.
Keywords: GPU computing; IoT applications; cloud computing; clustering algorithms; edge computing; intelligent systems; low-power
Year: 2020 PMID: 33172017 PMCID: PMC7664181 DOI: 10.3390/s20216335
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Main features of the targeted clustering algorithms.
| Algorithm | Soft Clustering | Number of Clusters Pre-Fixed | Requisites |
|---|---|---|---|
| K-Means | No | Yes | CWS Clusters |
| FCM | Yes | Yes | CWS Clusters |
| FM | Yes | n.a. | none |
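The soft versus hard distinction in the table above can be illustrated with a minimal Python sketch. It assumes Euclidean distance and the textbook FCM membership formula with fuzzifier `m`; the function names `hard_assignment` and `fuzzy_memberships` are hypothetical, the cluster centers are taken as already given, and this is not the authors' CUDA implementation.

```python
import math


def hard_assignment(point, centers):
    """k-means style (hard): return the index of the nearest center."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))


def fuzzy_memberships(point, centers, m=2.0):
    """FCM style (soft): return one membership degree per center.

    Uses the standard formula u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)),
    so the degrees lie in [0, 1] and sum to 1. Assumes m > 1.
    """
    distances = [math.dist(point, c) for c in centers]
    zeros = distances.count(0.0)
    if zeros:
        # The point coincides with one or more centers: share full
        # membership among them to avoid dividing by zero.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Illustrative values only (not taken from the paper).
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)
print(hard_assignment(point, centers))
print(fuzzy_memberships(point, centers))
```

With these illustrative inputs the hard assignment picks the first center outright, while the fuzzy memberships come out at roughly 0.9 and 0.1, which is what "soft clustering" means in the table.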
Figure 1. The system infrastructure in a nutshell.
Figure 2. Execution time (in seconds) of the k-means (right-hand side) and Fuzzy C-Means (FCM) (left-hand side) clustering algorithms for the three experiments described in Section 4.1. GPU and CPU versions are executed on the HPC platform.
Figure 3. Execution time (in seconds) of the Fuzzy Minimals (FM) algorithm for the first (a) and second (b) experiments on the HPC platform, comparing the CPU and GPU versions.
Figure 4. Execution time (in seconds) of the k-means and FCM algorithms for the three benchmarks on the NVIDIA AGX Xavier.
Figure 5. Execution time (in seconds) of the FM algorithm for Experiments 1 and 2 on the NVIDIA AGX Xavier.
Comparison of the execution time (in seconds) of the GPU and CPU implementations of the k-means algorithm between the HPC and edge computing platforms.
| Rows | AGX Xavier CPU | AGX Xavier GPU | HPC Platform CPU | HPC Platform GPU | Speed-Up Factor (HPC vs. Edge), CPU | Speed-Up Factor (HPC vs. Edge), GPU |
|---|---|---|---|---|---|---|
| 100 | 0.004 | 0.007 | 0.065 | 0.035 | 0.1 | 0.2 |
| 1000 | 0.112 | 0.020 | 0.104 | 0.040 | 1.1 | 0.5 |
| 10,000 | 1.335 | 0.159 | 0.587 | 0.052 | 2.3 | 3.1 |
| 100,000 | 19.944 | 1.761 | 7.544 | 0.469 | 2.6 | 3.8 |
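The speed-up columns of the k-means table can be recomputed from the timings. The short Python sketch below assumes that the HPC-vs-edge speed-up is the AGX Xavier time divided by the HPC time; this definition is inferred from the numbers rather than stated in the caption, and the names `KMEANS_TIMES`, `speed_up`, and `hpc_vs_edge` are hypothetical.

```python
# Execution times in seconds, copied from the k-means table:
# rows -> (AGX CPU, AGX GPU, HPC CPU, HPC GPU)
KMEANS_TIMES = {
    100: (0.004, 0.007, 0.065, 0.035),
    1000: (0.112, 0.020, 0.104, 0.040),
    10000: (1.335, 0.159, 0.587, 0.052),
    100000: (19.944, 1.761, 7.544, 0.469),
}


def speed_up(slow_seconds, fast_seconds):
    """Ratio of the slower time to the faster time (assumed definition)."""
    return slow_seconds / fast_seconds


def hpc_vs_edge(times):
    """Return the (CPU, GPU) HPC-vs-edge speed-ups, rounded to one decimal."""
    agx_cpu, agx_gpu, hpc_cpu, hpc_gpu = times
    return (
        round(speed_up(agx_cpu, hpc_cpu), 1),
        round(speed_up(agx_gpu, hpc_gpu), 1),
    )


for rows, times in KMEANS_TIMES.items():
    print(rows, hpc_vs_edge(times))

# GPU vs. CPU on the edge platform alone, for the largest dataset.
agx_cpu, agx_gpu, _, _ = KMEANS_TIMES[100000]
print(round(speed_up(agx_cpu, agx_gpu), 1))
```

Under this assumption the recomputed values match the table's last two columns. A value below 1 (as for 100 rows) means the edge platform was the faster one. The final ratio, about 11.3 for 100,000 rows, appears consistent with the "up to 11×" GPU speed-up on the edge platform quoted in the abstract.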
Comparison of the execution time (in seconds) of the GPU and CPU implementations of the FCM algorithm between the HPC and edge computing platforms.
| Rows | AGX Xavier CPU | AGX Xavier GPU | HPC Platform CPU | HPC Platform GPU | Speed-Up CPU | Speed-Up GPU |
|---|---|---|---|---|---|---|
| 100 | 107.792 | 0.510 | 2.988 | 0.089 | 36.1 | 5.8 |
| 1000 | 28.262 | 2.246 | 1.033 | 0.093 | 27.4 | 24.2 |
| 10,000 | 44.277 | 11.584 | 1.414 | 0.479 | 31.3 | 24.2 |
| 100,000 | 329.851 | 71.835 | 8.424 | 2.876 | 39.2 | 25.0 |
Comparison of the execution time (in seconds) of the GPU and CPU implementations of the FM algorithm between the HPC and edge computing platforms.
| Rows | AGX Xavier CPU | AGX Xavier GPU | HPC Platform CPU | HPC Platform GPU | Speed-Up CPU | Speed-Up GPU |
|---|---|---|---|---|---|---|
| 100 | 0.045 | 2.663 | 0.126 | 3.015 | 0.4 | 0.9 |
| 1000 | 5.512 | 116.918 | 0.976 | 20.840 | 5.6 | 5.6 |
| 10,000 | 735.811 | 2379.008 | 218.281 | 214.364 | 3.4 | 11.1 |
| 100,000 | 118,556.281 | 83,134.25 | 48,699.251 | 7968.036 | 2.4 | 10.4 |
Figure 6. Energy consumption (in kWh) of the HPC and edge computing platforms for the CUDA-based clustering implementations (i.e., k-means, FCM, and FM). Measurements cover only the GPU installed in the HPC platform (NVIDIA GeForce RTX 2080Ti) but the whole system for the AGX Xavier.