Massimo Merenda, Carlo Porcaro, Demetrio Iero.
Abstract
In a few years, the world will be populated by billions of connected devices placed in our homes, cities, vehicles, and industries. Devices with limited resources will interact with the surrounding environment and users. Many of these devices will rely on machine learning models to decode the meaning and behavior behind sensor data, make accurate predictions, and take decisions. The bottleneck will be the sheer number of connected things, which could congest the network; hence the need to incorporate intelligence on end devices using machine learning algorithms. Deploying machine learning on such edge devices reduces network congestion by allowing computations to be performed close to the data sources. The aim of this work is to review the main techniques that enable the execution of machine learning models on low-performance hardware in the Internet of Things paradigm, paving the way to the Internet of Conscious Things. A detailed review of models, architectures, and requirements for solutions that implement edge machine learning on Internet of Things devices is presented, with the main goal of defining the state of the art and envisioning development requirements. Furthermore, an example of an edge machine learning implementation on a microcontroller, commonly regarded as the machine learning "Hello World", is provided.
Keywords: Internet of Things; artificial intelligence; deep learning; edge devices; machine learning
Year: 2020 PMID: 32365645 PMCID: PMC7273223 DOI: 10.3390/s20092533
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Edge computing interest (Google Trends).
Figure 2. Deep Neural Network (DNN) example.
Figure 3. (a) Hyperplane that separates two classes of data, (b) kernel trick.
Figure 4. (a) On-device computation, (b) edge server-based architectures, and (c) joint computation.
Figure 5. Pruning effect on the network.
Hardware used for Internet of Things (IoT) devices that implement edge computing.
| Work | DNN Model | Application | End Devices | Key Metrics |
|---|---|---|---|---|
| This work ( | CNN | Image Recognition | STM32F401RE | fast inference |
| [ | SVM | Image Recognition | Raspberry Pi model 3 | fast inference |
| [ | DNN | Distributed Computing | Raspberry Pi model 3 | hierarchical |
| [ | SVM, CNN | Video Analysis | Raspberry Pi model 3 | fast inference |
| [ | SVM | Video Analysis | Raspberry Pi model 3 | fast inference |
| [ | SVM | Battery Lifetime Estimation | SPHERE | energy |
| [ | CNN | Image Recognition, Sensor Fusion | Motorola 68HC11 | fast inference |
| [ | SVM | Code execution | ARM® v7 | accuracy |
| [ | Logistic Regression | Human Activity Recognition | ESP32 | accuracy |
| [ | CNN | Speech Recognition | Sparkfun Edge | accuracy |
Main communication technologies used in IoT.
| Group | Technology | Data Rate | Distance (Indoor/Outdoor) | Works |
|---|---|---|---|---|
| Contactless | NFC | 424 kbps | 0–4 cm | [ |
| Contactless | RFID | 640 kbps | 10–20 m | [ |
| LPWAN | LoRa | 0.3 to 50 kbps | 5–10 km | [ |
| LPWAN | SigFox | 100 or 600 bps | 30–50 km | [ |
| WPAN | Zigbee | 250 kbps | 10–100 m | [ |
| WPAN | Z-Wave | 100 kbps | 100 m | [ |
| WPAN | Bluetooth LE | 1 Mbps | 10 m/50 m | [ |
| WPAN | Bluetooth 5 | 2 Mbps | 40 m/200 m | [ |
| WPAN | ANT | 60 kbps | 30 m | [ |
| WiFi | IEEE 802.11n | 600 Mbps | 70 m/250 m | [ |
| WiFi | IEEE 802.11ax | 9600 Mbps | 30 m/120 m | [ |
| WiFi | IEEE 802.11af | 570 Mbps | 280 m/1 km | [ |
| WiFi | IEEE 802.11ah | 347 Mbps | 140 m/500 m | [ |
| Cellular | NB-IoT | 200 kbps | 280 m/1 km | [ |
| Cellular | LTE-M1 | 1 Mbps | 5–100 km | [ |
| Cellular | 4G/LTE | 150 Mbps | 15 km | [ |
| Cellular | 5G | 10–50 Gbps | 2 km | [ |
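To make the congestion argument concrete, the data rates in the table above can be turned into an idealized transmission time for a single sensor sample. The sketch below is only illustrative: the payload (one 28 × 28 image at 8 bits per pixel) is an assumption, and the calculation ignores protocol overhead, duty-cycle limits, and retransmissions, so real airtimes would be longer.

```python
# Idealized airtime = payload bits / nominal data rate.
# Assumption: the payload is one 28x28 grayscale image, 1 byte per pixel.
PAYLOAD_BYTES = 28 * 28

# Nominal data rates in bits per second, taken from the table above.
data_rate_bps = {
    "SigFox (100 bps)": 100,
    "LoRa (0.3 kbps)": 300,
    "LoRa (50 kbps)": 50_000,
    "NB-IoT": 200_000,
    "Zigbee": 250_000,
    "Bluetooth LE": 1_000_000,
}


def airtime_seconds(payload_bytes: int, rate_bps: float) -> float:
    """Return the ideal time in seconds to send payload_bytes at rate_bps."""
    return payload_bytes * 8 / rate_bps


for name, rate in data_rate_bps.items():
    print(f"{name:<18} {airtime_seconds(PAYLOAD_BYTES, rate):10.4f} s")
```

Under these assumptions, sending raw samples over the LPWAN links takes seconds per sample, which is one way to see why running inference on the device and transmitting only the result can be attractive.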
Figure 6. Joint computation among devices, edge, and cloud servers.
Artificial Intelligence (AI) accelerator devices that implement edge computing.
| Work | DNN Model | Application | End Devices |
|---|---|---|---|
| [ | SVM/CNN | Image and Video Analysis | Movidius |
| [ | CNN | Image and Video Analysis, Robotics | Jetson TX1 |
| [ | YOLO [ | Image Recognition, Robotics | Jetson TX2 |
| [ | AlexNet | Image Classification | Nvidia Tegra K1 |
| [ | CNN | Image Analysis | Neuflow |
| [ | CNN, DNN | Image Recognition | DianNao |
| [ | CNN | Vision Processing | ShiDianNao |
Accuracy for different activation functions.
| First Level | Second Level | Accuracy on Test |
|---|---|---|
| relu | relu | 96.20% |
| tanh | tanh | 96.80% |
| sigmoid | sigmoid | 96.96% |
| relu | tanh | 97.18% |
| tanh | relu | 96.64% |
| sigmoid | relu | 96.88% |
| relu | sigmoid | 97.25% |
| tanh | sigmoid | 97.21% |
| sigmoid | tanh | 97.10% |
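For reference, the three activation functions compared in the table above have the standard scalar definitions sketched below. This is a plain-Python illustration of the functions themselves, not the authors' training code. How the two levels were configured in their framework is not shown in this record.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes negative ones."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x: float) -> float:
    """Hyperbolic tangent: squashes any real input into (-1, 1)."""
    return math.tanh(x)


for x in (-2.0, 0.0, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  "
          f"sigmoid={sigmoid(x):.4f}  tanh={tanh(x):.4f}")
```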
Model outline.
| Layer (Type) | Output Shape | Param # |
|---|---|---|
| conv2d_1 (Conv2D) | (None, 26, 26, 32) | 320 |
| conv2d_2 (Conv2D) | (None, 24, 24, 64) | 18496 |
| max_pooling2d_1 (MaxPooling2D) | (None, 12, 12, 64) | 0 |
| dropout_1 (Dropout) | (None, 12, 12, 64) | 0 |
| flatten_1 (Flatten) | (None, 9216) | 0 |
| dense_1 (Dense) | (None, 64) | 589888 |
| dropout_2 (Dropout) | (None, 64) | 0 |
| dense_2 (Dense) | (None, 10) | 650 |
#: Total parameters, 609,354; trainable parameters, 609,354; nontrainable parameters, 0.
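The parameter counts in the model outline can be checked by hand. The sketch below assumes 3 × 3 convolution kernels and a single-channel input; neither is stated in the table, but both are consistent with the output shapes (each convolution shrinks the spatial size by 2) and with the 320 parameters of the first layer. The helper names are illustrative only.

```python
def conv2d_params(kernel_h: int, kernel_w: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a 2D convolution layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_features: int, units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return (in_features + 1) * units


# Assumptions: 3x3 kernels, 1 input channel (inferred, not stated in the table).
layers = {
    "conv2d_1": conv2d_params(3, 3, 1, 32),
    "conv2d_2": conv2d_params(3, 3, 32, 64),
    "dense_1": dense_params(12 * 12 * 64, 64),  # flatten output: 9216 features
    "dense_2": dense_params(64, 10),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:<10} {count:>8}")
print(f"{'total':<10} {total:>8}")
```

Pooling, dropout, and flatten layers carry no parameters, so the four layers above account for the full total reported in the table footnote.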
Figure 7. Digit “6” drawn by the user.
Figure 8. STMicroelectronics NUCLEO-F746ZG.
Prediction of model on hardware.
| Name | RAM | FLASH | Complexity |
|---|---|---|---|
| Network | 135.68 kBytes | 668.97 kBytes | 11497654 MAC |
Figure 9. Inference result showing the recognition of the digit “6” drawn by the user (accuracy = 100.00%, root-mean-square error (rmse) = 0.0000, mean absolute error (mae) = 0.0000, 10 classes, 1 sample).
Figure 10. Inference details.
Time contribution of each layer.
| Description | Shape | ms |
|---|---|---|
| 10004/(2D Convolutional) | (26, 26, 32) | 9.328 |
| 10011/(Merged Conv2d/Pool) | (12, 12, 64) | 299.524 |
| 10005/(Dense) | (1, 1, 64) | 19.562 |
| 10009/(Nonlinearity) | (1, 1, 64) | 0.006 |
| 10005/(Dense) | (1, 1, 10) | 0.022 |
| 10009/(Nonlinearity) | (1, 1, 10) | 0.014 |
| Total | | 328.458 |
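The per-layer timings can be converted into shares of the total inference time to see where the microcontroller spends its effort. The short sketch below uses only the values from the table above; the layer labels are shortened for readability. The rounded rows sum to 328.456 ms, slightly below the reported total of 328.458 ms, presumably because of rounding in the individual entries.

```python
# Per-layer inference times in milliseconds, copied from the table above.
layer_ms = [
    ("2D Convolutional", 9.328),
    ("Merged Conv2d/Pool", 299.524),
    ("Dense (64)", 19.562),
    ("Nonlinearity (64)", 0.006),
    ("Dense (10)", 0.022),
    ("Nonlinearity (10)", 0.014),
]

total_ms = sum(ms for _, ms in layer_ms)
shares = {name: 100.0 * ms / total_ms for name, ms in layer_ms}

# Print layers from most to least expensive.
for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {pct:6.2f} %")
print(f"{'sum of rows':<20} {total_ms:.3f} ms")
```

On these numbers the merged second convolution and pooling stage accounts for roughly nine tenths of the inference time, so it is the natural first target for any optimization.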