Roman Tkachenko, Ivan Izonin, Natalia Kryvinska, Ivanna Dronyuk, Khrystyna Zub.
Abstract
This paper aims to improve prediction accuracy in the task of recovering missing IoT data. To this end, the authors developed a new ensemble of neural network tools consisting of two successive General Regression Neural Networks (GRNNs) and one neural-like structure of the Successive Geometric Transformation Model (SGTM). The principle of constructing the ensemble topology from two successively connected GRNNs, supplemented with an SGTM neural-like structure, is mathematically substantiated. The method's effectiveness rests on replacing the plain summation of the two GRNN outputs with a weighted summation, which improves the accuracy of the ensemble as a whole. A detailed algorithmic implementation of the ensemble method and a flowchart of its operation are presented. The parameters of the ensemble are determined by brute-force optimization. The developed method is applied to fill in partially missing values in a real air-environment monitoring dataset collected by an IoT device. A comparison with existing methods shows that the developed ensemble achieves the highest accuracy, in terms of Mean Absolute Percentage Error (MAPE) and Root Mean Squared Error (RMSE), among the most similar methods in its class.
Keywords: ANN techniques; GRNN; IoT sensors; Successive Geometric Transformation Model; data imputation; hybrid systems; missing data; neural-like structures; non-iterative training; weighted summation
Year: 2020; PMID: 32375400; PMCID: PMC7249176; DOI: 10.3390/s20092625
Source DB: PubMed; Journal: Sensors (Basel); ISSN: 1424-8220; Impact factor: 3.576
Reasons for omissions in data collected by IoT devices.
| Reasons | Investigations |
|---|---|
| the unstable network communication, synchronization problems, unreliable sensor devices, environmental factors, and other device malfunctions; | [ |
| the interruption of the data acquisition in long-term monitoring scenarios; | [ |
| the location: firmware may not be consistent across locations, which could mean differences in reporting frequency or formatting of values; | [ |
| the sensor failures, monitoring system failures or network failures; | [ |
| the storage errors, unreliable IoT devices, unstable network status; | [ |
| the incorrect response or nonresponse of the IoT-based sensors; | [ |
| the collision of the nodes when the information passes from sender to receiver; | [ |
| the channel effects and mobility of the end-devices; | [ |
| the errors in data collection and transmission; | [ |
| the data integration from different sources into a unified schema; | [ |
| the lack of battery power, communication errors, and malfunctioning devices. | [ |
Figure 1. General Regression Neural Network (GRNN) topology.
Figure 2. Topology of the additional correction linear neural-like structure of the Successive Geometric Transformation Model (SGTM).
Figure 3. Flowchart of the GRNN–SGTM ensemble for solving the stated task.
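The topology named in these captions can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the function names (`grnn_predict`, `fit_ensemble`, `predict_ensemble`) are hypothetical; the second GRNN is assumed to be trained on the residuals of the first; training-mode predictions are assumed to be leave-one-out; and ordinary least squares is used as a plain stand-in for the SGTM neural-like structure that learns the weighted summation.

```python
import numpy as np


def grnn_predict(X_train, y_train, X_query, sigma, leave_one_out=False):
    """GRNN output: Gaussian-kernel weighted average of the training targets.

    sigma is the smooth factor. With leave_one_out=True, X_query must be
    X_train itself, and each point is excluded from its own prediction.
    """
    # Squared Euclidean distance between every query and training vector.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    if leave_one_out:
        np.fill_diagonal(w, 0.0)
    # Guard against a zero denominator for queries far from all training data.
    return (w @ y_train) / np.maximum(w.sum(axis=1), 1e-300)


def fit_ensemble(X, y, sigma1, sigma2):
    """Fit two successive GRNNs and a linear weighted summation of their outputs."""
    # Stage 1: first GRNN predicts the target (leave-one-out, an assumption).
    p1 = grnn_predict(X, y, X, sigma1, leave_one_out=True)
    # Stage 2: second GRNN predicts the residual of the first (an assumption).
    r = y - p1
    p2 = grnn_predict(X, r, X, sigma2, leave_one_out=True)
    # Weighted summation: least squares stands in for the SGTM structure.
    A = np.column_stack([p1, p2, np.ones_like(p1)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"X": X, "y": y, "r": r, "s1": sigma1, "s2": sigma2, "coef": coef}


def predict_ensemble(model, X_query):
    """Apply both GRNNs to new vectors and combine them with the learned weights."""
    p1 = grnn_predict(model["X"], model["y"], X_query, model["s1"])
    p2 = grnn_predict(model["X"], model["r"], X_query, model["s2"])
    return np.column_stack([p1, p2, np.ones_like(p1)]) @ model["coef"]
```

Because the plain sum of the two GRNN outputs corresponds to the fixed weights (1, 1, 0), the least-squares weights can never fit the training-mode outputs worse than the plain sum does; whether that carries over to unseen data depends on the dataset.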
The main characteristics of the Internet of Things (IoT)-based dataset.
| Variable | Mean value | Max value | Min value | Chemical nomenclature |
|---|---|---|---|---|
| Tungsten monoxide | 817.0748 | 2683 | 322 | WO |
| Tungsten dioxide | 1452.494 | 2775 | 551 | WO2 |
| Titanium | 958.2302 | 2214 | 390 | Ti |
| Temperature | 17.75942 | 44.6 | 0.1 | T |
| Relative humidity | 48.90163 | 88.7 | 9.2 | RH |
| Non-methane hydrocarbons | 1119.626 | 2040 | 647 | SnO2 |
| Nitrogen monoxide | 250.465 | 1479 | 2 | NO |
| Nitrogen dioxide | 113.7894 | 333 | 2 | NO2 |
| Indium oxide | 1057.363 | 2523 | 221 | InO |
| Carbon monoxide | 2.19059 | 11.9 | 0.1 | CO |
| Benzene | 10.54635 | 63.7 | 0.2 | C6H6 |
| Absolute humidity | 0.986315 | 2.2345 | 0.1847 | AH |
Figure 4. Root Mean Square Error (RMSE) values under different combinations of the smooth factors of both GRNN ensemble networks: (a) in the training mode and (b) in the application mode.
Figure 5. Mean Absolute Percentage Error (MAPE) values under different combinations of the smooth factors of both GRNN ensemble networks: (a) in the training mode and (b) in the application mode.
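For reference, the two accuracy measures plotted in Figures 4 and 5 can be computed as in the sketch below. These are the standard textbook definitions; the function names are hypothetical, and the paper's exact normalization of the data before scoring is not stated here, so the numbers this produces are not guaranteed to match the reported ones.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when y_true contains zeros, since each error is divided
    by the corresponding true value.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
```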
Optimal parameters of the proposed ensemble operation.
|  |  | MAPE, % | RMSE |
|---|---|---|---|
| 0.23 | 0.05 | 20.268 (train mode) | 0.493 (train mode) |
|  |  | 18.828 (test mode) | 0.458 (test mode) |
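The brute-force optimization mentioned in the abstract amounts to evaluating an error measure for every pair of candidate parameter values and keeping the best pair. The sketch below shows that idea only. The function `brute_force_search`, the grid step of 0.05, and the objective `toy_score` are all assumptions made for illustration; in practice the objective would be the ensemble's training-mode MAPE or RMSE, and the toy minimum at (0.3, 0.1) has no connection to the values in the table above.

```python
import itertools


def brute_force_search(score, grid1, grid2):
    """Evaluate score(s1, s2) for every grid pair and return (best, s1, s2)."""
    best = None
    for s1, s2 in itertools.product(grid1, grid2):
        value = score(s1, s2)
        # Keep the pair with the lowest error seen so far.
        if best is None or value < best[0]:
            best = (value, s1, s2)
    return best


def toy_score(s1, s2):
    """Hypothetical stand-in objective with its minimum at (0.3, 0.1)."""
    return (s1 - 0.3) ** 2 + (s2 - 0.1) ** 2


# Candidate values 0.05, 0.10, ..., 1.00 for each parameter (assumed grid).
grid = [round(0.05 * k, 2) for k in range(1, 21)]
best_value, best_s1, best_s2 = brute_force_search(toy_score, grid, grid)
```

The cost grows with the product of the two grid sizes, which stays manageable for two parameters but is the main reason exhaustive search does not scale to many of them.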
Comparison of operation accuracy of all the methods investigated.
| Method | Parameters | RMSE | MAPE, % |
|---|---|---|---|
| GRNN [ | input neurons = 11, | 0.464 | 19.856 |
| Extended-inputs GRNN [ | input neurons = 78, | 0.549 | 19.905 |
| SGTM neural-like structure (test mode) [ | input neurons = 11, | 0.497 | 20.491 |
| Extended-input SGTM neural-like structure (test mode) [ | input neurons = 78, | 0.458 | 19.911 |
| GRNN-SGTM ensemble (test mode) | parameters are given above in the text | 0.458 | 18.828 |