Athina Tsanousa, Evangelos Bektsis, Constantine Kyriakopoulos, Ana Gómez González, Urko Leturiondo, Ilias Gialampoukidis, Anastasios Karakostas, Stefanos Vrochidis, Ioannis Kompatsiaris.
Abstract
Manufacturing companies are becoming increasingly "smarter" as a result of the Industry 4.0 revolution. Multiple sensors are used for industrial monitoring of machines and workers in order to detect events and consequently improve manufacturing processes, lower the respective costs, and increase safety. Multisensor systems produce large amounts of heterogeneous data. Data fusion techniques address the issue of multimodality by combining data from different sources, thereby improving the results of monitoring systems. This paper presents a detailed review of state-of-the-art data fusion solutions, covering data storage and indexing for various types of sensors, feature engineering, and multimodal data integration. The review aims to serve as a guide for the early stages of an analytic pipeline for manufacturing prognosis. The reviewed literature showed that, in both fusion and preprocessing, the methods applied in this sector are beyond the state of the art. Existing weaknesses and gaps that lead to future research goals were also identified.
Keywords: data fusion; feature extraction; industrial prognosis; smart manufacturing
Year: 2022 PMID: 35270880 PMCID: PMC8914726 DOI: 10.3390/s22051734
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Flowchart that presents the typical analytic pipeline followed in manufacturing applications.
Applications of software tools for data management.
| Software Tool | Application | Reference |
|---|---|---|
| Apache Hadoop | Hadoop is a framework that allows for the distributed processing of large datasets across clusters of computers using simple programming models. It has been used for different kinds of applications, such as frameworks that optimise and organise the way big data can be searched and accessed. There are also applications that store data derived from sensors monitoring environmental air pollution. Lastly, tuning systems have been designed to improve the performance of Hadoop and MapReduce. | [ |
| Apache Storm | One of the most capable software solutions for Big Data is Apache Storm. Several applications exist that employ it. Some of them use it as a data streaming and real-time processing platform, while others create frameworks for dynamically scaling for the analysis of streaming data. Finally, there are multisensor data fusion frameworks that employ Apache Storm due to its high reliability and good processing mode. | [ |
| Apache Flume | Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of event data. It has been used for various kinds of applications such as healthcare and manufacturing. Frameworks have been designed so that the computational scalability of sensor network data can be achieved. | [ |
| Apache Spark | The aim of Spark is to make data analytics programs run faster by offering a general execution model that optimises arbitrary operator graphs and supports in-memory computing. Most applications use it for sensor analytics. It has been deployed on both industrial and non-industrial applications and can be integrated into pre-existing frameworks. | [ |
| Apache Kafka | Kafka is well suited for the situations where users need to process real-time data and analyse them. There are papers that focused on learning how to reliably transfer data and studied its application in collaboration with other software solutions. | [ |
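
The "simple programming models" mentioned for Hadoop refer chiefly to MapReduce. The following is a minimal single-process sketch of the map, shuffle, and reduce pattern applied to sensor readings. It is not the Hadoop API: the function names and the sample readings are hypothetical and chosen only for illustration, and a real deployment would distribute each phase across a cluster.

```python
from collections import defaultdict


def map_phase(records):
    """Map step: emit one (sensor_id, value) pair per raw record."""
    for sensor_id, value in records:
        yield sensor_id, value


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each group to a single value (here, the mean)."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical readings from two sensors (assumed data, not from the paper).
readings = [("temp", 20.0), ("vib", 0.5), ("temp", 22.0), ("vib", 1.5)]
averages = reduce_phase(shuffle(map_phase(readings)))
print(averages)
```

Because each phase only sees key-value pairs, the map and reduce steps can in principle run on different machines, which is the property that frameworks of this kind exploit.
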
Summarised features according to the domain.
| Domain | Features | Reference |
|---|---|---|
| Time domain | Mean, maximum, minimum, amplitude, variance, standard deviation, skewness, kurtosis, root mean square, peak-to-peak, autoregressive coefficients, overshoot, settling time, rise time | [ |
| Frequency domain | Spectral statistic moments, highest peak amplitude, sum of peak amplitudes | [ |
| Time-frequency domain | Wavelet energy, Wigner–Ville | [ |
| Two-dimensional domain | RGB, LUV, HSV, HMMD, homogeneity, entropy, contrast, correlation, body shape, length, width, height | [ |
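
Several of the time-domain features listed above, and the highest peak amplitude from the frequency domain, can be computed directly from a sampled signal. The sketch below uses only the Python standard library. It assumes population (not sample) statistics and non-excess kurtosis, and it uses a naive discrete Fourier transform rather than an FFT; the function names and test signals are hypothetical.

```python
import cmath
import math


def time_domain_features(x):
    """Compute basic statistical features of a 1-D signal (population statistics)."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(variance)
    feats = {
        "mean": mean,
        "max": max(x),
        "min": min(x),
        "variance": variance,
        "std": std,
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "peak_to_peak": max(x) - min(x),
    }
    if std > 0:  # skewness and kurtosis are undefined for a constant signal
        feats["skewness"] = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
        feats["kurtosis"] = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    return feats


def highest_peak_amplitude(x):
    """Largest single-sided DFT amplitude, excluding the DC and Nyquist bins."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 samples")
    amplitudes = []
    for k in range(1, (n + 1) // 2):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(x))
        amplitudes.append(2 * abs(coeff) / n)
    return max(amplitudes)


# Hypothetical test signals (assumed data).
features = time_domain_features([1.0, 2.0, 3.0, 4.0, 5.0])
sine = [3.0 * math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
peak = highest_peak_amplitude(sine)
print(features, peak)
```

For long signals, an FFT-based implementation would replace the quadratic-time loop in `highest_peak_amplitude`.
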
Fusion environments.
| Environment | Subfield | Reference |
|---|---|---|
| Distributed | Data sensing | [ |
| | Kalman filtering | [ |
| | Energy/cost efficiency | [ |
| Heterogeneous | Data correlation | [ |
| | Distributed filtering | [ |
| | Heterogeneous data | [ |
| | Fuzzy logic/Kalman filter | [ |
| | Canonical correlation analysis | [ |
| | Multimodal fusion | [ |
| Non-linear | Multisensor data fusion | [ |
| | Sensor-dense IoT networks | [ |
| | Fusion based on fuzzy logic | [ |
| Object-tracking | Assembly line | [ |
| | Transportation network | [ |
| | Online multi-object tracking | [ |
| | Energy efficiency for target tracking | [ |
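
Kalman filtering appears in the table above as a fusion approach. As a rough illustration, the sketch below fuses two noisy sensors that observe the same scalar quantity by applying the standard Kalman measurement update once per sensor at each time step. It assumes a constant-state model, known measurement variances, and hypothetical readings; it is not taken from any of the reviewed works.

```python
def kalman_update(estimate, variance, measurement, meas_variance):
    """Standard scalar Kalman measurement update."""
    gain = variance / (variance + meas_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def fuse_step(estimate, variance, measurements, process_variance=0.0):
    """Predict under a constant-state model, then apply each sensor's update in turn."""
    variance += process_variance
    for value, meas_variance in measurements:
        estimate, variance = kalman_update(estimate, variance, value, meas_variance)
    return estimate, variance


# Start from a nearly uninformative prior.
estimate, variance = 0.0, 1000.0

# Hypothetical readings: sensor A has variance 1.0, sensor B has variance 4.0.
for reading_a, reading_b in [(10.2, 9.0), (9.8, 11.5), (10.1, 10.4)]:
    estimate, variance = fuse_step(
        estimate, variance, [(reading_a, 1.0), (reading_b, 4.0)]
    )

print(estimate, variance)
```

The fused variance ends up below that of either sensor alone, which is the basic motivation for combining sources with different noise levels.
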
Summary of the fusion methods.
| Fusion Level | Fusion Method | Reference |
|---|---|---|
| Feature | Information theory | [ |
| Feature/late | Bayesian-based fusion | [ |
| Feature | D(C)NNs | [ |
| Feature | Feature elimination/concatenation | [ |
| Late | Dempster–Shafer theory | [ |
| Feature | Random forest based | [ |
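
Among the late fusion methods above, Dempster–Shafer theory combines the belief masses produced by independent sources. The following is a minimal sketch of Dempster's rule of combination for two sources. The hypothesis labels, the sensor names, and the mass values are hypothetical and serve only as an illustration.

```python
def combine(m1, m2):
    """Dempster's rule: combine two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses counts as conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise so that the remaining masses sum to one.
    return {key: mass / (1.0 - conflict) for key, mass in combined.items()}


FAULT = frozenset({"fault"})
OK = frozenset({"ok"})
EITHER = FAULT | OK  # mass on the whole frame expresses ignorance

# Hypothetical outputs of two independent classifiers (assumed values).
m_vibration = {FAULT: 0.6, EITHER: 0.4}
m_thermal = {FAULT: 0.5, OK: 0.2, EITHER: 0.3}

fused = combine(m_vibration, m_thermal)
print(fused)
```

Unlike a plain Bayesian posterior, the combined result keeps an explicit mass on `EITHER`, so residual uncertainty stays visible after fusion.
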