Klaus Kammerer1, Rüdiger Pryss2, Burkhard Hoppenstedt1, Kevin Sommer3, Manfred Reichert1.
Abstract
For machine manufacturing companies, besides the production of high-quality and reliable machines, requirements have emerged to maintain machine-related aspects through digital services. The development of such services in the field of the Industrial Internet of Things (IIoT) deals with solutions such as effective condition monitoring and predictive maintenance. However, appropriate data sources are needed on which digital services can be technically based. As many powerful and cheap sensors have been introduced in recent years, their integration into complex machines is promising for developing digital services for various scenarios. Components that handle the data recorded by these sensors must usually deal with large amounts of data. In particular, the labeling of raw sensor data must be supported by a technical solution. To deal with these data handling challenges in a generic way, a sensor processing pipeline (SPP) was developed, which provides effective methods to capture, process, store, and visualize raw sensor data based on a processing chain. The SPP approach is presented in this work using the example of a machine manufacturing company. For the company involved, the approach has revealed promising results.
Keywords: cyber-physical systems; data stream processing; processing pipeline; sensor networks
Year: 2020 PMID: 32937993 PMCID: PMC7570670 DOI: 10.3390/s20185245
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Signals of a compacting station.
Sensor data processing requirements.
| Requirement | SPP | Apache Spark | Apache Flink | Apache Storm | Dunkel et al. | Borealis | LWDF | FastFlow | KNIME | Schweppe et al. |
|---|---|---|---|---|---|---|---|---|---|---|
| R1: Stream processing | X | X | X | X | X | X | X | X | - | X |
| R2: Variability support (machine/sensor variants) | X | (X) | (X) | X | ? | ? | (X) | (X) | (X) | (X) |
| R3: Support for (non-)equidistant data streams | X | X | X | X | ? | ? | - | X | X | ? |
| R4: Data model transformations | X | X | X | X | ? | ? | - | X | X | X |
| R5: Gap detection/interpolation | X/X | X/X | X/X | X/X | ?/? | X/X | X/- | X/X | X/X | -/- |
| R6: Parallel processing | X | X | X | X | X | X | - | X | X | X |
| R7: Flexible time bases for sensor data | X | - | - | - | - | - | - | - | - | - |
| R8: Grouping of streams | X | X | X | X | - | ? | - | - | - | - |
| R9: Time synchronization of signals in a group | X (relative, absolute) | - | - | - | - | - | - | - | - | - |
| R10: Windowing based on machine speed | X | - | - | - | - | - | - | - | - | - |
| R11: Windowing trigger definition usable for multiple streams | X | - | - | - | n/a | - | - | - | - | - |
| R12: Commodity hardware support | X | - | - | - | n/a | (X) | X | X | - | X |
X = supported, (X) = partly supported, - = not supported, n/a = not applicable, ? = no information available; SPP = sensor processing pipeline, LWDF = lightweight dataflow, KNIME = Konstanz Information Miner; stream processing is under development; pipeline for a set of variants; query per variant; via TopologyBuilder; flow per variant; pipeline per variant; with BTTM—see Section 5.1.1; windowing adaption across multiple streams calculated based on other streams (correlated windowing).
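Requirement R5 (gap detection/interpolation) can be illustrated with a minimal sketch. It is not the SPP implementation: the function names, the `tolerance` parameter, and the choice of linear interpolation are assumptions made for illustration, and the stream is assumed to be nominally equidistant with a known sampling period.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value); hypothetical layout


def find_gaps(samples: List[Sample], period: float,
              tolerance: float = 0.5) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps of neighbours spaced wider than expected."""
    gaps = []
    for (t0, _), (t1, _) in zip(samples, samples[1:]):
        if t1 - t0 > period * (1 + tolerance):
            gaps.append((t0, t1))
    return gaps


def fill_gaps(samples: List[Sample], period: float,
              tolerance: float = 0.5) -> List[Sample]:
    """Insert linearly interpolated samples wherever a gap is detected."""
    if not samples:
        return []
    filled = [samples[0]]
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt > period * (1 + tolerance):
            missing = round(dt / period) - 1  # number of lost samples
            for k in range(1, missing + 1):
                t = t0 + k * period
                filled.append((t, v0 + (v1 - v0) * (t - t0) / dt))
        filled.append((t1, v1))
    return filled


raw = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (5.0, 5.0)]
gaps = find_gaps(raw, period=1.0)
repaired = fill_gaps(raw, period=1.0)
```

Here the samples at 2.0 s and 3.0 s are missing, so `find_gaps` reports one gap and `fill_gaps` reconstructs the two lost points. For non-equidistant streams (R3), a fixed `period` would not apply and a different gap criterion would be needed.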
Overview of discussed aspects and goals.
| Discussed Aspects | |
|---|---|
| Uhlmann Pac-Systeme GmbH and Co. KG | The case study presented in |
| Programmable logic controller | Acts as a data source for production systems. |
| Time management in PLCs | Precise clock synchronization across different systems is required to correlate distributed data. |
| Information flow processing | To be able to use a large amount of sensor data in downstream systems, they must be converted, normalized and interpreted into standardized data models. |
| Data stream windows | PLCs provide continuous data streams. In order to be able to correlate and analyze them, these data streams must be divided into individual data packets with the help of windowing. This procedure is not trivial and requires application-specific customization. |
| Context-aware Process Management | In order to enable business processes to process and interpret events generated by business processes, custom concepts are required for the integration and processing of these events within process aware information systems. |
| Context-aware Process Injection | Cyber-physical systems are subject to constant state changes, which can also have an impact on business processes. In order to implement changes in already running process instances, we introduce the concept of context-aware process injection (CaPI). |
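The data stream windows named in the table above split a continuous stream into finite packets. The following is a minimal count-based sketch of tumbling and sliding windows; the function names and the decision to drop an incomplete trailing window are assumptions for illustration, not details taken from the SPP.

```python
from typing import Iterator, List, Sequence


def sliding_windows(stream: Sequence[float], size: int,
                    step: int) -> Iterator[List[float]]:
    """Yield windows of `size` samples, advancing by `step` samples each time."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # An incomplete window at the end of the stream is dropped (assumption).
    for start in range(0, len(stream) - size + 1, step):
        yield list(stream[start:start + size])


def tumbling_windows(stream: Sequence[float],
                     size: int) -> Iterator[List[float]]:
    """Tumbling windows are sliding windows whose step equals their size."""
    return sliding_windows(stream, size, step=size)


data = [1, 2, 3, 4, 5, 6]
tumbling = list(tumbling_windows(data, size=2))
sliding = list(sliding_windows(data, size=3, step=1))
```

With `step` smaller than `size`, consecutive windows overlap (sliding); with `step == size`, they partition the stream without overlap (tumbling).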
Figure 2. An Uhlmann pharmaceutical packaging line with its production sections.
Figure 3. Communication schema.
Figure 4. Information flow processing schema, adapted from [9].
Figure 5. Window types and sliding window concept.
Figure 6. Excerpt of a machine maintenance process model (simplified).
Figure 7. Schematic overview of context-aware process execution framework.
Figure 8. Schematic overview of the sensor processing pipeline (SPP).
Figure 9. BTTM frames.
Figure 10. BTTM group state diagram.
Figure 11. Examples of BTTM transmissions.
Figure 12. Processing pipeline concept.
Figure 13. Schema of an SPP node.
Figure 14. Overview of the sensor processing pipeline.
Figure 15. Correlated period of two sensor signals.
Figure 16. Data point synchronization.
Data Types and their Properties.
| Data Type (Schema) | Volume | Search Criteria | Retention |
|---|---|---|---|
| Sensor Data (fixed) | high | timestamp | flexible |
| Reference Data (flexible) | low | freely selectable | flexible |
| Results Data (flexible) | medium | timestamp | flexible |
| Pipeline Configuration (fixed) | low | none | permanent |
Figure 17. Schedule controller.
Figure 18. User interface of the SPP data visualizer.
General Aspects.
| Aspect | SPP | Apache Spark | Apache Flink | Apache Storm | Borealis | FastFlow | KNIME | Schweppe et al. |
|---|---|---|---|---|---|---|---|---|
| Intended use | Sensor data processing of synchronized PLC-based machines/systems | Framework for cluster computing | Stream processor framework | Stream processor framework | Stream processor framework | C++ parallel programming framework | Data analytics platform | Automotive diagnosis data analysis (OBD + CAN) |
| Target runtime environment (R12) | Commodity hardware | Distributed datacenter environment (Hadoop cluster) | Distributed datacenter environment | Distributed datacenter environment | Commodity/datacenter | Commodity | Commodity/datacenter | Embedded controller |
| Required RAM (R12) | 512 MB (.NET Framework) + 50 MB (SPP) + node implementations + processing data = 562 MB++ | Single: 8 GB (recommended) + node implementations + processing data | Single: 4 GB (minimum), 8 GB (recommended) + node implementations + processing data | Single: 8 GB (recommended) Cluster: 2 Nimbus nodes (2 × 8 GB), 2 worker nodes (2 × 8 GB) = 32 GB++ (recommended) | 2 GB per node | ? | 8 GB (recommended) | < 128 MB |
Functional Requirements.
| Requirement | SPP | Apache Spark | Apache Flink | Apache Storm | Borealis | FastFlow | KNIME | Schweppe et al. |
|---|---|---|---|---|---|---|---|---|
| | Hybrid | Hybrid | Hybrid | Stream | Stream | Stream | Batch 1 | Stream |
| Stream processing (R1 and SR1) | X | X | X (via DataStream API) | X | X | X | - | X |
| Stream processing type | Native | Micro-batching | Native | Native | Native | Native | n/a | Native |
| Batch processing | X (via stream replaying) | X | X (via DataSet API) | - | - | - | X | - |
| X = supported, (X) = partly supported, - = not supported, n/a = not applicable, ? = no information available | ||||||||
| SPP = sensor processing pipeline, LWDF = LightWeight DataFlow, KNIME = KoNstanz Information MinEr | ||||||||
| Execution model described as directed acyclic graph | X | X | X | X | - (implicit via QueryProcessor) | X | X | X |
| Explicit execution model | X (pipeline) | - | - | X (via TopologyBuilder) | - (implicit via QueryProcessor) | X (blocks, pipelines, farms) | X (pipeline) | X (stream processing graph) |
| Dedicated data processing | X (source, sink) | X (input, output) | X (source, sink) | X (spouts) | X | X (input, output) | X (reader, writer) | X (source, sink) |
| Data queuing to decouple data generation and processing | X (SPSC FIFO) | X | X | X | ? | X (SPSC FIFO) | X | ? |
| Query language support (SR2) | - | (X) | X (via Table API) | X | X (SQuAl) | - | (X) (SQL) | X (DeviceSQL) |
| Query planning | n/a | X | ? | ? | X | n/a | - | ? |
| Processing components | X (nodes, stateful/stateless) | X | X (nodes, stateful/stateless) | X (bolts, stateless) | X (boxes, stateless) | X (worker, stateful/stateless) | X (nodes, stateless) | X (innernodes, stateless) |
| Datastream transformations (R4) | map, filter, partition, reduce, aggregate, windowing, split (stream), join (window) | map, filter, transform, repartition, reduce, union (stream), windowing, split (stream), join (stream) | map, filter, partition, reduce, fold, aggregate, windowing, union (stream), split (stream), join (window, interval), physical partitioning | map, filter, repartition, reduce, fold, aggregate, windowing, branch, union (stream), split (stream) | map, filter, union, bsort, aggregate, join (window), resample | map, filter, reduce, stencil, farm, split, join, ... | map, filter, reduce, fold, aggregate, branch, union, split | filter, windowing |
| Gap detection/interpolation (R5) | X/X | X/X | X/X | X/X | X/X | X/X | X/X | -/- |
| Serialization required between processing nodes | - | - | - | - | n/a | - | X (conversion to data table) | - |
| State management | stateful/stateless | stateless | stateful/stateless | stateless/stateful (via Trident) | stateless | stateful/stateless | stateful | stateless |
| Stream meta-information | X | X | X | X | ? | - | n/a | ? |
| Windowing support | X (sliding, tumbling) | X (sliding, tumbling) | X (sliding, tumbling, global 2) | X (sliding, tumbling) | X (sliding, tumbling) | X (sliding, tumbling) | - | X (sliding, tumbling) |
| Runtime windowing adaption (R10) | X (based on other streams, i.e., correlating) | - | - | - | - | - | n/a | - |
| Window trigger | time (flexible), count (flexible), correlating | time (fixed), count (flexible) | time (fixed), count (flexible), session | time (fixed), count (fixed), session | time (fixed), count (flexible), session | time (fixed), count (flexible) | n/a | time, count |
| Explicit window trigger nodes (EWTN) | X | - | X | (X) (via TriggerPolicy) | X (distinct box types) | - | n/a | - |
| EWTN for different streams (R11) 3 | X | - | - | - | - | - | n/a | - |
| Flexible time base for windowing/processing (R7) | X (e.g., relative TSC, absolute RTC, IEEE 1588 ns precision) | - (absolute unix_timestamp or duration) | - (absolute unix_timestamp or duration, ms precision) | - (absolute duration, ms precision) | - | - | n/a | - |
| Allowed lateness 4 | - | - | X | - | - | - | n/a | - |
| Custom processing | X (custom nodes) | X (User-defined Functions UDF) | X (functions, operators) | X (via Trident) | - (8 operators) | ? | X (repository with over 1500 nodes) | ? |
| Parameterizable pipeline templates (R2) | X | - | - | - | - | X | X | X |
| Contextual data (SR5) | X (Key-value) | (X) (Streaming Context) | X (Tables) | X (Trident) | ? | ? | X | - |
| TSC = Time Stamp Counter, RTC = Real Time Clock | ||||||||
1 Stream processing is under development; 2 Assigns data frames with the same ID to the same window and must be used together with a window trigger; 3 Allows one window trigger to be used by multiple window generators for different streams; 4 Data belonging to a window is not dropped if it arrives after the window has been generated.
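Footnote 3 and the "correlating" window trigger (R10, R11) describe one trigger that cuts windows for several streams at once. The sketch below shows one way such a shared trigger could work, assuming the trigger is derived from rising edges of a reference signal (for example, a machine cycle signal). All names, the threshold-based edge detection, and the half-open window boundaries are hypothetical and not taken from the SPP.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (timestamp, value); hypothetical layout


def rising_edges(reference: List[Sample], threshold: float) -> List[float]:
    """Timestamps at which the reference signal crosses the threshold upwards."""
    edges = []
    for (_, v0), (t1, v1) in zip(reference, reference[1:]):
        if v0 < threshold <= v1:
            edges.append(t1)
    return edges


def cut_windows(stream: List[Sample],
                triggers: List[float]) -> List[List[Sample]]:
    """Split a time-ordered stream into windows [trigger_i, trigger_i+1)."""
    times = [t for t, _ in stream]
    windows = []
    for start, end in zip(triggers, triggers[1:]):
        lo = bisect_left(times, start)
        hi = bisect_left(times, end)
        windows.append(stream[lo:hi])
    return windows


def window_group(streams: Dict[str, List[Sample]],
                 triggers: List[float]) -> Dict[str, List[List[Sample]]]:
    """Apply the same trigger timestamps to every stream in a group."""
    return {name: cut_windows(s, triggers) for name, s in streams.items()}


cycle = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0), (5.0, 1.0)]
triggers = rising_edges(cycle, threshold=0.5)
grouped = window_group(
    {
        "pressure": [(0.5, 10.0), (1.5, 11.0), (2.5, 12.0), (3.5, 13.0), (4.5, 14.0)],
        "position": [(1.0, 1.0), (3.0, 2.0), (5.0, 3.0)],
    },
    triggers,
)
```

Because the window boundaries follow the reference signal, the window length changes with the machine speed instead of being fixed in time or sample count, and all streams in the group are cut at identical boundaries.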
Non-Functional Requirements.
| Requirement | SPP | Apache Spark | Apache Flink | Apache Storm | Borealis | FastFlow | KNIME | Schweppe et al. |
|---|---|---|---|---|---|---|---|---|
| Robustness (SR4) | X | X | X | X | X | X | X | ? |
| Low latency (SR8) | X (when not applying compute-intensive algorithms in the asynchronous processing part) | no true streaming, not suitable for low latency | X | X | ? | X | - | X |
| Delivery guarantees | exactly once | exactly once | exactly once, at least once 1 | at least once | at least once | exactly once | n/a | exactly once |
| Error correction (SR3) | X (detection and correction/ interpolation of missing frames) | (X) (manual implementation) | X | X (detection and correction/ interpolation of missing frames) | X | (X) (manual implementation) | - | ? |
| Extensibility | Data sources, processing, data sinks | Data sources, processing, data sinks | Data sources, processing, data sinks | Data sources, processing, data sinks | ? | X | X | ? |
| Availability (SR6) 2 | - | X (distributed setup) | X (distributed setup) | X (distributed setup) | X (distributed setup) | - | - | - |
| Scalability (SR7) | - (vertically) | X (horizontally) | X (horizontally) | X (horizontally) | X (horizontally) | - (vertically) | X (horizontally via Hadoop) | - (vertically) |
1 Configurable: dataflows with parallel streaming operations, e.g., map, flatMap, filter, are "exactly once," even in "at least once" mode; 2 Operational system monitoring, measurement, and management.