Kyandoghere Kyamakya1, Vahid Tavakkoli1, Simon McClatchie2, Maximilian Arbeiter2, Bart G Scholte van Mast2.
Abstract
Abnormality detection and prediction is currently a very active research topic. In this paper, we address it in the context of monitoring the activity of a human in bed. The paper presents a comprehensive formulation of a requirements engineering dossier for a system that monitors a "human in bed" for abnormal behavior detection and forecasting. Practical, real-world constraints and concerns were identified and taken into consideration in the requirements dossier. A comprehensive and holistic discussion of the anomaly concept was conducted and laid the ground for a realistic specification book of the anomaly detection system. Some systems-engineering issues, e.g., verification and validation, were also briefly addressed. A structured critical review of the relevant literature identified four major approaches of interest, which were evaluated from the perspective of the requirements dossier. The evaluation clearly showed that the approach integrating graph networks and advanced deep-learning schemes (Graph-DL) is the one capable of fully meeting the challenging requirements expressed in the real-world-aware specification book. Nevertheless, to meet immediate market needs, systems based on advanced statistical methods can, after a series of adaptations, already satisfy the important requirements related to, e.g., low cost, solid data security and a fully embedded and self-sufficient implementation. To conclude, some recommendations regarding system architecture and overall systems engineering were formulated.
Keywords: abnormal behavior detection and forecasting; activity monitoring of “humans in bed”; anomaly detection and prediction; behavior evolution identification and system adaptivity; comprehensive anomaly concept definition; comprehensive critical state-of-the-art review; early warning capability; explainability and interpretability of detected anomalies; real-world context-aware and practically realistic specification book; scenario analysis; selected use-cases; system architecture and system engineering related recommendations; uncertainty modeling; user perspectives of practical interest; verification and validation
Year: 2022 PMID: 36016040 PMCID: PMC9414192 DOI: 10.3390/s22166279
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Human in bed monitoring through signals generated by sensors placed under the bed. Under each of the four bed legs, two sensors are placed: one weight sensor and one motion sensor. Thus, eight sensor measurement values are generated continuously over time for further processing by the anomaly detection intelligent system [25].
Figure 2. There is a huge variety of possible static and/or dynamic activities of a human in bed, which are monitored through a sensor system such as the one presented in Figure 1. The intelligent system that processes the sensor data can detect, and possibly also predict, normal and abnormal activities. (Source of the different image parts: Freepik).
Figure 3. Presentation of the four major data-processing levels that were used in the comprehensive ontological framework for defining the general “anomaly” concept.
Overview of all anomaly patterns that may be observed/detected w.r.t. a given simple or global attribute of an entity belonging to any of the four highest levels of the data-processing hierarchy of Figure 3.
| Anomaly Pattern Name | Point vs. Burst vs. Interval Observation | Illustrative Graphic | Remarks and Other Useful Hints and/or Considerations |
|---|---|---|---|
| Outlier | Point observation | | It is generally a stochastically described irregularity. Indeed, probability densities are calculated for target parameters, and defined percentiles are declared as outliers and thereby as anomalies of a certain degree. |
| Collective anomaly | Burst observation | | Collective anomalies are data points that are considered anomalies when viewed with other data points against the rest of the data set. |
| Contextual anomaly | Point observation | | Contextual anomalies are data points that are considered abnormal when viewed against meta-information associated with the data points. |
| Missing signal/data anomaly | Time interval observation | | Here, most of the data are missing, and the time/frequency response is zero. |
| Minor data/ | Time interval observation | | Here, compared to the ground truth values, the observed amplitude is very small. |
| Multiple outlier pattern | Time interval observation | | Here, one or more outliers appear in the observed data. |
| Square pattern | Time interval observation | | Here, the time response oscillates within a limited range, such as a square signal. |
| Trend pattern | Time interval observation | | Here, the observed data/signal has an obvious non-stationary and monotonic trend. |
| Drift data pattern | Time interval observation | | Here, the observed data signal is non-stationary with a random drift. |
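
The "Outlier" row describes declaring defined percentiles of a target parameter as outliers. A minimal sketch of that idea follows, using empirical percentiles instead of a fitted probability density. The function name, the 1st/99th percentile band and the sample data are illustrative assumptions, not values taken from the paper.

```python
from statistics import quantiles


def percentile_outliers(values, lower_pct=1, upper_pct=99):
    """Return indices of samples outside the [lower_pct, upper_pct] percentile band.

    The band limits are hypothetical defaults; a use-case engineer would
    choose them per target parameter.
    """
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical weight-sensor readings (kg): a slow ramp plus one spike at the end.
readings = [70.0 + 0.01 * i for i in range(99)] + [120.0]
flagged = percentile_outliers(readings)
```

Note that a purely percentile-based rule always flags the extremes of the sample, whether or not they are physically implausible, which is one reason the table treats outliers as anomalies "of a certain degree" only.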
Figure 4. A possible and useful comprehensive structuring of the time dimension into regions and sub-regions. The use-case engineer shall fix meaningful durations/lengths of the different sub-regions.
Illustration of the time case (i.e., relative timing situation between detection and occurrence of the element to be assessed) perspective; see Figure 4.
| Time of the | Location (in the Time Dimension) of the Element to Be | Time Case Labelling | Remarks and Eventual Comments |
|---|---|---|---|
| NOW (see | Sub-region 1/N | Online detection | Here, the processing speed of the detection algorithm is critical. This depends on the computing infrastructure, namely embedded systems or processing in the cloud. |
| NOW (see | Sub-region 2/N | Near to online detection (Time case 2/b) | |
| NOW (see | Sub-region 3/N | Late online detection | |
| NOW (see | Sub-region 1/P | Closest posterior detection | This may be a new re-assessment of those past elements. This may be necessary in view of the time-varying system dynamics of the human under observation. It is theoretically possible that events that were perceived normal become later perceived abnormal after the system dynamics have evolved, and vice-versa. |
| NOW (see | Sub-region 2/P | Close posterior detection | |
| NOW (see | Sub-region 3/P | Late posterior detection | |
| NOW (see | Sub-region 1/F | Nearest anterior detection | Here, one is looking into the future. In case one can see anomalies in that future, one has then a case of early warning: from a “close” early warning up to a “far” early warning situation. |
| NOW (see | Sub-region 2/F | Close anterior detection | |
| NOW (see | Sub-region 3/F | Far anterior detection | |
Temporal view of the novelty patterns—some illustrative examples (non-exhaustive).
| Novelty Pattern Name and ID | Illustrative Graphic | Remarks |
|---|---|---|
| NOV 1: | | A sudden drift is characterized by an abrupt change in the underlying process (the one guiding the occurrence of anomalies over the time window observed). |
| NOV 2: | | A gradual drift happens over time, and observations from one or more processes may be observed with changing frequency. |
| NOV 3: | | The incremental drift also happens over time but also involves one or more processes. |
| NOV 4: | | The reoccurring concept/behavior describes processes that cease to exist at one point in time but reappear later in time. |
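
As a rough illustration of how a sudden drift (NOV 1) might be spotted, the sketch below compares the mean of the most recent window with the mean of the window before it. This is a generic sliding-window heuristic, not the paper's method. The function name, window length and threshold are assumptions made for the example, and the heuristic does not distinguish gradual, incremental or reoccurring patterns (NOV 2 to NOV 4).

```python
from statistics import fmean


def first_mean_shift(series, window=5, threshold=1.0):
    """Return the index of the newest sample at which the mean of the last
    `window` samples differs from the mean of the preceding `window` samples
    by more than `threshold`, or None if no such shift is found.
    """
    for t in range(2 * window, len(series) + 1):
        reference = fmean(series[t - 2 * window : t - window])
        current = fmean(series[t - window : t])
        if abs(current - reference) > threshold:
            return t - 1
    return None


# Hypothetical signal: a flat level followed by an abrupt jump at index 20.
signal = [0.0] * 20 + [5.0] * 20
shift_index = first_mean_shift(signal)
```

With these assumed settings, the shift is reported shortly after the jump rather than exactly at it, because several post-jump samples must enter the current window before the mean difference exceeds the threshold.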
Selected advanced characteristics of a robust and real-world mature abnormality detection system in the monitoring of humans in bed.
| Selected Advanced Characteristics and Their Naming | Importance for a Real-World Mature Monitoring System | Remarks |
|---|---|---|
| | Very high | Continual learning can be understood as learning a model for a large number of “tasks” sequentially without forgetting knowledge obtained from the preceding tasks, whereby the data of the old tasks are no longer available while training new ones. The learning relates to either the behavior of an individual person currently under monitoring or to the behavior of an individual person within the contextual background of the behaviors of several other persons (a group of persons) who have been previously monitored. |
| | Very high | After the novelty has been identified, the model adapts to the confirmed behavior change. This means essentially adjusting to change over time. |
| | Very high | Anomaly detection in the case of monitoring a human in bed requires a high level of trust in its results. A key to this trust is the ability to assess the uncertainty of the computed results appropriately. |
| | Very high | See Subtask E described in |
| | Very high | Since the USP are practically of high relevance, related reconfigurability of the intelligent system is needed. |
| | Very high | Since the OCP are practically of high relevance, related reconfigurability of the intelligent system is needed. |
| | Very high/MUST | To avoid the lack of interpretability, this characteristic is needed. |
| | Very high | To avoid the lack of interpretability, this characteristic is needed. |
| | Very high | To avoid the lack of interpretability, this characteristic is needed. |
| | Very high/MUST | To avoid the lack of interpretability, this characteristic is needed. |
| | Very high/MUST | The trained personnel operating the advanced intelligent monitoring system can, through an appropriate human–machine interface, confirm or refute some of the predictions/detections. Alternatively, self-learning triggered reinforcement learning can also be used. |
| | Very high/MUST | For most technical systems, the acceptable tolerance level w.r.t. key system parameters is very important and very sensitive from a practical point of view. Indeed, in real-world applications and practice, in general, a low tolerance margin may result in a significantly more expensive system. A larger tolerance thus yields a more attractive cost/benefit ratio. It is evident that the abnormality detection endeavor can therefore not ignore the tolerance level dimension. This is especially sensitive in view of two critical facts related to the human monitoring scenario: (a) the sensor data obtained from the Level 0 of the architecture shown in |
| | Very high/MUST | This is one practical requirement expressed by P.SYS. Essentially, data imperfections have several facets: (a) sensor-related imperfections such as low update rate, low accuracy, noise and/or bias in the data, signal-related faults, etc.; (b) sensor drifts and/or other nonlinear, possibly time-varying disturbing phenomena; (c) data-size-related imperfections (these may relate to the short effective duration of the data recordings (i.e., observations) and/or to the number of human samples involved, etc.); (d) involving only one or at most two sensor modes or types (e.g., solely piezo-electric sensors and vibration sensors). |
| | High | This requirement ensures that the intelligent system is low cost, has a relatively low power consumption, is real-time capable as required by the application scenario and can operate almost self-sufficiently without involving remote computing infrastructure(s) and/or data. |
| | Very high/MUST | This requirement additionally covers the fast detection of behavior change. It effectively complements REQ 2. |
A selected collection (for illustration only) of performance evaluation metrics (METs), which can be relevant for a comprehensive verification and validation of an anomaly detection-and-prediction system developed for the monitoring of a human in bed.
| Metric Name and ID | Metric Description | Remarks |
|---|---|---|
| MET 1: Accuracy | It is simply defined as the mean squared error (MSE) between the model’s predictions and the target values. | Although this metric is named “accuracy”, it is actually a measure of error, and a low value is desired. |
| MET 2: Self-sensitivity | For self-associative empirical models, a robust model does/shall produce small changes in all of its outputs for (in the face of) small errors in the (model) inputs. | The self-sensitivity is a measure of an empirical model’s ability to make correct anomaly predictions when the respective anomaly-related score value is incorrect due to some sort of uncertainty (or fault). |
| MET 3: Cross-sensitivity | Cross-sensitivity measures the effect a faulty (model) input has on the other (model) predictions. | |
| MET 4: Anomaly detectability | This metric helps to determine the smallest drift (in the relevant input data values of the detection system) that can be identified, i.e., the smallest process parameter change that can be detected. | |
| MET 5: Precision | The precision answers the question: “What proportion of identified anomalies are true anomalies?” | This is a classical metric |
| MET 6: Recall | The recall is used to answer the question: “What proportion of true anomalies was identified?” | This is a classical metric |
| MET 7: F1 Score | The F1 score identifies the overall performance of the anomaly detection model by combining both recall and precision, using the harmonic mean. | This is a classical metric |
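
The definitions of MET 1 and MET 5 to MET 7 translate directly into a few lines of code. The sketch below assumes binary anomaly labels (1 = anomaly, 0 = normal). The function names and the sample labels are illustrative assumptions, not part of the paper.

```python
def mse(predictions, targets):
    """MET 1: mean squared error between predictions and target values (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def precision_recall_f1(predicted, actual):
    """MET 5 to MET 7 for binary anomaly labels (1 = anomaly, 0 = normal)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # anomalies correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)    # normal samples flagged as anomalies
    fn = sum(1 for p, a in pairs if not p and a)    # anomalies that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical detector output versus ground truth for five time windows.
precision, recall, f1 = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
error = mse([1.0, 2.0], [1.0, 4.0])
```

Returning 0.0 when a denominator is zero is a convention chosen here for simplicity. Other conventions, such as reporting the value as undefined, are equally defensible.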
Discussion of the seven main reasons [48] that make graph-based approaches to anomaly detection vital and necessary.
| Reason Supporting the Use of Graph Networks | Explanation | How Far Is It Relevant for Our Target Context of Monitoring a Human in Bed |
|---|---|---|
| Strong inter-dependence between entities and data | Data objects are often related to each other and exhibit dependencies. In fact, most relational data can be thought of as inter-dependent, which necessitates accounting for related objects in finding anomalies. | Highly relevant |
| Powerful representation ability | Graphs naturally represent the inter-dependencies by the introduction of links (or edges) between the related objects. The multiple paths lying between these related objects effectively capture their long-range correlations. Moreover, a graph representation facilitates the representation of rich datasets enabling the incorporation of node and edge attributes/types. | Highly relevant |
| The relational nature of problem domains | The nature of anomalies could exhibit itself as relational. An illustration example can be given from the performance monitoring domain, where the failure of a machine could cause the malfunction of the machines dependent on it. Similarly, the failure of a machine could be a good indicator of the possible other failures of machines in close spatial proximity to it (e.g., due to an excessive increase in humidity in that particular region of a warehouse). | Highly relevant |
| Graphs are a robust machinery | One could argue that graphs serve as more adversarially robust tools. For example, in fraud detection systems, behavioral clues such as log-in times and locations (e.g., IP addresses) can be easily altered or faked by advanced fraudsters. On the other hand, it is reasonable to argue that fraudsters cannot have a global view of the entire network (e.g., money transfer, telecommunication, email, review network) in which they are operating. As such, it would be harder for a fraudster to fit into this network as well as possible without | Highly relevant |
| Dynamic graphs offer unique capabilities for anomaly detection | The anomaly detection in dynamic graphs can be based on the following situations: feature-based events, decomposition-based events, community or clustering-based events and window-based events. | Highly relevant (for example, see some of the OCPs) |
| Strong graph-based anomaly description capability | This is underscored by the following capabilities that are well documented in the relevant literature: (a) interpretation-friendly graph anomaly detection; (b) interactive graph querying and sense-making. | Highly relevant |
| Several proven application examples of graph-based anomaly detection in highly complex real-world applications | The following application examples can be found in the relevant literature: anomalies in telecom networks; anomalies in opinion networks; anomalies in auction networks; anomalies in the web network; anomalies in account networks; anomalies in social networks; anomalies in security networks; anomalies in computer networks; anomalies in financial networks. | Highly relevant |
Figure 5. Toy examples for illustration: a comparison of “Conventional Anomaly Detection” (see part (a)) and “Graph Anomaly Detection” (see part (b)). Apart from the anomalies shown in part (b) of the figure, graph anomaly detection also identifies graph-level anomalies.
A comprehensive overview of the four major approaches or paradigms for anomaly detection and/or forecasting, which are also applicable to the target context of “monitoring the physical activity of a human in bed”.
| Paradigm- | Core Quintessence of the Paradigm | Selected Representative Related Works Related to Anomaly Detection and/or Prediction |
|---|---|---|
| MAJA 1: | Good representatives of these methods are the so-called hidden Markov models (HMMs). HMMs are statistical models that capture hidden information from observable sequential symbols/values. In an HMM, the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to determine the hidden parameters from the observable parameters. HMMs are sequence models. Thus, given a sequence of inputs, an HMM computes a sequence of outputs of the same length. An HMM is a graph where nodes are probability distributions over labels, and edges give the probability of transitioning from one node to the other. Together, these can be used to compute the probability of a label sequence given the input sequence. By using an HMM, it is possible to predict future states based on the current observations as well as to infer the sequence of states from an observed sequence. For a process under observation over time, the possible states, which are hidden parameters, are generally “normal”, “abnormal” and “critical”. | Forkan et al. (2014) [ |
| MAJA 2: | Deep-learning (DL) concepts use complex neural networks for modeling time series and are thereby capable of detecting and/or predicting anomalies. DL models are very good at modeling the “temporal context” of a dynamically evolving system. | A sufficiently exhaustive inventory of schemes within this family of (DL-based) methods is provided in the survey by Kukjin Choi et al. (2021) [ |
| MAJA 3: | Data objects representing a scene like the one in | A sufficiently exhaustive inventory of schemes within this family of (graph-based) methods is provided in the survey by Leman Akoglu et al. (2015) [ |
| MAJA 4: | A very significant change in recent years is that graph anomaly detection (see MAJA 3) has evolved from relying heavily on the domain knowledge of human experts towards machine learning techniques that eliminate human intervention and, more recently, various deep learning technologies. These deep learning techniques are not only capable of identifying potential anomalies in graphs far more accurately than ever before, but they can also do so in real-time. Consequently, MAJA 4 can be viewed as a synergetic merging of MAJA 3 and MAJA 2. Moreover, certain bricks of MAJA 1 and of fuzzy logic may easily be integrated (e.g., Bayesian networks, certain HMM architectures, etc.). | A sufficiently exhaustive inventory of schemes within this family of (graph DL-based) methods is provided in the survey by Xiaoxiao Ma et al. (2021) [ |
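
To make the HMM description under MAJA 1 more concrete, the following sketch decodes the most likely sequence of hidden states ("normal", "abnormal", "critical") from a sequence of observed symbols with the Viterbi algorithm. All probabilities and the observation alphabet ("low", "medium", "high" motion) are hypothetical values chosen for illustration. They are not taken from the paper or from any cited work.

```python
STATES = ("normal", "abnormal", "critical")

# Hypothetical model parameters (assumptions for illustration only).
START = {"normal": 0.80, "abnormal": 0.15, "critical": 0.05}
TRANS = {
    "normal": {"normal": 0.90, "abnormal": 0.08, "critical": 0.02},
    "abnormal": {"normal": 0.20, "abnormal": 0.70, "critical": 0.10},
    "critical": {"normal": 0.05, "abnormal": 0.25, "critical": 0.70},
}
EMIT = {
    "normal": {"low": 0.80, "medium": 0.15, "high": 0.05},
    "abnormal": {"low": 0.20, "medium": 0.60, "high": 0.20},
    "critical": {"low": 0.05, "medium": 0.25, "high": 0.70},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for the observed symbols.

    Plain probabilities are used for readability; long sequences would need
    log-probabilities to avoid numerical underflow.
    """
    # best[s] = probability of the most likely path that ends in state s.
    best = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    backpointers = []
    for symbol in observations[1:]:
        pointers, new_best = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] * TRANS[p][s])
            pointers[s] = prev
            new_best[s] = best[prev] * TRANS[prev][s] * EMIT[s][symbol]
        backpointers.append(pointers)
        best = new_best
    # Trace back from the most likely final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


calm_night = viterbi(["low", "low", "low"])
agitated_night = viterbi(["high", "high", "high"])
```

In a real system the parameters would be estimated from recorded sensor data rather than set by hand, and the observation symbols would come from a discretization of the sensor signals.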
An in-depth comparative analysis of how far the four major approaches are capable of modeling and addressing the tough specifications expressed in the requirements dossier of the target context of “monitoring a human in bed”.
| Requirement ID and Description | Capability of MAJA 1 | Capability of MAJA 2 | Capability of MAJA 3 | Capability of MAJA 4 | Remarks |
|---|---|---|---|---|---|
| REQ 1: Self-learning and continual learning capability, for either single individuals or groups of individuals | Possible but relatively/ | Possible | Possible but relatively limited | Possible | All approaches can handle this requirement |
| REQ 2: Identification of and adaptivity to novelty/evolution | Possible but relatively/ | Possible [ | Possible but relatively/ | Possible (a) [ | All approaches can handle this requirement, whereby those involving DL perform better |
| REQ 3: Comprehensive uncertainty model/assessment for all subtasks (A–E) described in | Not possible | Not possible | Possible but relatively/ | Possible [ | Only graph-based approaches can handle this requirement |
| REQ 4: Prediction capability of the system status at levels 1 to 4 of | Possible but relatively/ | Possible | Possible but relatively/ | Possible [ | All approaches can handle this requirement, whereby those involving DL perform better |
| REQ 5: Reconfigurability w.r.t. USPs (user-specific perspectives) | Not possible | Not possible | Eventually | Possible [ | Only graph-based approaches can handle this requirement |
| REQ 6: Reconfigurability w.r.t. OCPs (observation context perspectives) | Not possible | Not possible | Eventually | Possible [ | Only graph-based approaches can handle this requirement |
| REQ 7: Explainability [ | Not possible | Eventually | Eventually | Possible [ | Only graph and/or DL-based approaches can handle this requirement |
| REQ 8: Explainability of the anomaly detection (considering USPs and OCPs) | Not possible | Not possible | Eventually | Possible | Only graph and/or DL-based approaches can handle this requirement |
| REQ 9: Explainability of the evolution detection | Not possible | Eventually | Eventually | Possible | Only graph and/or DL-based approaches can handle this requirement |
| REQ 10: Explainability of the uncertainty grade or confidence level for all subtasks (A–E) described in | Not possible | Eventually | Eventually | Possible | Only graph and/or DL-based approaches can handle this requirement |
| REQ 11: The possibility of a performance tuning/improvement through either partial human assistance (via some form of feedback) or evolutive/reinforced learning or involving artificially generated data out of some reliable “generative adversarial” process or a combination of some or all of the above. | Not possible | Eventually | Eventually | Possible [ | Only graph and/or DL-based approaches can handle this requirement |
| REQ 12: The reconfigurability w.r.t. the tolerance level/grade/ | Not possible | Eventually | Eventually | Possible | Only graph and/or DL-based approaches can handle this requirement |
| REQ 13: Tolerance to non-ideal training data | Possible (however, some adaptations may be necessary) | Possible (however, some adaptations (e.g., in the form of pre-processing layers or dataset augmentations) may be necessary) | Eventually possible but after adaptations | Surely possible | Almost all approaches can handle this requirement, although adaptations, which may be very substantial, may be necessary |
| REQ 14: The complete intelligent system shall be capable of running fully on COTS embedded platforms (i.e., the intelligent system shall be fully Embedded AI), which essentially have low computing power; this is to ensure both low cost and data security while satisfying use-case specific real-time processing deadlines. | Possible (however, some adaptations may be necessary) | Possible (however, some significant architecture and pipeline adaptations may be necessary) | Eventually possible | Eventually possible (however, some significant architecture and pipeline adaptations may be necessary) | For the “fully embedded” version of the intelligent system, the approaches involving DL are not superior; for this system version, MAJA 1 is potentially superior. However, the situation changes significantly for an intelligent system that can/does involve IoT and related infrastructure such as Cloud Computing |
| REQ 15: Short learning duration and/or fast detection of/and adaptation to behavior changes | Possible (however, some adaptations may be necessary) | Possible (however, some significant architecture and pipeline adaptations may be necessary) | Possible (however, some significant architecture and pipeline adaptations may be necessary) | Possible (however, some significant architecture and pipeline adaptations may be necessary). | The performance metric MET 4 (see |