Amjad Mehmood1,2, Nabil Alrajeh3, Mithun Mukherjee4, Salwani Abdullah5, Houbing Song6.
Abstract
Although wireless sensor networks (WSNs) have been a focus of research for the past two decades, fault diagnosis in these networks has received little attention. Fault diagnosis is an essential requirement for wireless networks, especially WSNs, because of their ad-hoc nature, deployment requirements and resource limitations. Therefore, in this paper we survey fault diagnosis from the perspective of network operations. To the best of our knowledge, this is the first survey from such a perspective. We survey the proactive, active and passive fault diagnosis schemes that have appeared in the literature to date, highlighting the advantages and limitations of each scheme. In addition to illuminating the details of past efforts, this survey also reveals new research challenges and strengthens our understanding of the field of fault diagnosis.
Keywords: active; fault diagnosis; network operation; passive; proactive; wireless sensor networks
Year: 2018 PMID: 29865210 PMCID: PMC6021939 DOI: 10.3390/s18061787
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Fault diagnosis process.
Figure 2. Fault diagnosis protocols.
Figure 3. Fault diagnosis algorithm.
Figure 4. Types of failure in WSNs.
Figure 5. Fault detection process.
Figure 6. Fault detection process.
Figure 7. Fault detection flowchart.
Figure 8. Fault detection approaches: (a) proactive; (b) active and (c) passive.
Figure 9. Passive approach: Process of inference model following [80].
Overview of Fault Diagnosis Techniques.
| Author(s) | Year | Technique Used | Short Description |
|---|---|---|---|
| Proactive Techniques | | | |
| Ping et al. [ | 2003 | Delay measurement time synchronisation (DMTS) | Flexible, lightweight and applicable on both single- and multi-hop-based networks. Takes only one clock click to synchronise nodes available in single hops and uses |
| F. Felemban et al. [ | 2005 | Probabilistic method | Suggested for improving QoS, timeliness and reliability in WSNs. |
| Elhadef et al. [ | 2006 | Probabilistic fault model | Uses several important distributions, such as Bernoulli failure distribution, gamma failure distribution and exponential failure distribution, to determine the local and global performance of the proposed scheme. Performance evaluation shows that it determines fault-free nodes successfully, even when the percentage of fault-free nodes is less than 50%. |
| You et al. [ | 2011 | Probabilistic fault model and probabilistic analysis | Suggests modeling of diagnostic algorithm operating in a cluster of nodes using probabilistic analysis of the local and global performance. |
| Shouling et al. [ | 2011 | Multi-path scheduling algorithm | Introduces the capacity of continuous data collection in dual radio multi-channel. |
| Cheng et al. [ | 2012 | Probabilistic diagnostic algorithm | Analyses negative binomial failure distribution under fault clustering. Tested against wafers. A simple structure is given as a test case to determine the status of each die. Efficient because it performs tests on all dies in parallel and, hence, saves a lot of time in finding the die through probe testing. |
| Mouradian et al. [ | 2013 | Probabilistic nature of radio link | Achieves reliability by using the probabilistic nature of the radio link. Suggests a theoretical framework based on a reference model for two types of routing schemes: unicast-based and broadcast-based. |
| Banerjee et al. [ | 2014 | Vector-based fault detection model | Identifies sensor circuit faults using a vector-based fault detection model. The protocol performs better in terms of lifespan, network coverage and energy utilisation. |
| Zhao et al. [ | 2014 | Abstracted scan method | Helps users to be informed regarding the resources and applications running on the sensor nodes and helps manage sensor node activities accordingly. Additionally, performs in-network aggregation to form abstracted scans of the nodes. Specifically, it proposes the development of a residual energy scan to determine the remaining energy distribution in the network. |
| Mahapatro et al. [ | 2014 | UCR and two-state Markov model | CDFD performs online fault diagnosis by using the spatial correlation in a two-state Markov model for the good approximation of slow and fast fading, and integrates it with an unequal cluster-based routing (UCR) protocol. Without considering wireless channel impairments, it identifies both soft and hard faults. Additionally, does not impose any traffic overhead and the diagnostic messages are conveyed using routine network traffic. |
| Gupta et al. [ | 2014 | Top- | Focuses on the following two challenges. First, it introduces a two-index structure, such as a topology index and a graph-based maximum meta-path weight index, which are both calculated offline. Second, it suggests novel top- |
| Hayes et al. [ | 2015 | Single-hop or blind forwarding | This algorithm, PHASeR, uses robust and dynamic data routing towards the sink in mobile environments. Uses single-hop count metric or blind forwarding method to send the messages through a multi-path in the network. PHASeR is analysed mathematically on average packet delivery, throughput and packet delivery ratio. It is then evaluated against mobility, scalability and traffic loads. Recommended for a wide variety of emerging applications. |
| Chanak et al. | 2016 | Undirected graph | The main objective of the proposed technique is to overcome a network failure condition in a WSN in an energy-efficient manner and relay data packets from the source nodes to the BS with minimum time delay. Network conditions can affect the QoS. If the network tolerates these failures during the data routing stage, the QoS of the WSN can be maintained. |
| Active Techniques | | | |
| Tolle et al. [ | 2004 | Nucleus management systems | Nucleus management system (NMS) infrastructure exports debugging and monitoring information. |
| Ramanathan et al. [ | 2005 | Debugging tool called Sympathy | Collects information about link quality or neighbour-level connectivity from sensor nodes at runtime. |
| Kim et al. [ | 2011 | Mint-route technique | Suggests debugging operations to be performed at the sink node. |
| Ruan et al. [ | 2011 | Sympathy-based approach | Performs fault detection and debugging based on the Sympathy approach. In addition, it enhances a system’s transparency and visibility. |
| Liu et al. [ | 2013 | Event-based routing structure | Carried out on an event-based routing structure in GreenOrb. GreenOrb consists of 330 nodes and is deployed to monitor forests. Adapts to the wild environment smoothly and is an excellent platform for observing large-scale sensor networks. |
| Jiang et al. [ | 2015 | Random walk approach | Actively gathers information based on the random walk approach. It also uses a compressive sensing approach to deal with the single random walk approach. |
| Gao et al. [ | 2015 | Real-time monitoring and fault tolerance | Performs real-time monitoring, diagnosis and fault tolerance. Has the potential to become an emerging research direction for real-time fault-tolerance control and applications. |
| Geest et al. [ | 2015 | Analytical models | Suggests a simple fault detector based on the difference of synchronous detection of the machine and inverter neutral voltage, which is suitable for hardware implementation and can function as an independent observer of a drive system. |
| Wang et al. [ | 2015 | Lyapunov function | Suggests a robust, adaptive fault-tolerance consensus protocol for multi-agent systems to address unknown nonlinear dynamics and unexpected actuator faults. Although it does not directly depend on the diagnosis of the faults, it does depend on the compensation of its ultimate impact; such an impact has been reflected in part of the lumped uncertainties in the system. |
| Passive Techniques | | | |
| Chen et al. [ | 2000 | Model-based technique | Based on increasing demand of dynamic systems, which insists that the systems be made more reliable and safe. Focuses on the subject of fault detection and isolation requiring more attention to become an established field of research in control engineering. Provides comprehensive material on model-based fault detection isolation (FDI). |
| Nie et al. [ | 2012 | Inference model | Determines the root cause of failures by finding the relationship between the sensed data and the failures that occurred in the network without adding any additional traffic overhead. Saves them the need for a knowledge library to take the decision instead of focusing on collecting diagnosis metrics that impose heavy traffic overhead on the network. |
| Seydou et al. [ | 2013 | Model-based techniques | Based on a fault detection model for a particular class of nonlinear systems, called flat systems, to address an original solution for a flat system in actuator fault diagnosis. |
| Liu et al. [ | 2013 | Probabilistic inference model | Does not incur additional traffic overhead for the collection of desired information. Uses a probabilistic inference model for online diagnosis of an operational WSN, which encodes dependencies existing among different network elements. |
| Zhang et al. [ | 2015 | Classification framework | Suggests a fault detection framework from the perspective of energy efficiency subject to facilitating the fault detection methods and the evaluation of their energy efficiency. A classification of fault detection approaches is provided using the same framework, which is based on several characteristics, such as energy efficiency, correlation method, evolution method and detection accuracy. |
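
The two-state Markov model listed for Mahapatro et al. can be made concrete with a small sketch. The code below is not the CDFD scheme itself. It assumes a generic good/bad channel whose transition probabilities `p_gb` (good to bad) and `p_bg` (bad to good) are hypothetical parameters chosen for illustration, and it shows how the long-run probability of being in the bad (deep-fade) state follows from them.

```python
def stationary_bad(p_gb, p_bg):
    """Long-run probability of the bad state in a two-state Markov chain.

    p_gb: assumed probability of moving from good to bad in one step.
    p_bg: assumed probability of moving from bad to good in one step.
    """
    return p_gb / (p_gb + p_bg)


def bad_prob_after(steps, p_gb, p_bg, start_bad=0.0):
    """Probability of being in the bad state after a number of steps.

    Iterates the one-step update of the chain, starting from the
    probability start_bad of being in the bad state.
    """
    prob_bad = start_bad
    for _ in range(steps):
        # Stay bad with probability (1 - p_bg), or enter bad from good with p_gb.
        prob_bad = prob_bad * (1.0 - p_bg) + (1.0 - prob_bad) * p_gb
    return prob_bad


if __name__ == "__main__":
    # Hypothetical channel: 10% chance of entering a fade, 30% chance of leaving it.
    print(round(stationary_bad(0.1, 0.3), 3))
    print(round(bad_prob_after(50, 0.1, 0.3), 3))
```

With slow fading, `p_bg` is small and bad periods last longer on average (the mean bad-burst length is `1 / p_bg`), which is one reason a diagnosis scheme may want to separate channel impairments from genuine node faults.
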
Brief overview of important fault diagnosis terms.
| Term | Definition |
|---|---|
| Deterministic model | Uses numbers as inputs and numbers as outputs, and assumes that its outcome is certain if the inputs of the model are fixed. The output of the model is fully determined by the parameter values and the initial conditions. It provides the same result for the same input. |
| Non-deterministic model | Possesses some inherent randomness. The same set of parameter values and initial conditions will lead to an ensemble of different outputs. Also called black-box modeling. In contrast to a deterministic model, it can exhibit different results for the same input. |
| Stochastic model | According to probability theory, in this model the values of the parameters, measurements, expected inputs and disturbances are unpredictable because of a random variable. Thus, it can be classified as a non-deterministic model because of its random nature. This model is more informative than a deterministic model due to uncertainty in varying behavioral characteristics. |
| Probabilistic model | Incorporates random variables and probability distributions into the model of an event or phenomenon and observes the system and gathers its statistics before performing any action. It estimates on the basis of historical data. |
| Hybrid model | Mixes aspects of two or more models; that is, some parameters of the deterministic model are randomly defined according to experimental observations. |
| Offline debugging | Normally preferred in in-situ network diagnosis and carried out when a failure has occurred due to sensor behavior and not strictly controlled network scale. |
| Online debugging | Starts dealing with the failure at runtime by rapid verification closure with capability to execute the design back and forward. |
| Active diagnosis | Injects some queries or probes into the network and determines or infers the quality of the network’s performance through measurement parameters. |
| Passive diagnosis | Does not collect special data for fault diagnosis in the network [ |
| Online fault diagnosis | Finds faults during system runtime. |
| Offline fault diagnosis | Collects data of the system states so that it can later perform fault analysis. Also called |
| Fault Diagnosis | Consists of detection, isolation, identification and recovery [ |
| Fault | An unusual change in, or deviation from, one or more characteristics of a system’s standard, acceptable or usual conditions [ |
| Failure | An omission in occurrence, performance, performing duty, expected actions, achieving goals or achieving objectives as prescribed. |
| Error | A system state that may cause a subsequent failure: a failure occurs when an error reaches the service interface and alters the service. A fault is the adjudged or hypothesised cause of an error. |
| Incipient fault | Method for early detection of soft faults. |
| Fault Identification | Detects whether the node is faulty or fault-free, such as fault recovery protocol. |
| Fault Recovery | After failure detection, the recovery process is started to efficiently recover from a failure. |
| Fault Isolation | Removing the faulty nodes from the network after identification and verification. |
| Local diagnostic view | This diagnostic is created by the combination of nodes in a network. |
| Global diagnostic view | This diagnostic is created by the sink |
| Spatial correlation | To achieve satisfactory coverage, spatially dense sensor deployment is preferred in WSNs. Consequently, many nodes record the same information about an event. Thus, spatial correlation increases with the degree of density or decreasing inter-node separation. |
| Temporal correlation | Degree of correlation between two consecutive measurements that may vary due to the temporal variation in features of the phenomenon. Its computation fluctuates with respect to time of the data, such as time series. |
| Spatio-temporal correlation | A combination of both spatial and temporal features that brings significant advantages to the design of energy-efficient communication protocols for WSNs. |
| Static fault diagnosis | In static models, the diagnosis problem is formulated as one of maximizing the posterior probability of component states given the observed fail or pass outcomes of tests. It works under supervised learning mechanisms such as ANN. |
| Dynamic fault diagnosis | In dynamic models, the states of components evolve as independent models, and at each time epoch only some of the test outcomes are observed. Given the observed test outcomes at different time epochs, the goal is to determine the most likely evolution of the states of components over time. It operates under unsupervised learning techniques such as the Independent Markov Chain Model. |
| Hardware redundancy | Consists of replication of computers, sensors, actuators and other components, and is used to achieve the mechanism of FDI. |
| Analytical redundancy | Also known as functional redundancy; based on mathematical model of the system being monitored. |
| Model-based fault diagnosis | It detects soft faults, such as digital controller and digital filter faults, as well as hardware faults, such as defective construction, actuator faults, sensor faults, abnormal parameters and external obstacles (collision, clogging). It performs the following three important tasks: (i) fault detection; (ii) fault isolation; and (iii) fault identification or analysis. |
| Robust fault detection | Capable of predicting soft, small or early faults in a system’s components before being caught by a human operator or automation system. |
| Online detection | Real-time detection [ |
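
The definition of static fault diagnosis in the table, maximizing the posterior probability of component states given the observed pass or fail outcomes of tests, can be sketched as a brute-force search. This is an illustration under stated assumptions, not a method taken from the surveyed papers: the function name `map_diagnosis`, the independent prior fault probabilities, and the test model (a detection probability `p_detect` and a false-alarm probability `p_false_alarm`) are all hypothetical choices.

```python
from itertools import product


def map_diagnosis(priors, coverage, outcomes, p_detect=0.9, p_false_alarm=0.05):
    """Return the most probable fault state given test outcomes.

    priors:   assumed prior fault probability of each component.
    coverage: for each test, the indices of the components it exercises.
    outcomes: for each test, True if it failed and False if it passed.
    A test that covers a faulty component fails with probability p_detect;
    otherwise it fails with probability p_false_alarm (both assumptions).
    """
    best_state, best_score = None, -1.0
    # Enumerate every combination of healthy (0) and faulty (1) components.
    for state in product((0, 1), repeat=len(priors)):
        score = 1.0
        # Prior probability of this state, assuming independent components.
        for faulty, prior in zip(state, priors):
            score *= prior if faulty else 1.0 - prior
        # Likelihood of the observed outcomes under this state.
        for covered, failed in zip(coverage, outcomes):
            hit = any(state[i] for i in covered)
            p_fail = p_detect if hit else p_false_alarm
            score *= p_fail if failed else 1.0 - p_fail
        if score > best_score:
            best_state, best_score = state, score
    return best_state


if __name__ == "__main__":
    priors = [0.1, 0.1, 0.1]
    coverage = [(0, 1), (1, 2), (2,)]
    # Tests 0 and 1 fail and test 2 passes, which points at component 1.
    print(map_diagnosis(priors, coverage, [True, True, False]))  # (0, 1, 0)
```

The score is proportional to the posterior, so normalization is unnecessary for picking the maximum. The search is exponential in the number of components, so it is only practical for small examples.
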