Fatemeh Serpush, Mohammad Bagher Menhaj, Behrooz Masoumi, Babak Karasfi.
Abstract
Human activity recognition (HAR) has attracted growing interest in recent years because of rising demand in many areas. Applications of HAR include healthcare systems that monitor activities of daily living (ADL) (primarily because of the rapidly growing elderly population), security environments that automatically recognize abnormal activities and notify the relevant authorities, and improved human-computer interaction. HAR research can be classified according to the data acquisition tools (sensors or cameras), the methods (handcrafted or deep learning), and the complexity of the activity. In the healthcare system, HAR based on wearable sensors is a new technology that consists of three essential parts worth examining: the location of the wearable sensor, data preprocessing (feature calculation, extraction, and selection), and the recognition methods. This survey examines all aspects of HAR based on wearable sensors, analyzing the applications, challenges, datasets, approaches, and components. It also provides coherent categorizations, purposeful comparisons, and a systematic architecture. The paper then evaluates the approaches qualitatively against the criteria considered in this system and provides a comprehensive review of the HAR system. Therefore, this survey is more extensive and coherent than recent surveys in this field.
Year: 2022 PMID: 35251142 PMCID: PMC8894054 DOI: 10.1155/2022/1391906
Source DB: PubMed Journal: Comput Intell Neurosci
Abbreviations and symbols.
| Abbreviations and symbols | Description | Abbreviations and symbols | Description |
|---|---|---|---|
| 3D | Three dimensions | ANN | Artificial neural network |
| ABW | Activity-based windowing | BSS | Blind source separation |
| ADL | Activities of daily living | CRF | Conditional random field |
| AFE | Analogue front end | DLC | Deep learning-based classification |
| CCA | Canonical correlation analysis | DLS | Deep learning-based semisupervised model |
| CFS | | DLF | Deep learning-based features |
| CNN | Convolutional neural network | DBN | Dynamic Bayesian network |
| CPD | Change point detection | EM | Expectation-maximization |
| CSS | Contact switch sensors | FA | Factor analysis |
| DBN | Deep belief network | FP | The number of false positives |
| DFT | Discrete Fourier transform | FN | The number of false negatives |
| DL | Deep learning | GMM | Gaussian mixture model |
| DT | Decision tree | ICA | Independent component analysis |
| HAR | Human activity recognition | LS | Least squares |
| HARS | Human activity recognition system | NB | Naïve Bayes |
| HMM | Hidden Markov model | RF | Random forest |
| IMU | Gyroscope, accelerometers, and magnetic sensors | RBF | Radial basis function |
| KNN | K-nearest neighbor | RBM | Restricted Boltzmann machine |
| LDA | Linear discriminant analysis | SBHAR | Smartphone-based HAR |
| L-SSW | Last-state sensor windowing | TCM | Time complexity in modeling |
| LSTM | Long short-term memory | TCR | Time complexity in recognition |
| MEMS | Microelectromechanical systems | | The ratio of class |
| Mhealth | Mobile health | F | Forget gate |
| NN | Neural network | | Input, output, and forget gates considered in time |
| PCA | Principal component analysis | h (all) | Hidden values |
| PI | Passive infrared | Recall_i | Sample ratio of class |
| PN | Number of participants | K | Kernel function |
| PWM | Pulse width modulation | N | The total number of all samples |
| QDA | Quadratic discriminant analysis | Precision_i | The ratio of an instance of class |
| REALDISP | REAListic sensor DISPlacement | | Bias vectors |
| RFID | Radio frequency identification | | Cell output at the previous time stage |
| RNN | Recurrent neural network | | Weight matrices |
| STEW | Sensor dependency extension windowing | | The state of memory at time t |
| SDW | Sensor-dependent windowing | O | Output gate |
| SEW | Sensor event-based windowing | I | Input gate |
| SHCS | Smart healthcare system | C | Cell activation vectors |
| SVM | Support vector machine | | The number of samples in |
| TBW | Time-based windowing | | Input to the memory cell layer at time |
| TP | The number of true positives | All | Non-linear functions |
| TSW | Time slice-based windowing | | |
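The table above lists TP, FP, and FN alongside per-class precision and recall. As a minimal sketch of how those counts relate, the following computes precision and recall for each activity class from two label lists. The function name, the activity labels, and the sample predictions are illustrative assumptions and do not come from the survey.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} computed from TP, FP, and FN counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]  # samples assigned to this class
        actual = tp[label] + fn[label]     # samples that belong to this class
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical ground truth and classifier output for five windows.
y_true = ["walk", "walk", "sit", "sit", "lie"]
y_pred = ["walk", "sit", "sit", "sit", "lie"]
scores = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in sorted(scores.items()):
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Classes that never occur in the predictions (or in the ground truth) get a score of 0.0 here instead of raising a division error; that convention is a choice of this sketch.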
Comparison of significant recent research work (surveys).
| ID | References | Architecture for HAR | Challenges of classification of HAR | Approaches of HAR | Quality evaluation of approaches | Dataset analysis | Sensor system | Sensor types | Application classification | Number of tables | Number of figures | HARS component classification | Analysis of every component with table | Discussion |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | [ | No | No | No | No | No | Yes | Yes | No | 2 | 9 | No | No | No |
| 2 | [ | Yes | No | Yes | No | No | No | Yes | Yes | 10 | 3 | Yes | Some components | No |
| 3 | [ | Yes | Yes | Yes | No | Yes | No | Yes | Yes | 5 | 2 | No | No | Yes |
| 4 | [ | No | No | Yes | No | No | No | Yes | No | 6 | 1 | No | Yes | Yes |
| 5 | [ | Yes | Yes | Yes | No | Yes | No | No | Yes | 11 | 6 | Yes | Yes | No |
| 6 | [ | No | Yes | Yes | No | Yes | No | No | No | 4 | 2 | No | Some components | Yes |
| 7 | [ | No | No | Yes | No | Yes | Yes | Yes | Yes | 13 | 6 | No | Some components | Yes |
| 8 | | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | 7 | 13 | Yes | Yes | Yes |
Figure 1. Categorization for HAR applications.
Figure 2. HAR in SHCS for monitoring the elderly.
Figure 3. Proposed classification to recognize human activity in the SHCS.
Figure 4. Categorization of data acquisition tools.
Figure 5. HARS' main component categorization.
Figure 6. Different common locations for wearable sensors [22, 23, 43].
Figure 7. Diagram of a wireless wearable sensor block [10].
Figure 8. Sample of raw data of the wearable sensor.
Figure 9. Overview of a smart shirt and ring sensor [10].
Details of the most popular sensor-based activity recognition datasets.
| Dataset | PN | Channel number | Sensor type | Number of sensors | Frequency | Activities |
|---|---|---|---|---|---|---|
| | 4 | 113 | Commercial RS485-networked XSense inertial measurement units (IMUs) | 5 | 30 Hz | |
| | | | Commercial InertiaCube3 inertial sensors | 2 | | |
| | | | Bluetooth acceleration sensors | 12 | | |
| | | | 3D-accelerometer | 1 | | |
| | | | 3D-gyroscope | 1 | | |
| | | | 3D magnetic | 1 | | |
| | — | — | Galaxy smartphone: three-axial linear acceleration and three-axial angular velocity | 2 | 50 Hz | 12 daily activities: three static activities (standing, sitting, and lying), three dynamic activities (walking, going upstairs, and going downstairs), and six transitions between the static activities (standing-sitting, sitting-standing, sitting-lying, lying-sitting, standing-lying, lying-standing) |
| | 23 | — | Each sensor includes a three-axis accelerometer and a three-axis gyroscope | 3 | — | Walking, sitting, standing, and so on |
| | 9 | — | Colibri wireless inertial measurement units (IMUs) | 3 | 100 Hz | 16 activities |
| | | | Accelerometer | 1 | | |
| | | | Heart rate monitor | 1 | | |
| | | | Gyroscope | 1 | | |
| | | | Magnetic | 1 | | |
| | 30 | — | Smartphone: gyroscope | 1 | 50 Hz | Three static activities (standing, sitting, and lying down) and three dynamic activities (walking, walking upstairs, and walking downstairs) |
| | | | Smartphone: accelerometer | 2 | | |
| | 10 | — | Accelerometer | 1 | 50 Hz | Standing still, sitting and relaxing, lying down, walking, climbing stairs, bending waist forward, front arm elevation, knee bending, cycling, jogging, running, and jumping front and back |
| | | | Gyroscope | 1 | | |
| | | | Magnetic | 1 | | |
| | 29 | — | Mobile phone: accelerometer | 1 | 20 Hz | Sitting, jogging, standing, upstairs, downstairs, and walking |
| | 17 | — | 9 IMUs: 3D-accelerometer | 1 | 40 Hz | 33 fitness activities |
| | | | 9 IMUs: 3D-gyroscope | 1 | | |
| | | | 9 IMUs: 3D-magnetometer | 1 | | |
| | | | 9 IMUs: 4D-quaternion | 1 | | |
| | 57 | — | Smartphone: 3D-accelerometer | 1 | 20 Hz | Nine different types of ADLs (standing, walking, jogging, jumping, stairs up, stairs down, sit chair, car step in, car step out) and four different types of falls (forward-lying, front-knees-lying, sideward-lying, and back-sitting-chair) |
| | | | Smartphone: 3D-gyroscope | 1 | | |
| | | | Smartphone: 3D-orientation sensors | 1 | | |
Figure 10. Technologies for sending data.
Figure 11. TSW and SEW methods in the preprocessing stage [33].
Analysis of various windowing methods for HAR.
| Methods | Idea | Advantage | Disadvantage |
|---|---|---|---|
| | The data stream of events is divided into windows at activity change detection points. | (i) Suitable for labeling data. | (i) May fail to recognize activities correctly. |
| | Event data streams are divided into windows with fixed time intervals. | (i) The simplicity of implementation. | (i) Choosing the right window length. |
| | The data are split into windows with the same number of sensor events, so the time length varies from window to window. | (i) This approach offers computational benefits over ABW. | (i) There may be a significant time interval between an event and the previous event. |
| | The mutual information of the two sensors described earlier depends on the order in which a pair of sensors occurs in the entire data stream. | (i) Uses multiple sensors to increase detection accuracy. | (i) The possibility of losing some dependence between the sensors. |
| | In a window specified by the Ai event sensor, a sensor can be activated several times. | (i) Sometimes, the latest sensor status, according to ei, can be more descriptive than the frequency at which it occurs in a window. | (i) There may be a significant time gap between an event and previous events. |
Figure 12. Data stream segmentation (methods) [27].
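Two of the segmentation ideas described in the windowing table, fixed time intervals and a fixed number of sensor events per window, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the `(timestamp, sensor_id)` event layout, and the sample events are hypothetical and are not taken from the survey.

```python
def time_based_windows(events, window_seconds):
    """Group (timestamp, sensor_id) events into fixed-duration windows.

    Windows that contain no events are skipped rather than returned empty.
    """
    if not events:
        return []
    events = sorted(events)
    start = events[0][0]
    windows = {}
    for timestamp, sensor in events:
        index = int((timestamp - start) // window_seconds)
        windows.setdefault(index, []).append((timestamp, sensor))
    return [windows[i] for i in sorted(windows)]


def event_based_windows(events, events_per_window):
    """Group events into windows holding the same number of sensor events."""
    events = sorted(events)
    return [
        events[i:i + events_per_window]
        for i in range(0, len(events), events_per_window)
    ]


# Hypothetical event stream: timestamps in seconds, made-up sensor IDs.
events = [(0.0, "M1"), (1.5, "M2"), (2.0, "M1"), (9.0, "D1"), (9.5, "M3")]

tbw = time_based_windows(events, window_seconds=5.0)
sew = event_based_windows(events, events_per_window=2)

print([len(w) for w in tbw])
print([w[-1][0] - w[0][0] for w in sew])  # time span covered by each window
```

In this sample, the second event-based window spans seven seconds because it joins two events that are far apart in time, which is the drawback the table attributes to fixed-count windows. The time-based version instead leaves the choice of `window_seconds` to the user.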
Analysis of some well-known feature extraction methods of HARS.
| Method | Idea | Advantage | Disadvantage |
|---|---|---|---|
| | A linear method that converts the main features (generally interdependent) into new features that are not interdependent; the result depends on the data's scale. | (i) Returns the main features to a low-dimensional space. | (i) The principal components are not always easy to interpret. |
| | Features are extracted through a linear transformation that finds the linear combination of variables that best represents the data. | (i) Minimizes changes within the class relative to principal component analysis. | (i) Relies on a complex model containing the correct number of components. |
| | This method finds independent components such that the main features are expressed as a linear combination of components. | (i) Solution to solve the problem of blind source separation. | (i) Suitable for non-Gaussian data. |
| | The main features can be grouped according to their correlation. | (i) The features of each group are strongly correlated. | (i) Investigates the factors and finds the most effective ones. |
| | The salient features of the raw sensor data can be extracted automatically, without relying on handcrafted features. | (i) Ability to automatically learn from unauthorized and, in some cases, unlabeled raw sensor data. | (i) Searches for optimal solutions. |
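The first row of the table describes a linear method that maps interdependent features onto new, uncorrelated ones and mentions principal components. As a rough sketch of that idea, the following estimates the first principal component of a small feature matrix with power iteration on the sample covariance matrix. Power iteration is a choice made for this sketch (it keeps the example dependency-free) and is not claimed to be what the surveyed works use; the function names and sample data are likewise assumptions.

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (means, unit vector) for the direction of greatest variance."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # all samples identical: no variance to explain
            break
        vec = [x / norm for x in nxt]
    return means, vec


def project(samples, means, component):
    """Project each sample onto the component, giving one feature per sample."""
    return [
        sum((row[j] - means[j]) * component[j] for j in range(len(component)))
        for row in samples
    ]


# Hypothetical two-feature data in which the second feature is twice the first.
samples = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
means, component = first_principal_component(samples)
reduced = project(samples, means, component)
print(component, reduced)
```

Because the two features in the sample are perfectly correlated, a single component captures all of the variance, so the two-dimensional input reduces to one feature per sample without loss.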
Analysis of proposed methods for classification and activity recognition based on wearable sensors.
| Approach | Disadvantage | Advantage | Idea | Method |
|---|---|---|---|---|
| | (i) High calculation time in assigning a new instance to the class. | (i) Relatively high classification accuracy. | Classification uses the principle of similarity between the training set and new examples. A new instance is assigned to the respective class by a majority vote of its closest neighbors. | |
| | (i) Has two classes. | (i) Linear separation in the specified space. | This method uses statistical learning theory to maximize the margin between the separator and the data. | |
| | (i) The direct impact of the selected feature set on accuracy. | (i) They have an excellent computational performance. | The DT uses static features of time series data and focuses on the sliding window. | |
| | (i) Needs a lot of labeled data to achieve good performance. | (i) Improves the performance of the DT. | Random forests contain a combination of decision trees and are based on the majority vote of each tree's different decisions. | |
| | (i) Challenges in the data collection phase. | (i) Noise injection is provided to improve activity detection models. | To create general recognition models for e-health, a small main dataset is used, and the area covered by the dataset is expanded using noise. | |
| | (i) Lack of details about the seemingly desirable parameters. | (i) Provides high-level abstraction models in the data. | Deep learning has emerged as a learning model branch, creating a deep multilayered architecture for automated feature design. | |
| | (i) Saves only one step before. | (i) Compatible with variable-length input. | Includes non-linear units with internal states that can learn dynamic temporal behavior from a continuous input with arbitrary length. | |
| | (i) High complexity of the model. | (i) Traceable learning. | At each step, the memory's content from the first layer contains differentiating information that describes the person's movement and past changes in their activity. Over time, cells learn to output, overwrite, or ignore their internal memory based on the current input and past state history, resulting in a system capable of storing information for hundreds of steps. | |
| | (i) A difficult balance between learning rate and learning accuracy. | (i) Better performance than the perceptron. | As with the convection method, a set of matrix surface samples is first generated. Then, the average of the samples' signals in each matrix is used as the DBN input. | |
| | (i) Processing units in the CNN need to be used. Length of temporal dimensions. | (i) Learned features have more power. | It is based on a deep architecture and contains at least one temporal convolutional layer, one pooling layer, and one fully connected layer before a classifier. | |
| | (i) Convergence is not guaranteed in many cases. Dependence on the initial evaluation of the EM algorithm. | (i) Suitable for detecting most activities. | A probabilistic method, generally used in unsupervised classification, that uses a weighted sum of Gaussian component densities. | |
| | (i) Poor performance in cluster overlap. | (i) Reduces the size of the total variance distortion within the cluster as a cost function. | A well-known unsupervised classification method for clustering. | |
| | (i) Poor performance in cluster overlap. | (i) A dynamic method. | A Markov chain expresses a discrete-time random process involving a finite number of states, in which the current state depends on the previous one. In the case of HAR, each activity is represented by a state. | |
| | (i) It is difficult to analyze because it is a wrapper algorithm. | (i) Limited cost for labeling. | It is a wrapper algorithm that frequently uses a supervised learning method. A supervised classifier is first trained with a small amount of labeled data. | |
| | (i) The need for data samples that should be described by two subsets which are sufficient and redundant. | (i) An excellent approach to using unlabeled data to improve learning efficiency. | This method follows the process of repeated self-training. Simultaneously, the goal is to improve by strengthening the training process with one more source of information. | |
| | (i) There is not always an identifiable composite distribution that can help build the generative model. | (i) Detects missing data for the classification problem. | The core of the generative model for semisupervised learning is large amounts of unlabeled data to identify composite components. Then, unlabeled data for each class are sufficient to determine the compositional distribution fully. | |
| | (i) High learning cost. | (i) Can control unlabeled and labeled data points. | Most well-known deep learning methods, such as CNN and LSTM, conceived the generative and discriminator models. It is not surprising to know that they can learn directly from unlabeled data. | |
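The first row of the table describes assigning a new instance to a class by a majority vote of its closest neighbors. The sketch below shows that rule with Euclidean distance. The function name, the two-dimensional feature vectors, and the activity labels are hypothetical examples, and real systems would normally rely on an established library implementation.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Every prediction scans the whole training set, which is the
    # "high calculation time" drawback noted in the table.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical per-window features (for example, mean and variance of
# acceleration) with made-up values.
train_features = [(0.1, 0.0), (0.2, 0.1), (1.0, 1.1), (1.2, 0.9), (0.9, 1.0)]
train_labels = ["sit", "sit", "walk", "walk", "walk"]

prediction_a = knn_predict(train_features, train_labels, (1.0, 1.0), k=3)
prediction_b = knn_predict(train_features, train_labels, (0.0, 0.0), k=3)
print(prediction_a, prediction_b)
```

When two classes tie in the vote, `Counter.most_common` returns the one counted first, which here is the class of the nearer neighbor; choosing an odd `k` for two-class problems avoids the tie altogether.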
Figure 13. CNN layers for HAR [76].
Qualitative comparison of macro-HAR methods with some of the mentioned criteria.
| Methods | Accuracy | TCM | TCR | Generalization |
|---|---|---|---|---|
| Informed-SVM | Medium | Low | Low | High |
| DT | Low | Medium | Medium | Low |
| QDA | High | Low | Low | Low |
| KNN | High | – | High | High |
| RF | Medium | High | High | Medium |
| | | | | |
| K-means | Low | – | Very High | High |
| | | | | |
| DBN | High | High | Medium | High |
| HMM | Medium | Medium | Medium | Medium |
| GMM | Low | High | Medium | Medium |
| | | | | |
| Uninformed-SVM | High | High | Low | High |
| RNN | High | High | Medium | Medium |
| LSTM | Very High | Very High | High | High |
| CNN | High | High | Medium | High |