Tianqi Lv, Xiaojuan Wang, Lei Jin, Yabo Xiao, Mei Song.
Abstract
Human activity recognition (HAR) is a popular and challenging research topic, driven by a variety of applications. With recent progress in deep learning networks for classification tasks, many researchers have applied such models to sensor-based HAR and achieved good performance. However, sensor-based HAR still faces challenges; in particular, recognising similar activities that differ only in their sequentiality, and classifying activities with large inter-personal variability. As a result, some human activities have large intra-class scatter and small inter-class separation. To deal with this problem, we introduce a margin mechanism to enhance the discriminative power of deep learning networks. We modified four kinds of common neural networks with our margin mechanism to test the effectiveness of our proposed method. The experimental results demonstrate that the margin-based models outperform the unmodified models on the OPPORTUNITY, UniMiB-SHAR, and PAMAP2 datasets. We also extend our research to the problem of open-set human activity recognition and evaluate the proposed method's performance in recognising new human activities.
Keywords: deep learning; human activity recognition; margin mechanism; open-set classification
Year: 2020 PMID: 32230986 PMCID: PMC7181274 DOI: 10.3390/s20071871
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Architecture of margin-based deep learning networks.
Figure 2. Framework for open-set human activity recognition.
Figure 3. Examples of recognising an activity in open-set human activity recognition.
Settings of the models used on the OPPORTUNITY, UniMiB-SHAR, and PAMAP2 datasets. 1 and 2 indicate the number of LSTM cells used in the OPPORTUNITY and PAMAP2 datasets, respectively.
| | OPPORTUNITY | | UniMiB-SHAR | |
|---|---|---|---|---|
| Model | Parameter | Value | Parameter | Value |
| MLP | Neurons in fully-connected layers 1, 2, and 3 | 2000 | Neurons in fully-connected layers 1, 2, and 3 | 6000 |
| CNN | Convolutional kernel size for blocks 1, 2, and 3 | (11,1), (10,1), (6,1) | Convolutional kernel size for block 1 | (32,3) |
| | Convolutional sliding stride for blocks 1, 2, and 3 | (1,1), (1,1), (1,1) | Convolutional sliding stride for block 1 | (1,1) |
| | Convolutional kernels for blocks 1, 2, and 3 | 50, 40, 30 | Convolutional kernels for block 1 | 100 |
| | Pooling sizes for blocks 1, 2, and 3 | (2,1), (3,1), (1,1) | Pooling sizes for block 1 | (2,1) |
| | Neurons in fully-connected layer | 1000 | Neurons in fully-connected layer | 6000 |
| LSTM | LSTM cells in layers 1 and 2 | 64 | LSTM cells in layers 1 and 2 | 151, 151 |
| | Output dimensions of LSTM cells in layers 1 and 2 | 600, 600 | Output dimensions of LSTM cells in layers 1 and 2 | 1000, 1000 |
| | Neurons in fully-connected layer | 512 | Neurons in fully-connected layer | 6000 |
| Hybrid | Convolutional kernel size for block 1 | (11,1) | Convolutional kernel size for block 1 | (32,3) |
| | Convolutional sliding stride for block 1 | (1,1) | Convolutional sliding stride for block 1 | (1,1) |
| | Convolutional kernels for block 1 | 50 | Convolutional kernels for block 1 | 100 |
| | Pooling sizes for block 1 | (2,1) | Pooling sizes for block 1 | (2,1) |
| | LSTM cells in layers 1 and 2 | 27 | LSTM cells in layers 1 and 2 | 60, 60 |
| | Output dimensions of LSTM cells in layers 1 and 2 | 600, 600 | Output dimensions of LSTM cells in layers 1 and 2 | 1000, 1000 |
| | Neurons in fully-connected layer | 512 | Neurons in fully-connected layer | 6000 |
Classification performance results (in percent) of the various models under the OPPORTUNITY, UniMiB-SHAR, and PAMAP2 datasets. ’-M’ represents models utilising an arcmargin layer.
| | OPPORTUNITY | | | UniMiB-SHAR | | | PAMAP2 | | |
|---|---|---|---|---|---|---|---|---|---|
| Method | | | | | | | | | |
| HC | 89.96 | 89.53 | 63.76 | 32.01 | 22.85 | 13.78 | - | - | - |
| CBH | 89.66 | 88.99 | 62.27 | 75.21 | 74.13 | 60.01 | - | - | - |
| CBS | 90.22 | 89.88 | 67.50 | 77.03 | 75.93 | 63.23 | - | - | - |
| AE | 87.80 | 87.60 | 55.62 | 65.67 | 64.84 | 55.04 | - | - | - |
| MLP | 91.11 | 90.86 | 68.17 | 71.62 | 70.81 | 59.97 | 82.63 | 80.83 | 72.92 |
| CNN | 90.58 | 90.19 | 65.26 | 74.97 | 74.29 | 64.65 | 91.51 | 91.35 | 83.34 |
| LSTM | 91.29 | 91.16 | 69.71 | 71.47 | 70.82 | 59.32 | 84.00 | 82.71 | 74.00 |
| Hybrid | 91.76 | 91.56 | 70.86 | 74.63 | 73.65 | 64.47 | 85.12 | 83.73 | 76.10 |
| MLP-M | 91.28 | 91.03 | 68.09 | 73.94 | 73.55 | 61.59 | 82.47 | 82.09 | 74.43 |
| CNN-M | 90.88 | 90.47 | 66.85 | 74.86 | 74.42 | 63.30 | | | 92.95 |
| LSTM-M | 92.30 | 91.99 | 70.45 | 74.17 | 72.93 | 59.43 | 86.00 | 84.60 | 83.75 |
| Hybrid-M | 91.92 | 91.87 | 71.08 | | | | 93.52 | 93.52 | |
Figure 4. (a) The F1-score (in percent) of each class of different models on the PAMAP2 dataset. (b) Confusion matrix of each model on the PAMAP2 dataset. The horizontal and vertical axes represent the predicted and true classes, respectively. 1, lying; 2, sitting; 3, standing; 4, walking; 5, running; 6, cycling; 7, Nordic walking; 8, ascending stairs; 9, descending stairs; 10, vacuum cleaning; 11, ironing.
Classification performance results (in percent) of three machine learning classifiers on the OPPORTUNITY dataset. ’-DF’ and ’-DF-M’ mean that the features used to train the three classifiers were obtained from the LSTM model and the LSTM-M model, respectively.
| Method | | | |
|---|---|---|---|
| SVM | 89.96 | 89.53 | 63.76 |
| Random Forest | 89.21 | 87.08 | 52.45 |
| Naive Bayes | 44.79 | 52.61 | 32.81 |
| SVM-DF | 91.81 | 91.62 | 70.24 |
| Random Forest-DF | 91.84 | 91.63 | 70.24 |
| Naive Bayes-DF | 91.15 | 91.29 | 69.03 |
| SVM-DF-M | 91.88 | 91.62 | 70.43 |
| Random Forest-DF-M | 91.93 | 91.64 | 70.42 |
| Naive Bayes-DF-M | 91.68 | 91.62 | 70.08 |
| LSTM-M | 92.30 | 91.99 | 70.45 |
Classification performance results (in percent) using different sliding window lengths on the OPPORTUNITY dataset.
| | T = 32 | | | T = 64 | | | T = 96 | | |
|---|---|---|---|---|---|---|---|---|---|
| Method | | | | | | | | | |
| MLP | 90.79 | 90.40 | 66.33 | 91.11 | 90.86 | 68.17 | 90.94 | 90.65 | 66.37 |
| CNN | 90.34 | 89.71 | 62.10 | 90.58 | 90.19 | 65.26 | 90.38 | 89.98 | 63.38 |
| LSTM | 90.88 | 90.60 | 67.20 | 91.29 | 91.16 | 69.71 | 91.33 | 91.21 | 68.64 |
| Hybrid | 91.10 | 90.75 | 67.31 | 91.76 | 91.56 | 70.86 | 91.44 | 91.25 | 69.04 |
| MLP-M | 91.13 | 90.77 | 66.80 | 91.28 | 91.03 | 68.09 | 91.34 | 91.10 | 67.42 |
| CNN-M | 89.97 | 89.87 | 64.20 | 90.88 | 90.47 | 66.85 | 91.47 | 91.16 | 67.78 |
| LSTM-M | 91.34 | 91.10 | 68.52 | 92.30 | 91.99 | 70.45 | 92.02 | 91.93 | 71.72 |
| Hybrid-M | 92.06 | 91.77 | 71.36 | 91.92 | 91.87 | 71.08 | 92.45 | 92.22 | 71.03 |
Figure 5. Classification performance results (in percent) using different sliding window lengths on the OPPORTUNITY dataset.
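The window length T compared above (32, 64, or 96 time steps) controls how a continuous sensor recording is cut into fixed-length samples before classification. The following is a minimal sketch of such a segmentation, not the authors' preprocessing code: the function name `sliding_windows`, the `step` parameter, and the toy recording are assumptions made for illustration, and the source does not state the overlap between consecutive windows.

```python
def sliding_windows(samples, length, step):
    """Cut a list of per-time-step readings into fixed-length windows.

    `length` plays the role of T in the table above; `step` (the hop
    between window starts) is an assumed parameter not given in the source.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(samples) - length
    # Windows that would run past the end of the recording are dropped.
    return [samples[start:start + length]
            for start in range(0, last_start + 1, step)]


# Toy recording: 10 time steps with 2 hypothetical sensor channels each.
recording = [[t, t * 0.5] for t in range(10)]
windows = sliding_windows(recording, length=4, step=2)
print(len(windows), len(windows[0]))
```

A longer window gives each sample more temporal context but yields fewer samples from the same recording, which is one reason the best T can differ between models.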
The average F1-score (in percent) using different numbers of sensor channels on the OPPORTUNITY dataset.
| Method | 20 | 50 | 80 | 107 |
|---|---|---|---|---|
| MLP | 39.29 | 62.68 | 65.82 | 68.17 |
| CNN | 38.47 | 57.08 | 63.23 | 65.26 |
| LSTM | 41.89 | 62.23 | 67.36 | 69.71 |
| Hybrid | 46.11 | 63.70 | 68.79 | 70.86 |
| MLP-M | 40.16 | 63.09 | 66.87 | 68.09 |
| CNN-M | 40.33 | 60.07 | 65.47 | 66.85 |
| LSTM-M | 42.72 | 65.80 | 70.78 | 70.45 |
| Hybrid-M | | | | 71.08 |
Figure 6. The average F1-score (in percent) using different numbers of sensor channels on the OPPORTUNITY dataset.
Figure 7. The weighted F1-score (in percent) of the margin-based models for different margin values with the OPPORTUNITY dataset.
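The captions above refer to both an average and a weighted F1-score. As a hedged illustration of how the two differ, the sketch below computes per-class F1 and then averages it in two ways. It assumes that "average" means the unweighted mean over classes and "weighted" means the mean weighted by the number of true samples per class; the function name `f1_scores` and the toy labels are hypothetical.

```python
def f1_scores(y_true, y_pred):
    """Return (average F1, weighted F1) over all classes that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class, support = {}, {}
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class[c] = 2 * tp / denom if denom else 0.0
        support[c] = tp + fn  # number of true samples of class c
    average = sum(per_class.values()) / len(labels)
    total = sum(support.values())
    weighted = sum(per_class[c] * support[c] for c in labels) / total
    return average, weighted


# Toy, imbalanced example: class 0 dominates and is recognised well.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 1]
macro, weighted = f1_scores(y_true, y_pred)
print(round(macro, 4), round(weighted, 4))
```

When one class dominates and is easy to recognise, the weighted score can sit well above the average score, so the two should be read together.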
Figure 8. Heatmaps of cosine similarity of all classes using the CNN model on the PAMAP2 testing dataset: (a) softmax loss; and (b) additive angular margin loss.
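To make the contrast between the two losses in this caption concrete, here is a minimal sketch of additive angular margin logits in plain Python. It follows the commonly used formulation in which features and class weights are L2-normalised and a margin is added to the angle of the target class before scaling; it is not taken from the authors' code. The names `arcmargin_logits` and `cross_entropy`, and the `margin` and `scale` values, are assumptions for illustration.

```python
import math


def _unit(v):
    """L2-normalise a vector."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcmargin_logits(feature, class_weights, target, margin=0.5, scale=30.0):
    """Scaled cosine logits with an angular margin added to the target class.

    With margin=0 this reduces to scaled cosine similarity. Practical
    implementations also guard the case angle + margin > pi; this sketch
    omits that for brevity.
    """
    f = _unit(feature)
    logits = []
    for j, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(f, _unit(w)))
        cos = max(-1.0, min(1.0, cos))  # clamp against rounding error
        if j == target:
            cos = math.cos(math.acos(cos) + margin)
        logits.append(scale * cos)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


# Hypothetical 2D feature and two class weight vectors.
W = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 0.2]
loss_plain = cross_entropy(arcmargin_logits(x, W, target=0, margin=0.0), 0)
loss_margin = cross_entropy(arcmargin_logits(x, W, target=0, margin=0.5), 0)
print(loss_plain < loss_margin)
```

Because the margin lowers only the target-class logit, the same feature incurs a larger loss, which pushes training towards features that lie closer in angle to their own class weight and further from the others.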
Figure 9. Testing softmax and arcmargin losses on the PAMAP2 validation dataset with 2D features. In this experiment, we used a CNN model to learn 2D features on the validation set of the PAMAP2 dataset. To realise this, we set the output dimension of the first fully-connected layer to 2. The first and second rows are the features in Euclidean space and angular space, respectively. Dots of different colours represent the features of different classes.
Figure 10. The F1-score (in percent) of each class in the PAMAP2 dataset evaluated by the CNN-M model for open-set human activity recognition. The horizontal axis represents the number of chosen seeds. The two lines represent the center and cluster methods. The marked point on each line indicates the highest F1-score.
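The caption above names a "center" method that depends on a number of chosen seeds, but this record does not describe it in detail. The sketch below shows one plausible reading, stated as an assumption: each activity is represented by the normalised mean of a few seed feature vectors, and a new feature is assigned to the most similar center by cosine similarity, or rejected as unknown below a threshold. The names `class_centers` and `predict`, the `threshold` value, and the 2D toy features are all hypothetical; only the activity names come from the PAMAP2 class list above.

```python
import math


def _unit(v):
    """L2-normalise a vector; a zero vector has no direction."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in v]


def class_centers(seeds):
    """Map each label to the normalised mean of its normalised seed features."""
    centers = {}
    for label, feats in seeds.items():
        units = [_unit(f) for f in feats]
        dim = len(units[0])
        mean = [sum(u[i] for u in units) / len(units) for i in range(dim)]
        centers[label] = _unit(mean)
    return centers


def predict(feature, centers, threshold=0.8):
    """Return the label of the closest center, or "unknown" if too dissimilar."""
    f = _unit(feature)
    best_label, best_sim = "unknown", -2.0
    for label, center in centers.items():
        sim = sum(a * b for a, b in zip(f, center))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label if best_sim >= threshold else "unknown"


# Two seeds per activity; the feature values are made up for illustration.
seeds = {
    "cycling": [[1.0, 0.1], [0.9, 0.0]],
    "ironing": [[0.0, 1.0], [0.1, 0.9]],
}
centers = class_centers(seeds)
print(predict([0.95, 0.05], centers), predict([-1.0, -1.0], centers))
```

Under this reading, adding more seeds makes each center a more stable estimate of its class direction, which is consistent with plotting F1-score against the number of seeds.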