Qin Ni1, Zhuo Fan1, Lei Zhang2, Chris D Nugent3, Ian Cleland3, Yuping Zhang1, Nan Zhou1.
Abstract
Activity recognition has received considerable attention in many research fields, such as industry and healthcare. However, most existing work on activity recognition focuses on static and dynamic activities, while transitional activities, such as stand-to-sit and sit-to-stand, are more difficult to recognize than either, yet may be important in real applications. This paper therefore proposes a novel framework that recognizes static, dynamic, and transitional activities using stacked denoising autoencoders (SDAE), a deep learning model that extracts features automatically rather than relying on the manual feature engineering of conventional machine learning methods. Moreover, a resampling technique (random oversampling) is used to mitigate the class-imbalance problem caused by the relatively short duration of transitional activities. An experimental protocol was designed to collect twelve daily activities (of three types) with wearable sensors from 10 adults in the smart lab of Ulster University. The experimental results show strong performance on transitional activity recognition and an overall accuracy of 94.88% across the three activity types. Comparisons with other methods, together with results on three further public datasets, verify the feasibility and superiority of the framework. The paper also explores the effect of multiple sensors (accelerometer and gyroscope) to determine the optimal combination for activity recognition.
Keywords: activity recognition; resampling technique; stacked denoising autoencoders; transitional activities; wearable sensors
Year: 2020 PMID: 32911780 PMCID: PMC7570862 DOI: 10.3390/s20185114
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
References applying deep learning methods to activity recognition (Acc = accelerometer, Gyr = gyroscope, Mag = magnetometer, Bar = barometer).
| Reference | Method | Sensor | Activity Classes | Accuracy |
|---|---|---|---|---|
| Gu et al. | SDAE | Acc + Gyr + Mag + Bar | stilling, running, walking, upstairs, downstairs, upElevator, downElevator, false motion | 94.34% |
| Charissa et al. | CNN + tFFT | Acc + Gyr | walking, sitting, upstairs, downstairs, standing, laying | 95.75% |
| Song-Mi et al. | 1D-CNN | Acc | run, walk, still | 92.71% |
| Mario | CNN | Acc | walking, sitting, jumping, lying, climbing_up, standing, running, climbing_down | 94% |
| Masaya Inoue et al. | RNN | Acc | standing, sitting, downstairs, laying, walking, upstairs | 95.42% |
| Yao et al. | CNN + RNN | Acc + Gyr | standing, climbStair-down, biking, walking, sitting, climbStair-up | 94.20% |
| Yu et al. | LSTM | Acc + Gyr | walking, upstairs, standing, sitting, downstairs, laying down | 93.79% |
| Zhang et al. | DBN | Acc | walking, running, standing, sitting, upstairs, downstairs, lying | 98.60% |
Figure 1. The overall framework of stacked denoising autoencoder (SDAE)-based activity recognition.
Descriptions of the twelve human daily activities.
| Class | Activity | Description |
|---|---|---|
| Stationary | standing | The subject stands still for 5 min |
| Stationary | sleeping | The subject sleeps on the sofa for 5 min; small movements, such as changing the lying posture, are allowed |
| Stationary | watching TV | The subject watches TV for 5 min while sitting on the sofa in a comfortable position; changing the sitting posture is allowed |
| Dynamic | walking | The subject walks on a treadmill at constant speed for 5 min |
| Dynamic | running | The subject runs on a treadmill for 5 min |
| Dynamic | sweeping | The subject sweeps the room with a vacuum cleaner for 5 min |
| Transitional | stand-to-sit | Standing for 15 s, then sitting down on the sofa; repeated 15 times |
| Transitional | sit-to-stand | Sitting on the sofa for 10 s, then standing up; repeated 15 times |
| Transitional | stand-to-walk | Standing for 15 s, then walking for 15 s; repeated 15 times |
| Transitional | walk-to-stand | Walking for 15 s, then standing for 15 s; repeated 15 times |
| Transitional | sit-to-lie | Sitting on the sofa for 15 s, then lying down; repeated 15 times |
| Transitional | lie-to-sit | Lying on the sofa, then sitting up; repeated 15 times |
Recognition performance for each activity class without resampling.
| Activity | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| standing | 97.03 | 94.72 | 97.03 | 95.86 |
| sleeping | 97.32 | 98.37 | 97.32 | 97.84 |
| watching TV | 96.82 | 92.54 | 96.82 | 94.63 |
| walking | 86.75 | 85.71 | 86.75 | 86.25 |
| running | 95.73 | 91.81 | 95.73 | 93.73 |
| sweeping | 88.92 | 82.52 | 88.92 | 85.60 |
| stand-to-sit | 62.16 | 63.01 | 62.16 | 62.59 |
| sit-to-stand | 51.19 | 70.49 | 51.19 | 59.31 |
| stand-to-walk | 35.14 | 39.39 | 35.14 | 37.14 |
| walk-to-stand | 26.51 | 51.16 | 26.51 | 34.92 |
| lie-to-sit | 73.33 | 68.75 | 73.33 | 70.97 |
| sit-to-lie | 58.73 | 58.73 | 58.73 | 58.73 |
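In these tables the per-class accuracy equals the per-class recall (correct predictions for a class divided by its true instances), which is why the two columns match row for row. As a minimal sketch of how such per-class metrics are derived from a confusion matrix (the 3-class matrix below is hypothetical, not data from the paper):

```python
# Per-class precision, recall, and F1 from a confusion matrix.
# Rows = true class, columns = predicted class.
def per_class_metrics(cm):
    n = len(cm)
    metrics = {}
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # instances of c missed
        fp = sum(cm[r][c] for r in range(n)) - tp  # others predicted as c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # == per-class accuracy
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = (precision, recall, f1)
    return metrics

# Hypothetical 3-class confusion matrix for illustration.
cm = [[50, 3, 2],
      [4, 45, 1],
      [2, 5, 43]]
for c, (p, r, f1) in per_class_metrics(cm).items():
    print(f"class {c}: P={p:.4f} R={r:.4f} F1={f1:.4f}")
```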
The number of raw-data samples and the number of instances after segmentation and after applying each resampling technique.
| Activity | Initial Number | After Segmentation | Undersampling | SMOTE | Oversampling |
|---|---|---|---|---|---|
| standing | 307,061 | 1198 | 238 | 1200 | 1200 |
| sleeping | 307,109 | 1200 | 238 | 1200 | 1200 |
| watching TV | 306,228 | 1196 | 238 | 1200 | 1200 |
| walking | 300,457 | 1174 | 238 | 1200 | 1200 |
| running | 294,676 | 1151 | 238 | 1200 | 1200 |
| sweeping | 302,052 | 1179 | 238 | 1200 | 1200 |
| stand-to-sit | 61,173 | 239 | 238 | 1200 | 1200 |
| sit-to-stand | 61,035 | 238 | 238 | 1200 | 1200 |
| stand-to-walk | 61,881 | 242 | 238 | 1200 | 1200 |
| walk-to-stand | 61,640 | 242 | 238 | 1200 | 1200 |
| lie-to-sit | 61,454 | 240 | 238 | 1200 | 1200 |
| sit-to-lie | 62,089 | 242 | 238 | 1200 | 1200 |
Figure 2. Accuracy of each activity class with random oversampling, SMOTE, and no resampling (A0 = standing, A1 = sleeping, A2 = watching TV, A3 = walking, A4 = running, A5 = sweeping, A6 = stand-to-sit, A7 = sit-to-stand, A8 = stand-to-walk, A9 = walk-to-stand, A10 = lie-to-sit, A11 = sit-to-lie).
Classification performance for each activity using the random oversampling technique.
| Activity | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| standing | 96.75 | 95.61 | 96.75 | 95.73 |
| sleeping | 96.74 | 98.79 | 96.74 | 97.75 |
| watching TV | 95.77 | 98.37 | 95.77 | 96.85 |
| walking | 87.34 | 89.46 | 87.34 | 88.39 |
| running | 93.70 | 97.61 | 93.70 | 95.62 |
| sweeping | 84.81 | 89.97 | 84.81 | 87.31 |
| stand-to-sit | 98.92 | 95.80 | 98.92 | 97.34 |
| sit-to-stand | 95.53 | 96.07 | 95.53 | 95.80 |
| stand-to-walk | 95.92 | 93.39 | 95.92 | 94.64 |
| walk-to-stand | 97.53 | 93.92 | 97.53 | 95.69 |
| lie-to-sit | 97.34 | 96.32 | 97.34 | 96.83 |
| sit-to-lie | 98.31 | 93.57 | 98.31 | 95.88 |
Figure 3. The confusion matrix obtained by applying random oversampling.
Figure 4. The recognition performance for each activity with different numbers of iterations (pretraining learning rate is set to ; number of hidden layers is set to 2; fine-tuning learning rate is set to 0.01).
Figure 5. The recognition performance for each activity with different pretraining learning rates (iterations set to 200; number of hidden layers set to 2; fine-tuning learning rate set to 0.01).
Figure 6. The recognition performance for each activity with different fine-tuning learning rates (iterations set to 200; number of hidden layers set to 2; pretraining learning rate set to ).
Figure 7. The recognition performance for each activity with different numbers of hidden layers (iterations set to 200; pretraining learning rate set to ; fine-tuning learning rate set to 0.01).
Optimal hyperparameters for the neural network.
| Hyperparameter | Value |
|---|---|
| number of hidden layers | 2 |
| number of units per layer | 500 |
| pretraining learning rate | |
| fine-tuning learning rate | 0.01 |
| iteration | 200 |
| denoising factor | 0.5 |
| data segment size (in seconds) | 5 |
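The denoising factor of 0.5 means that during pretraining each input window is corrupted with masking noise: half of the input dimensions are randomly zeroed, and the autoencoder is trained to reconstruct the clean input from the corrupted copy. A minimal sketch of just this corruption step (pure Python; the full SDAE would stack such pretrained layers on top of it):

```python
import random

def mask_corrupt(x, denoising_factor=0.5, seed=0):
    """Masking noise for denoising-autoencoder pretraining: zero each
    input dimension independently with probability `denoising_factor`."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < denoising_factor else v for v in x]

window = [0.1 * i for i in range(10)]  # a dummy 10-dimensional input
corrupted = mask_corrupt(window)       # reconstruction target stays `window`
print(corrupted)
```

Training the autoencoder to map `corrupted` back to `window` forces the hidden layer to learn features that are robust to missing inputs, which is the motivation for using a DAE rather than a plain autoencoder.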
Figure 8. The recognition performance for each activity when adopting different sensors.
Performance comparison of adopting different sensors.
| Sensors | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| Acc | 93.36 | 93.35 | 93.36 | 93.27 |
| Gyro | 75.76 | 74.77 | 75.76 | 72.60 |
| Acc+Gyro | 94.88 | 94.88 | 94.88 | 94.86 |
Performance comparison between SDAE and other methods.
| Methods | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| SVM | 90.95 | 90.81 | 90.95 | 90.60 |
| DT | 88.15 | 87.53 | 88.15 | 87.54 |
| KNN | 84.84 | 84.38 | 84.84 | 84.29 |
| CNN | 81.33 | 79.85 | 81.33 | 80.27 |
| LSTM | 81.63 | 83.56 | 81.63 | 81.62 |
| BiLSTM | 84.75 | 85.23 | 84.75 | 84.63 |
| SDAE | 94.88 | 94.88 | 94.88 | 94.86 |
Basic information of the three public datasets.
| Datasets | People | Classes | Sensors | Transitions |
|---|---|---|---|---|
| Smartphone | 30 | 6 | Acc+Gyro | No |
| Chest-mounted | 15 | 7 | Acc | No |
| UCI | 8 | 19 | Acc+Gyro+Mag | No |
Activity recognition performance of the SDAE model on the three public datasets.
| Datasets | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%) |
|---|---|---|---|---|
| Smartphone | 97.15 | 97.19 | 97.15 | 97.15 |
| Chest-mounted | 89.99 | 89.96 | 89.99 | 89.83 |
| UCI | 95.26 | 95.42 | 95.26 | 95.15 |