Imran Ullah Khan, Sitara Afzal, Jong Weon Lee.
Abstract
In recent years, Human Activity Recognition (HAR) has become one of the most important research topics in the domains of health and human-machine interaction. Many artificial intelligence-based models have been developed for activity recognition; however, these algorithms fail to extract both spatial and temporal features, and as a result they perform poorly on real-world long-term HAR. Furthermore, only a limited number of datasets for physical activity recognition are publicly available, and those contain a small number of activities. Considering these limitations, we develop a hybrid model that combines a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM) for activity recognition, where the CNN is used for spatial feature extraction and the LSTM network is used for learning temporal information. Additionally, a new challenging dataset is generated, collected from 20 participants using the Kinect V2 sensor and containing 12 different classes of human physical activities. An extensive ablation study is performed over different traditional machine learning and deep learning models to obtain the optimum solution for HAR. The CNN-LSTM technique achieves an accuracy of 90.89%, which shows that the proposed model is suitable for HAR applications.
Keywords: convolutional neural network; deep learning; human activity recognition; long short-term memory; machine learning; skeleton data
Year: 2022 PMID: 35009865 PMCID: PMC8749555 DOI: 10.3390/s22010323
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The extracted skeleton of the human body while performing different activities.
Figure 2. Different skeleton joints of the human body extracted through sensors.
Dataset collection and activity details.
| Label | Activity Name | Participants | Time per Activity | Samples per Activity | Frames per Second |
|---|---|---|---|---|---|
| 1 | Overhead Arm Raise | 20 | 10 s | 200 | 30 |
| 2 | Front Arm Raise | 20 | 10 s | 200 | 30 |
| 3 | Arm Curl | 20 | 10 s | 200 | 30 |
| 4 | Chair Stand | 20 | 10 s | 200 | 30 |
| 5 | Balance Walk | 20 | 10 s | 200 | 30 |
| 6 | Side Leg Raise (Right, Left) | 20 | 10 s | 200 | 30 |
| 7 | Shoulder | 20 | 10 s | 200 | 30 |
| 8 | Chest | 20 | 10 s | 200 | 30 |
| 9 | Leg Raise (Forward, Backward) | 20 | 10 s | 200 | 30 |
| 10 | Arm Circle | 20 | 10 s | 200 | 30 |
| 11 | Side Twist (Right, Left) | 20 | 10 s | 200 | 30 |
| 12 | Squats | 20 | 10 s | 200 | 30 |
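
The table gives 10 s recordings at 30 frames per second, and the results tables report frame sequences of 30, 60, 90, 120, and 150 frames. The record does not say how recordings were cut into sequences, so the following is only a minimal sketch that assumes consecutive, non-overlapping windows with any incomplete tail dropped. The function name `split_into_windows` and the integer stand-in for skeleton frames are hypothetical.

```python
from typing import List, Sequence

FPS = 30          # frames per second, from the dataset table
DURATION_S = 10   # seconds per recording, from the dataset table


def split_into_windows(frames: Sequence, window: int) -> List[Sequence]:
    """Cut a recording into consecutive non-overlapping windows.

    Assumption (not stated in the source): windows do not overlap and
    an incomplete tail at the end of the recording is discarded.
    """
    n_full = len(frames) // window
    return [frames[i * window:(i + 1) * window] for i in range(n_full)]


# Stand-in for one recording: 300 frame indices instead of real skeleton data.
recording = list(range(FPS * DURATION_S))

for window in (30, 60, 90, 120, 150):
    chunks = split_into_windows(recording, window)
    leftover = len(recording) - len(chunks) * window
    print(f"window={window:3d}  sequences={len(chunks):2d}  unused frames={leftover}")
```

Under this assumption, window lengths that do not divide 300 evenly (90 and 120) leave some frames unused; an overlapping sliding window would be one alternative.
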
Figure 3. Activity recognition through different machine learning algorithms.
Figure 4. (a) The standard RNN unit; (b) the standard LSTM unit.
Figure 5. The overall framework of the proposed hybrid CNN-LSTM approach.
Parameter settings of the proposed model.
| Layer (Type) | Kernel Size | No. of Filters | No. of Param. |
|---|---|---|---|
| 1D CNN Layer 1 | 3 | 64 | 9,664 |
| 1D CNN Layer 2 | 3 | 128 | 24,704 |
| MaxPooling 1D | - | - | - |
| LSTM(64) | - | - | 49,408 |
| LSTM(64) | - | - | 33,024 |
| Flatten | - | - | - |
| Dense(12) | - | - | 780 |
| Total parameters | - | - | 117,580 |
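
The parameter counts in the table can be checked against the standard formulas for 1D convolution, LSTM, and dense layers. The sketch below is a plain-Python arithmetic check, not the authors' code. It rests on two assumptions that the record does not state: the input has 50 feature channels (inferred from 9,664 = (3 × 50 + 1) × 64), and the final dense layer receives 64 values (inferred from 780 = (64 + 1) × 12). The helper names are hypothetical.

```python
def conv1d_params(kernel_size: int, in_channels: int, filters: int) -> int:
    """Weights (kernel_size * in_channels per filter) plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters


def lstm_params(input_dim: int, units: int) -> int:
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim: int, units: int) -> int:
    """One weight per input plus one bias, for every output unit."""
    return (input_dim + 1) * units


IN_CHANNELS = 50   # assumption: inferred from the first layer's 9,664 parameters
DENSE_INPUT = 64   # assumption: inferred from the dense layer's 780 parameters

counts = {
    "1D CNN Layer 1": conv1d_params(3, IN_CHANNELS, 64),
    "1D CNN Layer 2": conv1d_params(3, 64, 128),
    "LSTM(64) #1": lstm_params(128, 64),
    "LSTM(64) #2": lstm_params(64, 64),
    "Dense(12)": dense_params(DENSE_INPUT, 12),
}
total = sum(counts.values())

for name, value in counts.items():
    print(f"{name:16s} {value:>8,d}")
print(f"{'Total':16s} {total:>8,d}")
```

Under these assumptions the first LSTM(64) layer comes to 49,408 parameters, which is the value that makes the individual layers sum to the stated total of 117,580.
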
Accuracy (%) of different machine learning classifiers for different frame sequence lengths.
| No. | Classifier | 30 Frames | 60 Frames | 90 Frames | 120 Frames | 150 Frames |
|---|---|---|---|---|---|---|
| 1 | FT | 45.2 | 60.3 | 47.0 | 69.0 | 46.8 |
| 2 | MT | 32.3 | 41.0 | 31.4 | 48.1 | 32.7 |
| 3 | CT | 20.8 | 27.7 | 21.4 | 27.2 | 19.5 |
| 4 | LD | 38.9 | 45.0 | 23.4 | 17.9 | 18.7 |
| 5 | GNB | 44.7 | 45.2 | 47.7 | 58.3 | 46.9 |
| 6 | KNB | 62.3 | 67.0 | 62.0 | 76.6 | 59.3 |
| 7 | LSVM | 53.5 | 73.6 | 53.5 | 78.0 | 48.9 |
| 8 | QSVM | 79.4 | 81.2 | 78.4 | 80.9 | 70.5 |
| 9 | CSVM | 81.3 | 82.0 | 78.3 | 82.4 | 71.9 |
| 10 | FGSVM | 82.4 | 81.1 | 79.5 | 80.8 | 72.9 |
| 11 | MGSVM | 80.0 | 82.2 | 76.1 | 82.2 | 70.1 |
| 12 | CGSVM | 51.1 | 63.9 | 43.4 | 77.9 | 41.8 |
| 13 | FKNN | 79.8 | 80.8 | 79.5 | 81.0 | 70.0 |
| 14 | MKNN | 79.2 | 80.3 | 77.6 | 81.8 | 69.1 |
| 15 | CRSKNN | 65.9 | 66.4 | 50.5 | 70.5 | 43.4 |
| 16 | CSNKNN | 81.6 | 82.1 | 75.1 | 79.4 | 69.8 |
| 17 | CBCKNN | 78.6 | 81.6 | 68.2 | 80.6 | 65.3 |
| 18 | WKNN | 79.0 | 81.1 | 72.3 | 80.9 | 65.6 |
| 19 | EBST | 45.0 | 57.3 | 46.3 | 64.4 | 48.8 |
| 20 | EBGT | 80.8 | 82.3 | 76.2 | 82.4 | 70.4 |
| 21 | ESD | 41.1 | 54.2 | 37.8 | 66.5 | 25.2 |
| 22 | ESKNN | 80.7 | 82.1 | 76.6 | 82.2 | 67.8 |
| 23 | ERUSBT | 42.5 | 46.1 | 47.1 | 57.4 | 43.2 |
| 24 | NNN | 70.9 | 76.1 | 70.8 | 81.4 | 63.4 |
| 25 | MNN | 76.3 | 81.6 | 77.9 | 82.8 | 70.9 |
| 26 | WNN | 80.6 | 82.2 | 79.2 | 81.8 | 75.1 |
| 27 | BNN | 73.9 | 79.0 | 71.3 | 80.0 | 62.2 |
| 28 | TNN | 70.6 | 81.3 | 72.3 | 82.2 | 58.6 |
Figure 6. Comparison of different machine learning classifiers across frame sequence lengths.
Accuracy (%) of the hybrid approach compared with other deep learning models.
| No. | Model Name | 30 Frames | 60 Frames | 90 Frames | 120 Frames | 150 Frames |
|---|---|---|---|---|---|---|
| 1 | MLP | 85.45 | 83.64 | 83.47 | 87.05 | 71.51 |
| 2 | CNN | 88.82 | 88.22 | 87.65 | 83.74 | 75.47 |
| 3 | LSTM | 83.31 | 80.64 | 74.69 | 82.92 | 66.09 |
| 4 | BiLSTM | 90.15 | 85.39 | 89.30 | 82.02 | 66.26 |
| 5 | CNN-LSTM | 90.89 | 88.98 | 90.44 | 87.94 | 76.50 |
Precision (%) of the proposed technique and other DL models for different frame sequence lengths.
| No. | Model Name | 30 Frames | 60 Frames | 90 Frames | 120 Frames | 150 Frames |
|---|---|---|---|---|---|---|
| 1 | MLP | 86.18 | 84.37 | 85.12 | 88.54 | 74.97 |
| 2 | CNN | 89.20 | 88.48 | 88.37 | 83.93 | 78.04 |
| 3 | LSTM | 83.94 | 82.51 | 74.95 | 84.04 | 64.01 |
| 4 | BiLSTM | 90.74 | 85.90 | 89.62 | 82.52 | 70.35 |
| 5 | CNN-LSTM | 91.11 | 89.31 | 91.13 | 88.82 | 76.13 |
Recall (%) of the proposed method and other DL models for different frame sequence lengths.
| No. | Model Name | 30 Frames | 60 Frames | 90 Frames | 120 Frames | 150 Frames |
|---|---|---|---|---|---|---|
| 1 | MLP | 85.39 | 83.43 | 83.58 | 86.86 | 71.92 |
| 2 | CNN | 88.86 | 88.07 | 87.77 | 83.50 | 75.36 |
| 3 | LSTM | 83.24 | 81.23 | 74.15 | 82.84 | 65.89 |
| 4 | BiLSTM | 90.05 | 85.24 | 89.41 | 82.11 | 67.16 |
| 5 | CNN-LSTM | 90.84 | 88.79 | 90.56 | 88.10 | 75.82 |
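
For a 12-class problem, a single precision or recall score per model is typically an average over per-class scores derived from a confusion matrix. The record does not say which averaging the authors used, so the following is only a sketch that assumes macro-averaging (an unweighted mean over classes). The function name and the 3-class `toy_cm` matrix are hypothetical illustrations, not data from the paper.

```python
from typing import List, Tuple


def macro_precision_recall(cm: List[List[int]]) -> Tuple[float, float]:
    """Macro-averaged precision and recall from a square confusion matrix.

    Convention assumed here: cm[i][j] counts samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    precisions, recalls = [], []
    for k in range(n):
        true_positives = cm[k][k]
        predicted_as_k = sum(cm[i][k] for i in range(n))  # column sum
        actually_k = sum(cm[k])                           # row sum
        precisions.append(true_positives / predicted_as_k if predicted_as_k else 0.0)
        recalls.append(true_positives / actually_k if actually_k else 0.0)
    return sum(precisions) / n, sum(recalls) / n


# Hypothetical 3-class confusion matrix, used only to exercise the function.
toy_cm = [
    [9, 1, 0],
    [2, 6, 2],
    [0, 1, 9],
]

precision, recall = macro_precision_recall(toy_cm)
print(f"macro precision = {precision:.4f}, macro recall = {recall:.4f}")
```

With balanced classes, as in the dataset table (200 samples per activity), macro-averaged recall coincides with overall accuracy, which may explain why the reported recall and accuracy values are so close.
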
Figure 7. Confusion matrices of CNN-LSTM for different frame sequences.
Figure 8. Comparison graph of the proposed model with other DL models.