Dipanwita Thakur, Suparna Biswas
Abstract
Human activity recognition (HAR) is a prominent area of research owing to its extensive range of applications in remote health monitoring, sports, smart homes, and more. Smartphone-based HAR systems use high-dimensional sensor data to infer human physical activities. Researchers continuously endeavor to select pertinent, non-redundant features without compromising classification accuracy. In this work, our aim is to build an efficient HAR model that not only extracts the most relevant features from 3-axial accelerometer and gyroscope signals but also enhances the classification accuracy of the HAR system, without data loss, using time-frequency domain features. We propose a feature selection method based on the guided regularized random forest (GRRF) to determine the most pertinent, non-redundant features and thereby reduce the time needed to recognize human activities efficiently. After selecting the most relevant features, a support vector machine (SVM) is used to identify various human physical activities. The UCI public dataset and a self-collected dataset are used to assess the generalization capability and performance of the proposed feature selection method. The accuracy reached 99.10% and 99.30% on the self-collected and UCI HAR datasets, respectively.
Keywords: Feature extraction; Feature selection; Guided regularized random forest; Human activity recognition; Smartphone sensors
Year: 2022 PMID: 35601253 PMCID: PMC9103613 DOI: 10.1007/s12652-022-03862-5
Source DB: PubMed Journal: J Ambient Intell Humaniz Comput
Fig. 1 Workflow of the proposed HAR model
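The workflow above (GRRF-based feature selection followed by SVM classification) can be sketched with off-the-shelf tools. The paper's GRRF implementation is not reproduced here, so this sketch substitutes scikit-learn's importance-based `SelectFromModel` as a stand-in for the guided regularized importance scores, and uses a synthetic matrix shaped loosely like the HAR feature data; it illustrates the two-stage pipeline, not the authors' exact method.

```python
# Hedged sketch of the select-then-classify pipeline. Assumptions: synthetic
# data instead of HAR features, and plain random-forest importances instead
# of GRRF's guided regularized scores.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 6-class stand-in for the high-dimensional HAR feature matrix
X, y = make_classification(n_samples=1000, n_features=100, n_informative=20,
                           n_classes=6, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Stage 1: rank features by random-forest importance (proxy for GRRF)
# and keep only the subset scoring at or above the median importance
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="median").fit(X_tr, y_tr)
X_tr_sel, X_te_sel = selector.transform(X_tr), selector.transform(X_te)

# Stage 2: classify the reduced feature set with an SVM
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10)).fit(X_tr_sel, y_tr)
print(f"{X_tr.shape[1]} -> {X_tr_sel.shape[1]} features, "
      f"test accuracy {clf.score(X_te_sel, y_te):.3f}")
```

Discarding roughly half the features before training mirrors the paper's motivation: a smaller input both speeds up SVM training and, when the dropped features are redundant, preserves accuracy.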
Average number of selected features using ReliefF, GRF, and GRRF for both datasets

| Dataset | Total instances | Total features | Selected (ReliefF) | Selected (GRF) | Selected (GRRF) |
|---|---|---|---|---|---|
| Self-collected dataset | 15562 | 164 | 102 | 94 | 52 |
| UCI dataset | 10299 | 561 | 358 | 247 | 185 |
Average performance of the classifiers using self-collected data

| Method | Measure | DT | NB | RF | SVM |
|---|---|---|---|---|---|
| ReliefF | Precision | 86.13% | 87.04% | 89.74% | 90.48% |
| ReliefF | Recall | 86.46% | 87.37% | 89.67% | 91.79% |
| ReliefF | F-score | 86.29% | 87.20% | 89.70% | 91.13% |
| ReliefF | Accuracy | 88.37% | 89.97% | 92.57% | 93.10% |
| GRF | Precision | 89.83% | 91.62% | 91.86% | 95.78% |
| GRF | Recall | 90.24% | 91.94% | 92.63% | 95.91% |
| GRF | F-score | 90.03% | 91.78% | 92.24% | 95.84% |
| GRF | Accuracy | 93.74% | 94.48% | 95.48% | 98.01% |
| GRRF | Precision | 92.57% | 93.27% | 95.27% | 97.28% |
| GRRF | Recall | 92.65% | 93.37% | 95.37% | 97.13% |
| GRRF | F-score | 92.61% | 93.32% | 95.32% | 97.20% |
| GRRF | Accuracy | 94.59% | 96.54% | 97.74% | 99.10% |
Average training and testing time using self-collected data

| Classifier | Training (ReliefF) | Training (GRF) | Training (GRRF) | Testing (ReliefF) | Testing (GRF) | Testing (GRRF) |
|---|---|---|---|---|---|---|
| DT | 42.8s | 39.3s | 37.7s | 27.2s | 22.4s | 14.5s |
| NB | 34.7s | 31.8s | 28.5s | 23.7s | 19.3s | 11.3s |
| RF | 56.4s | 51.7s | 47.5s | 31.4s | 25.7s | 19.6s |
| SVM | 32.6s | 28.9s | 24.4s | 21.8s | 16.2s | 9.7s |
Confusion matrix of the proposed method for HAR using SVM with the self-collected dataset (rows: actual class; columns: predicted class; bottom-right cell: overall accuracy as reported)

| Actual \ Predicted | Walk | Up | Down | Sit | Stand | Lie | Recall |
|---|---|---|---|---|---|---|---|
| Walk | 887 | 8 | 10 | 0 | 0 | 0 | 98.01% |
| Up | 28 | 605 | 12 | 0 | 0 | 0 | 93.80% |
| Down | 6 | 15 | 635 | 0 | 15 | 0 | 94.63% |
| Sit | 0 | 0 | 0 | 789 | 5 | 7 | 98.50% |
| Stand | 0 | 0 | 10 | 1 | 772 | 4 | 98.09% |
| Lie | 0 | 0 | 0 | 1 | 1 | 858 | 99.77% |
| Precision | 96.31% | 96.34% | 95.20% | 99.75% | 97.35% | 98.73% | 99.10% |
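The per-class figures in the table follow directly from the counts: recall divides each diagonal entry by its row sum (instances of the actual class), and precision divides it by its column sum (instances assigned to the predicted class). A minimal NumPy check using the matrix above:

```python
import numpy as np

# Rows: actual class, columns: predicted class, in the order
# Walk, Up, Down, Sit, Stand, Lie (self-collected-dataset matrix above)
cm = np.array([
    [887,   8,  10,   0,   0,   0],
    [ 28, 605,  12,   0,   0,   0],
    [  6,  15, 635,   0,  15,   0],
    [  0,   0,   0, 789,   5,   7],
    [  0,   0,  10,   1, 772,   4],
    [  0,   0,   0,   1,   1, 858],
])
recall = np.diag(cm) / cm.sum(axis=1)      # diagonal over row sums
precision = np.diag(cm) / cm.sum(axis=0)   # diagonal over column sums
print(np.round(100 * recall, 2))     # Walk: 98.01, ..., Lie: 99.77
print(np.round(100 * precision, 2))  # Walk: 96.31, ..., Lie: 98.73
```

Running this reproduces every recall and precision entry in the table, e.g. Walk recall 887/905 = 98.01% and Walk precision 887/921 = 96.31%.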
Fig. 2 Classification accuracy of the SVM classifier for all six activities using the self-collected dataset
Fig. 3 ROC curves for the classifiers using the self-collected dataset
Average performance of the classifiers using the UCI public dataset

| Method | Measure | DT | NB | RF | SVM |
|---|---|---|---|---|---|
| ReliefF | Precision | 88.93% | 89.44% | 91.24% | 92.41% |
| ReliefF | Recall | 89.16% | 89.67% | 91.47% | 92.88% |
| ReliefF | F-score | 89.04% | 90.06% | 91.35% | 92.64% |
| ReliefF | Accuracy | 90.86% | 91.57% | 93.57% | 94.23% |
| GRF | Precision | 92.41% | 93.62% | 93.72% | 95.38% |
| GRF | Recall | 92.65% | 93.74% | 93.77% | 95.71% |
| GRF | F-score | 92.53% | 93.68% | 93.74% | 95.54% |
| GRF | Accuracy | 94.44% | 94.98% | 95.48% | 96.83% |
| GRRF | Precision | 94.17% | 95.27% | 97.28% | 97.90% |
| GRRF | Recall | 94.35% | 95.39% | 97.37% | 98.00% |
| GRRF | F-score | 94.26% | 95.33% | 97.32% | 97.93% |
| GRRF | Accuracy | 95.74% | 97.54% | 98.79% | 99.30% |
Average training and testing time using the UCI public dataset

| Classifier | Training (ReliefF) | Training (GRF) | Training (GRRF) | Testing (ReliefF) | Testing (GRF) | Testing (GRRF) |
|---|---|---|---|---|---|---|
| DT | 107.6s | 98.6s | 88.7s | 54.2s | 49.2s | 38.6s |
| NB | 97.8s | 89.3s | 77.9s | 48.3s | 41.6s | 33.7s |
| RF | 114.8s | 103.4s | 93.8s | 57.1s | 51.8s | 43.6s |
| SVM | 82.8s | 76.3s | 68.4s | 42.7s | 34.3s | 27.7s |
Confusion matrix of the proposed method for HAR using SVM with the UCI dataset (rows: actual class; columns: predicted class; bottom-right cell: overall accuracy as reported)

| Actual \ Predicted | Walk | Up | Down | Sit | Stand | Lie | Recall |
|---|---|---|---|---|---|---|---|
| Walk | 481 | 1 | 7 | 0 | 0 | 0 | 98.36% |
| Up | 8 | 463 | 12 | 0 | 0 | 0 | 95.86% |
| Down | 8 | 13 | 411 | 0 | 0 | 0 | 95.14% |
| Sit | 0 | 0 | 0 | 479 | 1 | 0 | 99.79% |
| Stand | 0 | 0 | 0 | 5 | 534 | 0 | 99.63% |
| Lie | 0 | 0 | 0 | 0 | 0 | 524 | 100.00% |
| Precision | 96.78% | 97.06% | 95.58% | 98.97% | 99.81% | 100.00% | 99.30% |
Fig. 4 Classification accuracy of the SVM classifier for all six activities using the UCI public dataset
Fig. 5 ROC curves for the classifiers using the UCI public dataset
Comparison with state-of-the-art feature selection methods with respect to accuracy

| Dataset | Method | DT | NB | RF | SVM |
|---|---|---|---|---|---|
| Self-collected | IG | 86.23% | 88.72% | 90.58% | 95.32% |
| Self-collected | Chi-square test | 88.56% | 91.37% | 92.77% | 97.28% |
| Self-collected | Forward selection | 92.19% | 93.48% | 94.88% | 98.57% |
| Self-collected | Backward selection | 91.24% | 91.54% | 92.24% | 96.21% |
| Self-collected | mRMR | 93.89% | 95.04% | 95.89% | 98.88% |
| Self-collected | CFS | 93.56% | 94.72% | 95.03% | 97.83% |
| Self-collected | FCBF | 92.01% | 94.12% | 94.39% | 96.49% |
| Self-collected | GRRF | 94.59% | 96.54% | 97.74% | 99.10% |
| UCI | IG | 87.31% | 91.52% | 91.89% | 95.64% |
| UCI | Chi-square test | 89.78% | 92.34% | 93.23% | 97.89% |
| UCI | Forward selection | 92.67% | 93.49% | 96.18% | 98.92% |
| UCI | Backward selection | 92.05% | 91.14% | 94.67% | 97.32% |
| UCI | mRMR | 94.23% | 96.43% | 98.01% | 99.03% |
| UCI | CFS | 93.89% | 95.57% | 97.55% | 98.47% |
| UCI | FCBF | 92.85% | 94.53% | 96.45% | 97.12% |
| UCI | GRRF | 95.74% | 97.54% | 98.79% | 99.30% |
Comparison with deep learning methods using the UCI public dataset (per-activity accuracy, %)

| Method | Walking | Sitting | Standing | Walking upstairs | Walking downstairs | Lying | Average |
|---|---|---|---|---|---|---|---|
| CNN | 95.56 | 96.89 | 97.54 | 93.68 | 94.58 | 99.35 | 96.27 |
| LSTM | 95.89 | 97.12 | 97.93 | 94.12 | 95.78 | 99.78 | 96.77 |
| Handcrafted features + CNN | 98.74 | 99.36 | 99.39 | 97.45 | 98.58 | 100.00 | 98.92 |
| Handcrafted features + LSTM | 98.94 | 99.56 | 99.40 | 97.67 | 98.67 | 100.00 | 99.04 |
| Proposed method + SVM | 99.50 | 99.89 | 99.74 | 98.19 | 98.48 | 100.00 | 99.30 |