| Literature DB >> 32033072 |
Fadi Al Machot, Mohammed R. Elkobaisi, Kyandoghere Kyamakya.
Abstract
Due to significant advances in sensor technology, studies of activity recognition have gained interest and maturity in recent years. Existing machine learning algorithms have demonstrated promising results by classifying activities whose instances have already been seen during training. Activity recognition methods designed for real-life settings should cover a growing number of activities across various domains, whereby a significant portion of instances will not be present in the training data set. However, covering all possible activities in advance is a complex and expensive task. Concretely, we need a method that can extend the learning model to detect unseen activities without prior knowledge of their sensor readings. In this paper, we introduce an approach that leverages sensor data to discover new, unseen activities that were not present in the training set. We show that sensor readings can lead to promising results for zero-shot learning, whereby the necessary knowledge is transferred from seen to unseen activities by using semantic similarity. The evaluation conducted on two data sets extracted from the well-known CASAS datasets shows that the proposed zero-shot learning approach achieves high performance in recognizing unseen (i.e., not present in the training dataset) activities.
Keywords: activity recognition; non-visual sensors; sensor data; zero-shot learning
Year: 2020 PMID: 32033072 PMCID: PMC7038698 DOI: 10.3390/s20030825
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
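To make the abstract's idea of transferring knowledge from seen to unseen activities via semantic similarity concrete, the following is a minimal sketch, not taken from the paper: each activity label is represented by an L-dimensional semantic vector and labels are compared by cosine similarity. The embedding values, the dimensionality, and the choice of label pairs below are placeholders; in practice the vectors would come from a pretrained word-embedding model, which this record does not specify.

```python
# Minimal sketch of label-level semantic similarity (placeholder embeddings, L = 4).
import numpy as np

# Hypothetical L-dimensional semantic vectors for activity labels.
label_embeddings = {
    "Cook Breakfast": np.array([0.9, 0.1, 0.3, 0.0]),  # seen class
    "Wash Dishes":    np.array([0.7, 0.2, 0.1, 0.1]),  # seen class
    "Cook Lunch":     np.array([0.8, 0.1, 0.4, 0.0]),  # unseen (zero-shot) class
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two semantic label vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# An unseen label such as "Cook Lunch" can be related to seen labels through the
# semantic space, even though no sensor readings for it were seen during training.
for seen in ("Cook Breakfast", "Wash Dishes"):
    sim = cosine_similarity(label_embeddings[seen], label_embeddings["Cook Lunch"])
    print(f"similarity({seen!r}, 'Cook Lunch') = {sim:.3f}")
```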
Overview of activity recognition based on classical machine learning approaches. k-NN: k-Nearest Neighbor; SVM: Support Vector Machine; RF: Random Forest; MLP: Multi-Layer Perceptron; GMM: Gaussian mixture model; KF: Kalman Filter.
| Paper | Approach | Method | Activity | Input Source | Performance |
|---|---|---|---|---|---|
| [ | Comparison study to classify human activities | SVM, MLP, RF, Naive Bayes | Sleeping, eating, walking, falling, talking on the phone | Image | 86% |
| [ | Hybrid deep learning for activity and action recognition | GMM, KF, Gated Recurrent Unit | Walking, jogging, running, boxing, hand-waving, hand-clapping | Video | 96.3% |
| [ | Infer high-level rules for noninvasive ambient that help to anticipate abnormal activities | RF | Abnormal activities: agitation, alteration, screams, verbal aggression, physical aggression and inappropriate behavior | Ambient sensors | 98.0% |
| [ | Active Learning to recognize human activity using Smartwatch | RF, Extra Trees, Naive Bayes, Logistic Regression, SVM | Running, walking, standing, sitting, lying down | Smartwatch | 93.3% |
| [ | Recognizing human activity using smartphone sensors | Quadratic, k-NN, ANN, SVM | Walking upstairs, downstairs | Smartphone | 84.4% |
Overview of activity recognition based on Deep Learning. SVM: Support Vector Machine; RBM: Restricted Boltzmann Machine; k-NN: k-Nearest Neighbor.
| Paper | Approach | Method | Activity | Input Source | Performance |
|---|---|---|---|---|---|
| [ | Mapping of activity recognition to image classification task | AlexNet, CaffeRef, k-NN, SVM, BoF | Communicating, sleeping, staying, work at computer, reading, writing, studying, eating, drinking | Image | 90.78% |
| [ | Recognizing activity using triaxial accelerometers and deep learning | RBM | Jogging, walking, upstairs, downstairs, sitting, standing | On-body sensors | 98.23% |
| [ | Deep CNN for recognizing activity using smartphone sensors | SVM, ConvNet, FFT | Walking, W. Upstairs, W. Downstairs, Sitting, Standing, Laying | Smartphone | 95.75% |
| [ | Smartwatches and deep learning to recognize human activity | RBM | (Gesture-based activity recognition), (Physical activities: Walking upstairs, downstairs), and (Indoor/Outdoor routine activities) | Ambient sensors, Smartwatch | 72.1% |
Overview of activity recognition-based zero-shot learning. BGRU: Bidirectional Gated Recurrent Unit; GloVe: Global Vectors; ConSE: Convex Combination of Semantic Embeddings.
| Paper | Approach | Method | Activity | Input Source | Performance |
|---|---|---|---|---|---|
| [ | Zero-shot activity recognition using visual and linguistic attributes | BGRU, GloVe | Drink, uncork, drool, lick | Image | 42.17% |
| [ | Zero-shot activity-recognition based on a structured knowledge graph | Two-stream GCN method, self-attention mechanism | Biking, Skiing | Video | 59.9% |
| [ | Identify the hierarchical and sequential nature of activity data | Graphical Model of Semantic Attribute Sequences | ArmUp, ArmDown, ArmFwd, ArmBack, ArmSide, ArmCurl, SquatStand | Sequence of signal features | 70–75% |
| [ | Probabilistic framework for zero-shot action recognition | Inductive setting for standard zero-shot | (101+51+16) classes from different datasets | Video | 57.88 ± 14.1% |
| [ | Enable fair use of external data for zero-shot action recognition | ConSE | (51) and (400) classes from two datasets | Video | 25.67 ± 3.5% |
Figure 1. Main idea of the proposed method.
Figure 2. Training phase, showing the training class category labels, the sensor readings of the training activities, and the corresponding L-dimensional semantic representation vectors of the training labels.
Figure 3. Test phase, showing the zero-shot class labels, the sensor readings of the zero-shot activities, and the corresponding L-dimensional semantic representation vectors of the zero-shot class labels.
Figure 4. The proposed shallow neural network model.
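Figures 2-4 describe a training phase that maps sensor readings onto the semantic vectors of seen labels and a test phase that matches the model's output against the semantic vectors of zero-shot labels. The sketch below illustrates such a pipeline under stated assumptions: random placeholder data, scikit-learn's MLPRegressor standing in for the shallow network, and Euclidean nearest-neighbor matching in the semantic space. It is not the authors' implementation, and the feature and embedding dimensionalities are illustrative.

```python
# Sketch of a zero-shot pipeline: regress sensor features onto label embeddings,
# then classify unseen activities by the nearest zero-shot label embedding.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

L = 50            # dimensionality of the semantic label embeddings (assumed)
n_features = 20   # dimensionality of the sensor-reading feature vectors (assumed)

# Placeholder training data: sensor features for seen activities and the
# L-dimensional semantic vectors of their labels (e.g., Bathe, Cook, Read).
X_train = rng.normal(size=(200, n_features))
seen_label_vectors = rng.normal(size=(3, L))
Y_train = seen_label_vectors[rng.integers(0, 3, size=200)]

# Training phase: shallow regressor from sensor space to semantic space.
model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
model.fit(X_train, Y_train)

# Test phase: project unseen-activity sensor readings into the semantic space and
# pick the nearest zero-shot label embedding (Euclidean distance, assumed metric).
unseen_label_vectors = rng.normal(size=(3, L))   # e.g., Sleep, Toilet, Relax
unseen_names = ["Sleep", "Toilet", "Relax"]

x_test = rng.normal(size=(1, n_features))
z = model.predict(x_test)                        # predicted semantic vector
distances = np.linalg.norm(unseen_label_vectors - z, axis=1)
print("predicted zero-shot label:", unseen_names[int(np.argmin(distances))])
```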
Count of activities in the HH101 and HH125 smart homes.
| Activity | HH101 | HH125 |
|---|---|---|
| Bathe | 59 | 25 |
| Cook | 13 | 19 |
| Cook Breakfast | 79 | 78 |
| Cook Lunch | 18 | 65 |
| Dress | 139 | 212 |
| Eat Dinner | 22 | 10 |
| Eat Lunch | 14 | 8 |
| Personal Hygiene | 154 | 219 |
| Phone | 37 | 57 |
| Read | 53 | 19 |
| Relax | 92 | 9 |
| Sleep | 284 | 178 |
| Toilet | 369 | 287 |
| Wash Dinner Dishes | 18 | 100 |
| Wash Dishes | 31 | 154 |
| Watch TV | 333 | 218 |
Figure 5. Layout of the HH101 apartment. The position of each sensor is indicated by its motion (M), light (LS), door (D), or temperature (T) sensor number.
Training vs. zero-shot classes.
| Dataset | Training | Zero-Shot |
|---|---|---|
| Scenario 1 | Bathe | Sleep |
| | Cook | Toilet |
| | Wash Dinner Dishes | Relax |
| | Watch TV | |
| | Read | |
| Scenario 2 | Cook Breakfast | Cook Lunch |
| | Wash Dishes | Personal Hygiene |
| | Phone | Eat Lunch |
| | Dress | |
| | Eat Dinner | |
Confusion matrix for zero-shot activity recognition—Scenario 1.
| Dataset | Activity | Relax | Sleep | Toilet |
|---|---|---|---|---|
| HH101 | Relax | 84 | 0 | 0 |
| | Sleep | 7 | 79 | 0 |
| | Toilet | 0 | 97 | 352 |
| HH125 | Relax | 1 | 0 | 0 |
| | Sleep | 7 | 95 | 0 |
| | Toilet | 0 | 97 | 253 |
Performance metrics for zero-shot activity recognition—Scenario 1.
| Dataset | Class | N (Classified) | N (Truth) | Accuracy (%) | Precision | Recall | F-Measure |
|---|---|---|---|---|---|---|---|
| HH101 | Relax | 91 | 84 | 98.87 | 1.0 | 0.92 | 0.96 |
| | Sleep | 176 | 86 | 83.2 | 0.92 | 0.45 | 0.6 |
| | Toilet | 352 | 449 | 84.33 | 0.78 | 1.0 | 0.88 |
| HH125 | Relax | 8 | 1 | 98.03 | 1.0 | 0.13 | 0.22 |
| | Sleep | 95 | 102 | 98.03 | 0.93 | 1.0 | 0.96 |
| | Toilet | 253 | 253 | 100 | 1.0 | 1.0 | 1.0 |
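As a sanity check on how the table values relate to the confusion matrix above, the short sketch below derives per-class accuracy, precision, recall, and F-measure from a confusion matrix, using the HH101 counts of Scenario 1 as input. Treating rows as ground truth and columns as predictions (which matches the N (Truth) and N (Classified) columns) reproduces the reported accuracies and F-measures; the precision and recall values in the paper appear to follow the opposite row/column convention, which this record does not state explicitly.

```python
# Per-class metrics from a confusion matrix (rows assumed to be ground truth,
# columns assumed to be predictions); input counts are the HH101 Scenario 1 values.
import numpy as np

labels = ["Relax", "Sleep", "Toilet"]
cm = np.array([[ 84,  0,   0],
               [  7, 79,   0],
               [  0, 97, 352]])

total = cm.sum()
for i, name in enumerate(labels):
    tp = cm[i, i]
    fn = cm[i, :].sum() - tp    # instances of this class predicted as another class
    fp = cm[:, i].sum() - tp    # other classes predicted as this class
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / total
    print(f"{name}: acc={accuracy:.4f} prec={precision:.2f} rec={recall:.2f} f1={f1:.2f}")
```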
Confusion matrix for zero-shot activity recognition—Scenario 2.
| Dataset | Activity | Cook Lunch | Eat Lunch | Personal Hygiene |
|---|---|---|---|---|
| HH101 | Cook Lunch | 13 | 0 | 0 |
| | Eat Lunch | 0 | 14 | 0 |
| | Personal Hygiene | 4 | 0 | 154 |
| HH125 | Cook Lunch | 64 | 0 | 0 |
| | Eat Lunch | 0 | 1 | 0 |
| | Personal Hygiene | 0 | 6 | 219 |
Performance metrics for zero-shot activity recognition—Scenario 2.
| Dataset | Class | N (Classified) | N (Truth) | Accuracy (%) | Precision | Recall | F-Measure |
|---|---|---|---|---|---|---|---|
| HH101 | Cook Lunch | 17 | 13 | 97.84 | 1.0 | 0.76 | 0.87 |
| | Eat Lunch | 14 | 14 | 100 | 1.0 | 1.0 | 1.0 |
| | Personal Hygiene | 154 | 158 | 97.84 | 0.97 | 1.0 | 0.99 |
| HH125 | Cook Lunch | 64 | 64 | 100 | 1.0 | 1.0 | 1.0 |
| | Eat Lunch | 7 | 1 | 97.93 | 1.0 | 0.14 | 0.25 |
| | Personal Hygiene | 219 | 225 | 97.93 | 0.97 | 1.0 | 0.99 |
Figure 6. Correlation between seen and unseen activities in Scenario 1. (a) Training set; (b) Testing set.
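Figure 6 reports correlations between seen and unseen activities. One plausible way to compute such a correlation, assumed here and not confirmed by this record, is the Pearson correlation between per-activity mean sensor-feature vectors; the sketch below uses random placeholder features and a subset of activity names purely for illustration.

```python
# Pairwise Pearson correlation between per-activity mean feature vectors
# (placeholder features; the paper's exact feature construction is not given here).
import numpy as np

rng = np.random.default_rng(1)
activities = ["Bathe", "Cook", "Read", "Sleep", "Toilet", "Relax"]  # seen + unseen

# Placeholder: mean sensor-feature vector per activity (rows) over 20 features.
mean_features = rng.normal(size=(len(activities), 20))

# Row-wise Pearson correlation between activities.
corr = np.corrcoef(mean_features)
for i, a in enumerate(activities):
    for j, b in enumerate(activities):
        if i < j:
            print(f"corr({a}, {b}) = {corr[i, j]:+.2f}")
```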