Natalia Díaz-Rodríguez, Olmo León Cadahía, Manuel Pegalajar Cuéllar, Johan Lilius, Miguel Delgado Calvo-Flores.
Abstract
Human activity recognition is a key task in ambient intelligence applications to achieve proper ambient assisted living. There has been remarkable progress in this domain, but some challenges remain before robust methods are available. Our goal in this work is to provide a system that allows the modeling and recognition of a set of complex activities in real-life scenarios involving interaction with the environment. The proposed framework is a hybrid model that comprises two main modules: a low-level sub-activity recognizer, based on data-driven methods, and a high-level activity recognizer, implemented with a fuzzy ontology to include the semantic interpretation of actions performed by users. The fuzzy ontology is fed by the sub-activities recognized by the low-level data-driven component and provides fuzzy ontological reasoning to recognize both the activities and their influence on the environment with semantics. An additional benefit of the approach is the ability to handle vagueness and uncertainty in the knowledge-based module, which treats incomplete and/or imprecise data substantially better than classic crisp ontologies. We validate these advantages with the public CAD-120 dataset (Cornell Activity Dataset), achieving an accuracy of 90.1% and 91.07% for low-level and high-level activities, respectively. This entails an improvement over fully data-driven or ontology-based approaches.
Year: 2014 PMID: 25268914 PMCID: PMC4239884 DOI: 10.3390/s141018131
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. General diagram of the proposed hybrid framework.
Summary of the features vector used for the dynamic time warping (DTW) algorithm.
| Feature | Number of features |
|---|---|
| Left and right arm: joint angle shoulder ( | 2 |
| Left and right arm: joint angle elbow ( | 2 |
| Shortest distance to hand, grouped by object type (10 object types) | 10 |
| Shortest distance to hand, grouped by object id (maximum number of objects of the same type: 5) | 5 |
| Sum of object distances | 1 |
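Per-frame feature vectors like those summarized above are compared with dynamic time warping, which aligns two sequences of different lengths before measuring their distance. The following is a minimal sketch of the textbook DTW recurrence with a nearest-template classifier, not the authors' implementation: the names `dtw_distance` and `classify`, the Euclidean frame cost, and the two-dimensional toy templates are all assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Textbook O(len(a) * len(b)) DTW between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: Euclidean distance between two frames.
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b[j-1] also matches an earlier frame of seq_a
                cost[i][j - 1],      # seq_a[i-1] also matches an earlier frame of seq_b
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def classify(sequence, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(sequence, templates[label]))


# Hypothetical 2-D toy features; the vector summarized above has many more components.
templates = {
    "reaching": [(0.0, 1.0), (0.5, 0.6), (1.0, 0.1)],
    "moving": [(1.0, 1.0), (1.0, 0.5), (1.0, 0.0)],
}
query = [(0.0, 1.0), (0.1, 0.9), (0.5, 0.6), (0.9, 0.2), (1.0, 0.1)]
print(classify(query, templates))
```

The query is longer than either template, which is the case DTW is meant to handle: repeated or slowed-down frames are absorbed by the warping path instead of being penalized frame by frame.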
Average recognition times (in milliseconds) per high-level activity.
| Making cereal | 1,025.94 | 281.95 |
| Taking medicine | 212.9 | 38.01 |
| Stacking objects | 960.12 | 337.02 |
| Unstacking objects | 984.93 | 283.63 |
| Microwaving food | 400.15 | 229.32 |
| Picking object (Bending) | 234.13 | 301.46 |
| Cleaning objects | 480.5 | 197.79 |
| Taking out food | 333.84 | 249.54 |
| Arranging objects | 236.48 | 253.67 |
| Having meal | 733.78 | 224.85 |
| Average | 560.28 | 239.724 |
Excerpt of fuzzy concepts used in the rules.
Excerpt of fuzzy concepts used in the rules (Part II).
Fuzzy concept definitions.
| Primitive Classes | |
| Sub-activities | |
| High-level activities | |
| Object categories | |
Fuzzy rules for each activity.
Confusion matrix for sub-activity labeling.
| Reaching | 0.94 | 0.01 | 0.01 | 0.02 | 0.01 | 0.04 | ||||
| Moving | 0.02 | 0.89 | 0.02 | 0.01 | 0.01 | 0.04 | 0.02 | |||
| Pouring | 0.02 | 0.96 | 0.02 | |||||||
| Eating | 0.02 | 0.06 | 0.84 | 0.03 | 0.05 | |||||
| Drinking | 0.02 | 0.08 | 0.81 | 0.02 | 0.08 | |||||
| Opening | 0.03 | 0.02 | 0.91 | 0.02 | 0.01 | |||||
| Placing | 0.02 | 0.04 | 0.01 | 0.90 | 0.02 | |||||
| Closing | 0.03 | 0.01 | 0.03 | 0.01 | 0.88 | 0.01 | 0.02 | |||
| Scrubbing | 0.02 | 0.02 | 0.04 | 0.92 | ||||||
| null | 0.02 | 0.02 | 0.01 | 0.01 | 0.93 | |||||
| Reaching | Moving | Pouring | Eating | Drinking | Opening | Placing | Closing | Scrubbing | null |
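Accuracy, precision and recall can all be derived from a confusion matrix such as the one above. The sketch below shows one common way to do so, assuming rows hold true labels and columns hold predictions (the usual convention, but not stated in the extracted table) and using macro-averaging over classes. The function name `summary_metrics` and the three-class counts are hypothetical and are not taken from the paper, whose exact averaging scheme may differ.

```python
def summary_metrics(matrix):
    """Return (accuracy, macro precision, macro recall) for a square confusion matrix.

    matrix[i][j] counts samples of true class i that were predicted as class j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    precisions, recalls = [], []
    for c in range(k):
        predicted_c = sum(matrix[r][c] for r in range(k))  # column sum
        actual_c = sum(matrix[c])                          # row sum
        precisions.append(matrix[c][c] / predicted_c if predicted_c else 0.0)
        recalls.append(matrix[c][c] / actual_c if actual_c else 0.0)
    return correct / total, sum(precisions) / k, sum(recalls) / k


# Hypothetical counts for three classes (illustration only).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
accuracy, precision, recall = summary_metrics(toy)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

When the matrix is row-normalized, as the one above appears to be, the diagonal entries are already the per-class recalls, while precision needs the original counts or the class frequencies.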
Average recognition times (in milliseconds) per sub-activity.
| Reaching | 138.87 |
| Moving | 193.9 |
| Pouring | 279.55 |
| Eating | 141.33 |
| Drinking | 178.5 |
| Opening | 284.87 |
| Placing | 142.46 |
| Closing | 241.58 |
| Scrubbing | 532.99 |
| null | 173.02 |
| | 178.99 |
Comparison of our approach with Koppula et al. [27] for the CAD-120 dataset (Cornell Activity Dataset) sub-activity recognition. Average Accuracy, Precision and Recall.
| Method | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|
| Koppula | 76.8 ± 0.9 | 72.9 ± 1.2 | 70.5 ± 3.0 |
| Our Method | 90.1 ± 8.2 | 91.5 ± 4.6 | 97.0 ± 5.8 |
Semantic description of each high-level activity.
| High-level activity | Semantic description |
|---|---|
| Making cereal | Take cereal box, bowl and milk (open them) and pour both |
| Taking medicine | Take medicine box from cupboard, take glass, eat pill and drink water |
| Stacking objects | Stack on a pile plates, boxes or bowls |
| Unstacking objects | Unstack from a pile plates, boxes or bowls |
| Microwaving food | Take food container or kitchenware, place it into microwave and take it out |
| Picking object | Pick up an object from the floor |
| Cleaning objects | Clean up objects (microwave with a cloth) |
| Taking out food | Take food and heat in microwave |
| Arranging objects | Arranging on a table, e.g., setting up the table |
| Having meal | Eating a meal on the table |
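Descriptions like these can be read as rules over sub-activities whose premises hold to a degree rather than being simply true or false. The sketch below illustrates that idea only: it combines hypothetical detection degrees with the minimum t-norm and picks the best-firing rule. The rule contents, the degrees, the 0.5 threshold and the names `rule_degree` and `recognize` are all assumptions, and the sketch does not reproduce the paper's fuzzy ontology or its reasoner.

```python
def rule_degree(required, detected):
    """Degree to which a rule fires: minimum t-norm over its required sub-activities."""
    return min(detected.get(name, 0.0) for name in required)


def recognize(rules, detected, threshold=0.5):
    """Return (activity, degree) for the best rule, or ("null", 0.0) below the threshold."""
    best = max(rules, key=lambda activity: rule_degree(rules[activity], detected))
    degree = rule_degree(rules[best], detected)
    return (best, degree) if degree >= threshold else ("null", 0.0)


# Hypothetical rules loosely following the descriptions above.
rules = {
    "making_cereal": ["reaching", "opening", "pouring"],
    "microwaving_food": ["opening", "placing", "closing"],
}
# Hypothetical confidence degrees coming from a sub-activity recognizer.
detected = {"reaching": 0.9, "opening": 0.8, "pouring": 0.7, "placing": 0.2}
print(recognize(rules, detected))
```

A crisp version of the same rules would have to threshold every sub-activity to true or false before combining them, so a single weak detection would discard the whole activity; keeping the degrees lets the final decision reflect how strong the evidence is.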
Objects' semantic descriptions based on usage-driven object categories.
Fuzzy roles (ontology object and data properties).
| Description | Domain | Range | Characteristics |
|---|---|---|---|
| High-level activity performed | User | Activity | |
| Sub-activity performed | User | SubActivity | |
| Object interaction within a sub-activity | SubActivity | Object | |
| Sub-activity start frame | SubActivity | integer | functional |
| Sub-activity end frame | SubActivity | integer | functional |
| Object position in X-axis | Object | *double* | 0 to 100,000 |
| Object position in Y-axis | Object | *double* | 0 to 100,000 |
| Object position in Z-axis | Object | *double* | 0 to 100,000 |
Confusion matrix for high-level activities taking as input the sub-activities detected in the first stage tracking system.
| Making cereal | 1 | ||||||||||
| Taking medicine | 1 | ||||||||||
| Stacking objects | 0.08 | 0.59 | 0.33 | ||||||||
| Unstacking objects | 1 | ||||||||||
| Microwaving | 0.59 | 0.33 | 0.08 | ||||||||
| Picking objects (Bending) | 0.25 | 0.67 | 0.08 | ||||||||
| Cleaning objects | 1 | ||||||||||
| Takeout food | 0.08 | 0.83 | 0.08 | ||||||||
| Arranging objects | 0.25 | 0.08 | 0.08 | 0.59 | |||||||
| Eating meal | 0.92 | 0.08 | |||||||||
| Making cereal | Taking medicine | Stacking objects | Unstacking objects | Microwaving | Picking objects (Bending) | Cleaning objects | Takeout food | Arranging objects | Eating meal | Null |
Confusion matrix for high-level activities taking as input the sub-activities 100% perfectly labeled from the CAD-120 dataset (ideal condition).
| Making cereal | 1 | ||||||||||
| Taking medicine | 1 | ||||||||||
| Stacking objects | 0.08 | 0.59 | 0.33 | ||||||||
| Unstacking objects | 0.92 | 0.08 | |||||||||
| Microwaving | 0.83 | 0.17 | |||||||||
| Picking objects (Bending) | 1 | ||||||||||
| Cleaning objects | 1 | ||||||||||
| Takeout food | 1 | ||||||||||
| Arranging objects | 0.08 | 0.25 | 0.67 | ||||||||
| Eating Meal | 1 | ||||||||||
| Making cereal | Taking medicine | Stacking objects | Unstacking objects | Microwaving | Picking objects (Bending) | Cleaning objects | Takeout food | Arranging objects | Eating meal | Null |
Comparison of our approach with the dataset's algorithm.
| Koppula | Full model with tracking | 78.6 ± 4.1 | 78.3 ± 4.9 | 79.0 ± 4.7 |
| Ours | Naive approach without heuristic filters | 64.1 ± 7.8 | 75.98 ± 13.53 | 61.36 ± 6.93 |
| Ours | Real situation (predicted from 1st module-detection system) | 84.1 ± 2.3 | 97.4 ± 0.66 | 82.9 ± 0.92 |
| Ours | Ideal situation (labeled, 100% acc.) | 90.8 ± 1.31 | 98.1 ± 1.25 | 91.07 ± 1.28 |
Confusion matrix for high-level activities using a naive approach: without applying heuristic filters for the CAD-120 dataset.
| Making cereal | 0.875 | 0.125 | |||||||||
| Taking medicine | 0.875 | 0.125 | |||||||||
| Stacking objects | 0.167 | 0.75 | 0.083 | ||||||||
| Unstacking objects | 1 | ||||||||||
| Microwaving | 0.83 | 0.583 | 0.083 | 0.25 | |||||||
| Picking objects (Bending) | 0.083 | 0.833 | 0.83 | ||||||||
| Cleaning objects | 1 | ||||||||||
| Takeout food | 0.083 | 0.583 | 0.167 | 0.167 | |||||||
| Arranging objects | 0.167 | 0.5 | 0.083 | 0.25 | |||||||
| Eating Meal | 0.083 | 0.667 | 0.25 | ||||||||
| Making cereal | Taking medicine | Stacking objects | Unstacking objects | Microwaving | Picking objects (Bending) | Cleaning objects | Takeout food | Arranging objects | Eating meal | Null |