Alejandro Baldominos, Yago Saez, Pedro Isasi.
Abstract
Human activity recognition is a challenging problem for context-aware systems and applications. It is gaining interest due to the ubiquity of different sensor sources, wearable smart objects, ambient sensors, etc. This task is usually approached as a supervised machine learning problem, where a label is to be predicted given some input data, such as the signals retrieved from different sensors. For tackling the human activity recognition problem in sensor network environments, in this paper we propose the use of deep learning (convolutional neural networks) to perform activity recognition using the publicly available OPPORTUNITY dataset. Instead of manually choosing a suitable topology, we will let an evolutionary algorithm design the optimal topology in order to maximize the classification F1 score. After that, we will also explore the performance of committees of the models resulting from the evolutionary process. Results analysis indicates that the proposed model was able to perform activity recognition within a heterogeneous sensor network environment, achieving very high accuracies when tested with new sensor data. Based on all conducted experiments, the proposed neuroevolutionary system has proved to be able to systematically find a classification model which is capable of outperforming previous results reported in the state-of-the-art, showing that this approach is useful and improves upon previously manually-designed architectures.
Keywords: convolutional neural networks; deep learning; human activity recognition; neuroevolution
Year: 2018 PMID: 29690587 PMCID: PMC5948523 DOI: 10.3390/s18041288
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Common topology of a convolutional neural network.
Figure 2. Example of how a 1D kernel is used to convolve the input.
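The 1D convolution of Figure 2 can be sketched in a few lines. This is an illustrative toy, not the paper's implementation; as is conventional in deep learning, the kernel is applied as cross-correlation (no flip):

```python
# Illustrative sketch of a 1D convolution: a kernel slides along a single
# sensor channel, producing one feature-map value per position (valid mode,
# no padding, stride 1). Not the paper's code.
def conv1d(signal, kernel, bias=0.0):
    """Valid (no-padding) 1D convolution of a signal with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]

# A simple difference kernel highlights changes (edges) in the signal.
signal = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]
print(conv1d(signal, [1.0, -1.0]))  # -> [0.0, -1.0, 0.0, 0.0, 1.0]
```

Stacking many learned kernels of this kind over the raw sensor channels is what the convolutional layers of the evolved networks do.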
Brief comparison of the features of our neuroevolutionary system with related works. Works marked with a dagger (†) do not use evolutionary computation, but rather reinforcement learning. The comparison criteria include whether the proposal supports a variable number of layers (Var. Ly.), and whether it evolves convolutional layers (Conv.), fully-connected layers (FC), recurrent layers (Rec.) or some of their hyper-parameters, activation functions (Act. Fn.), optimization hyperparameters (Opt. HP), ensembles of neural networks (Ens.) or weights (W). CoSyNE: cooperative synapse neuroevolution; NEAT: neuroevolution of augmenting topologies; CMA-ES: covariance matrix adaptation evolution strategy; GA: genetic algorithm; GP: genetic programming; CGP: Cartesian genetic programming; RL: reinforcement learning; GE: grammatical evolution.
| Work | Technique | Var. Ly. | Conv. | FC | Rec. | Act. Fn. | Opt. HP | Ens. | W |
|---|---|---|---|---|---|---|---|---|---|
| Koutník et al. [ | CoSyNE | • | |||||||
| Verbancsics and Harguess [ | GA (NEAT) | • | • | ||||||
| MENNDL [ | GA | • | |||||||
| Loshchilov and Hutter [ | CMA-ES | • | • | • | |||||
| GeNet [ | GA | • | • | ||||||
| CoDeepNEAT [ | GA (NEAT) | • | • | • | • | • | • | ||
| EXACT [ | GA (NEAT) | • | • | ||||||
| Real et al. [ | GA (NEAT) | • | • | • | |||||
| DEvol [ | GP | • | • | • | • | ||||
| Suganuma et al. [ | CGP | • | • | ||||||
| MetaQNN [ | RL | • | • | • | • | ||||
| Zoph and Le [ | RL | • | • | • | • | • | |||
| This work | GE | • | • | • | • | • | • | • |
Sensors used in the OPPORTUNITY dataset, placed over the body, the objects, and the environment.
| ID | Sensor System | Location and Observation |
|---|---|---|
| B1 | Commercial wireless microphones | Chest and dominant wrist |
| B2 | Custom Bluetooth acceleration sensors [ | 12 on the body to sense limb movement |
| B3 | Custom motion jacket [ | Includes 5 commercial RS485-networked XSens inertial measurement units [ |
| B4 | Custom magnetic relative pos. sensor [ | Senses distance of hand to body |
| B5 | InertiaCube3 [ | One per foot, on the shoe toe box, to sense modes of locomotion |
| B6 | Sun SPOT acceleration sensors | One per foot, right below the outer ankle, to sense modes of locomotion |
| O1 | Custom wireless Bluetooth acceleration and rate of turn sensors | 12 objects in the scenario to measure their use |
| A1 | Commercial wired microphone array | Four in each room side to sense ambient sound |
| A2 | Commercial Ubisense localization system | Placed in the corners of the room to sense user location |
| A3 | Axis network cameras | Placed in three locations for localization, documentation, and visual annotation |
| A4 | XSens inertial sensor [ | Placed on the table and the chair to sense vibration and use |
| A5 | USB networked acceleration sensors [ | 8 placed on doors, drawers, shelves, and the lazy chair to sense usage |
| A6 | Reed switches | 13 placed on doors, drawers and shelves, to sense usage providing ground truth |
| A7 | Custom power sensors | Connected to coffee machine and bread cutter to sense usage |
| A8 | Custom pressure sensors | 3 placed on the table to sense usage after subjects placed plates and cups on them |
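Before any of these sensor streams reach a classifier, the multichannel recording is typically segmented into fixed-length sliding windows. The sketch below is a hedged illustration of that preprocessing step (the window and step sizes are assumptions for the example, not the paper's settings):

```python
# Illustrative sketch: segment a multichannel sensor recording (one feature
# vector per timestep) into overlapping fixed-length windows, the usual
# preprocessing step before feeding OPPORTUNITY-style data to a CNN.
# Window/step sizes below are assumptions, not the paper's values.
def sliding_windows(samples, window, step):
    """samples: list of per-timestep feature vectors -> list of windows."""
    return [
        samples[start:start + window]
        for start in range(0, len(samples) - window + 1, step)
    ]

# 10 timesteps, 3 channels each; window of 4 timesteps with 50% overlap.
data = [[t, t * 0.1, -t] for t in range(10)]
wins = sliding_windows(data, window=4, step=2)
print(len(wins), len(wins[0]))  # -> 4 4 (four windows of four timesteps)
```

Each window then becomes one labeled training example for the recognition model.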
Side-by-side comparison of the most relevant results provided in the state-of-the-art for the OPPORTUNITY dataset, including both the locomotion and the gesture recognition tracks, with and without null instances. The dagger (†) next to some values indicates that performance was reported on a per-subject basis; these values are the average of the F1 scores for subjects 2 and 3. NN: nearest neighbors; SPO: structure preserving oversampling; SVM: support vector machines; LDA: linear discriminant analysis; QDA: quadratic discriminant analysis; NCC: nearest centroid classifier; LSTM: long short-term memory; CNN: convolutional neural network; DNN: deep feed-forward neural network; MV: means and variance; DBN: deep belief network; UP, MI, MU, NU and UT are not technical acronyms but the names given by Chavarriaga et al. [45] to different works based on the name of the institutions participating in the OPPORTUNITY challenge.
| Technique | Locomotion (with null) | Locomotion (no null) | Gestures (with null) | Gestures (no null) |
|---|---|---|---|---|
| CStar [ | 0.63 | 0.87 | 0.88 | 0.77 |
| 1-NN [ | 0.84 | 0.85 | 0.87 | 0.55 |
| SStar [ | 0.64 | 0.86 | 0.86 | 0.70 |
| 3-NN [ | 0.85 | 0.85 | 0.85 | 0.56 |
| NStar [ | 0.61 | 0.86 | 0.84 | 0.65 |
| Integrated Framework [ | – | 0.927 | 0.821 | – |
| SPO + 1NN + Smooth. [ | – | 0.917 | 0.811 | – |
| SPO + SVM + Smooth. [ | – | 0.897 | 0.804 | – |
| SPO + SVM [ | – | 0.885 | 0.797 | – |
| SVM [ | – | 0.883 | 0.762 | – |
| SPO + 1NN [ | – | 0.890 | 0.777 | – |
| 1NN [ | – | 0.890 | 0.705 | – |
| LDA [ | 0.59 | 0.64 | 0.69 | 0.25 |
| UP [ | 0.60 | 0.84 | 0.64 | 0.22 |
| QDA [ | 0.68 | 0.77 | 0.53 | 0.24 |
| NCC [ | 0.54 | 0.60 | 0.51 | 0.19 |
| MI [ | 0.83 | 0.86 | – | – |
| MU [ | 0.62 | 0.87 | – | – |
| NU [ | 0.53 | 0.75 | – | – |
| UT [ | 0.52 | 0.73 | – | – |
| b-LSTM-S [ | – | – | 0.927 | – |
| DeepConvLSTM [ | – | – | 0.915 | – |
| LSTM-S [ | – | – | 0.912 | – |
| LSTM-F [ | – | – | 0.908 | – |
| CNN [ | – | – | 0.894 | – |
| DNN [ | – | – | 0.888 | – |
| Baseline CNN [ | 0.878 | 0.912 | 0.883 | 0.783 |
| CNN + Smooth. [ | – | – | 0.822 | – |
| CNN [ | – | – | 0.818 | – |
| MV + Smooth. [ | – | – | 0.788 | – |
| MV [ | – | – | 0.778 | – |
| DBN [ | – | – | 0.701 | – |
| DBN + Smooth. [ | – | – | 0.700 | – |
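The F1 scores in the table are, as is standard for OPPORTUNITY, weighted averages over the activity classes, so frequent classes (notably the null class in the "with null" setting) dominate the score. A minimal sketch of that metric follows; it is illustrative, and the exact evaluation protocol of each cited work may differ:

```python
# Illustrative sketch of the class-weighted F1 score: per-class F1 values are
# averaged, weighting each class by its support (number of true instances).
from collections import Counter

def weighted_f1(y_true, y_pred):
    support = Counter(y_true)
    total = 0.0
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        total += f1 * support[c] / len(y_true)
    return total

print(round(weighted_f1([0, 0, 1, 1], [0, 1, 1, 1]), 3))  # -> 0.733
```

In practice one would use a library implementation (e.g. scikit-learn's `f1_score` with `average='weighted'`); the hand-rolled version above only makes the computation explicit.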
Figure 3. Definition of the grammar in Backus–Naur Form for the OPPORTUNITY dataset.
Figure 4. Derivation tree of a sample individual.
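The derivation in Figure 4 comes from the standard grammatical evolution mapping: a chromosome of integer codons is consumed left to right, and each codon, taken modulo the number of available productions, selects how to expand the leftmost non-terminal. The sketch below uses an invented toy grammar, not the paper's grammar from Figure 3:

```python
# Illustrative GE genotype-to-phenotype mapping with a toy two-rule grammar
# (a stand-in for the real BNF grammar of Figure 3).
GRAMMAR = {
    "<net>": [["<layer>"], ["<layer>", " -> ", "<net>"]],
    "<layer>": [["conv"], ["dense"], ["lstm"]],
}

def ge_map(codons, start="<net>"):
    """Expand the leftmost non-terminal, consuming one codon per choice.
    Codons wrap around if exhausted (real GE also caps the number of wraps
    to guarantee termination; omitted here for brevity)."""
    out, stack, i = [], [start], 0
    while stack:
        sym = stack.pop(0)
        if sym in GRAMMAR:
            rules = GRAMMAR[sym]
            choice = rules[codons[i % len(codons)] % len(rules)]
            i += 1
            stack = list(choice) + stack
        else:
            out.append(sym)
    return "".join(out)

print(ge_map([3, 2, 0, 1]))  # -> "lstm -> dense"
```

Because the grammar constrains every derivation, any chromosome decodes to a syntactically valid network description, which is what lets the evolutionary search explore topologies freely.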
List of parameters used in grammatical evolution, with their values.
| Parameter | Value |
|---|---|
| Population size | 50 |
| Maximum number of generations | 100 |
| Number of generations without improvements (stop condition) | 30 |
| Codon size | 256 |
| Maximum chromosome length | 100 |
| Tournament size | 3 |
| Crossover rate | 0.7 |
| Mutation rate | 0.015 |
| Elite size | 1 |
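The parameters above plug into a standard generational loop: tournament selection, one-point crossover, per-codon mutation, and a single elite survivor. The sketch below wires them together; the fitness function is a toy stand-in for the real, expensive step of training a network and measuring its F1 score:

```python
import random

# Illustrative GE evolutionary loop using the parameter values from the table.
# toy_fitness is a cheap stand-in for "decode chromosome, train CNN, return F1".
POP, CODON_MAX, LENGTH = 50, 256, 100
TOURNAMENT, P_CROSS, P_MUT, ELITE = 3, 0.7, 0.015, 1

def toy_fitness(chrom):
    return sum(chrom) / (CODON_MAX * len(chrom))  # in [0, 1)

def tournament(pop, rng):
    """Pick the fittest of TOURNAMENT randomly sampled individuals."""
    return max(rng.sample(pop, TOURNAMENT), key=toy_fitness)

def evolve(generations, rng):
    pop = [[rng.randrange(CODON_MAX) for _ in range(LENGTH)]
           for _ in range(POP)]
    for _ in range(generations):
        # Elitism: the best ELITE individuals survive unchanged.
        nxt = sorted(pop, key=toy_fitness, reverse=True)[:ELITE]
        while len(nxt) < POP:
            a, b = tournament(pop, rng), tournament(pop, rng)
            if rng.random() < P_CROSS:          # one-point crossover
                cut = rng.randrange(1, LENGTH)
                a = a[:cut] + b[cut:]
            child = [rng.randrange(CODON_MAX) if rng.random() < P_MUT else c
                     for c in a]                # per-codon mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=toy_fitness)

rng = random.Random(0)
best = evolve(generations=20, rng=rng)
print(round(toy_fitness(best), 3))  # fitness rises well above the 0.5 random baseline
```

The stop condition from the table (30 generations without improvement) is omitted here for brevity but would simply track the best fitness seen so far.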
Architecture and fitness of the top seven individuals in the hall-of-fame for GE in the OPPORTUNITY Gestures dataset. GRU: gated recurrent unit; LSTM: long short-term memory; ReLU: rectified linear unit.
| # | Fitness | Architecture |
|---|---|---|
| 1 | 0.9094 | |
| 2 | 0.9037 | |
| 3 | 0.9031 | |
| 4 | 0.9025 | |
| 5 | 0.9013 | |
| 6 | 0.9013 | |
| 7 | 0.9010 | |
Summary of F1 scores of the best 20 GE individuals after full training in the OPPORTUNITY Gestures dataset.
| # | Mean | Std. Dev. | Median | Minimum | Maximum |
|---|---|---|---|---|---|
| 1 | 0.912150 | 0.001528 | 0.91235 | 0.9091 | 0.9148 |
| 2 | 0.907910 | 0.002371 | 0.90735 | 0.9029 | 0.9114 |
| 3 | 0.911230 | 0.002044 | 0.91185 | 0.9066 | 0.9143 |
| 4 | 0.909540 | 0.001801 | 0.91020 | 0.9065 | 0.9116 |
| 5 | 0.907695 | 0.002126 | 0.90740 | 0.9040 | 0.9110 |
| 6 | 0.907805 | 0.002299 | 0.90820 | 0.9030 | 0.9121 |
| 7 | 0.912155 | 0.001004 | 0.91215 | 0.9102 | 0.9143 |
| 8 | 0.912635 | 0.002196 | 0.91295 | 0.9079 | 0.9175 |
| 9 | 0.910900 | 0.001453 | 0.91065 | 0.9090 | 0.9146 |
| 10 | 0.911030 | 0.002295 | 0.91180 | 0.9058 | 0.9138 |
| 11 | 0.907885 | 0.002182 | 0.90755 | 0.9043 | 0.9122 |
| 12 | 0.907995 | 0.002917 | 0.90790 | 0.9040 | 0.9140 |
| 13 | 0.912040 | 0.002446 | 0.91135 | 0.9072 | 0.9177 |
| 14 | 0.898005 | 0.005668 | 0.89610 | 0.8918 | 0.9076 |
| 15 | 0.911005 | 0.001863 | 0.91100 | 0.9070 | 0.9139 |
| 16 | 0.910800 | 0.001365 | 0.91070 | 0.9082 | 0.9136 |
| 17 | 0.910945 | 0.001438 | 0.91075 | 0.9085 | 0.9144 |
| 18 | 0.910730 | 0.004045 | 0.91180 | 0.9005 | 0.9185 |
| 19 | 0.911695 | 0.001979 | 0.91175 | 0.9091 | 0.9176 |
| 20 | 0.912015 | 0.002713 | 0.91250 | 0.9047 | 0.9152 |
Figure 5. Boxplot showing the distribution of F1 scores of the best 20 GE individuals after full training in OPPORTUNITY Gestures.
Figure 6. Evolution of the F1 scores of the incremental ensembles using the best 20 individuals from the GE with OPPORTUNITY Gestures.
Figure 7. Confusion matrix of the best ensemble using GE individuals with OPPORTUNITY.
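The incremental ensembles add the top individuals one at a time and fuse their outputs per window. A minimal sketch of one common fusion rule, majority voting, follows; the per-model predictions are invented placeholders, and the paper's actual combination scheme may differ:

```python
# Illustrative sketch of committee fusion by majority vote: each model emits
# one label per window, and the ensemble picks the most common label.
# The predictions below are made-up placeholders, not real model outputs.
from collections import Counter

def majority_vote(predictions):
    """predictions: one list of labels per model -> fused list of labels."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]

model_preds = [
    ["walk", "sit", "walk", "stand"],   # model 1
    ["walk", "sit", "sit",  "stand"],   # model 2
    ["sit",  "sit", "walk", "walk"],    # model 3
]
print(majority_vote(model_preds))  # -> ['walk', 'sit', 'walk', 'stand']
```

Growing the committee one model at a time and re-scoring after each addition yields exactly the kind of F1-versus-ensemble-size curve shown in Figure 6.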