| Literature DB >> 29987218 |
Authors: Federico Cruciani, Ian Cleland, Chris Nugent, Paul McCullagh, Kåre Synnes, Josef Hallberg.
Abstract
Data annotation is a time-consuming process posing major limitations to the development of Human Activity Recognition (HAR) systems. Supervised Machine Learning (ML) approaches require large amounts of labeled data, especially in the case of online and personalized approaches that require user-specific datasets to be labeled. The availability of such datasets has the potential to help address common problems of smartphone-based HAR, such as inter-person variability. In this work, we (i) present an automatic labeling method facilitating the collection of labeled datasets in free-living conditions using the smartphone, and (ii) investigate the robustness of common supervised classification approaches under instances of noisy data. We evaluated the results with a dataset consisting of 38 days of manually labeled data collected in free living. The comparison between the manually and the automatically labeled ground truth demonstrated that it was possible to obtain labels automatically with an 80–85% average precision rate. Results obtained also show how a supervised approach trained using automatically generated labels achieved an 84% f-score (using Neural Networks and Random Forests); however, results also demonstrated how the presence of label noise could lower the f-score to 64–74% depending on the classification approach (Nearest Centroid and Multi-Class Support Vector Machine).
Keywords: automatic annotation; human activity recognition; inertial sensors; label noise; smartphone; supervised machine learning
Year: 2018 PMID: 29987218 PMCID: PMC6068801 DOI: 10.3390/s18072203
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Comparison of recent studies on HAR adopting a supervised or semi-supervised approach. In most cases the dataset is assumed to be fully labeled; only a few studies have explored ways to reduce the burden of data annotation. Target activities (Ly = Lying, Si = Sitting, St = Standing, Wa = Walking, WU = Walking Upstairs, WD = Walking Downstairs, Ru = Running, Be = Bend, Fa = Fall, Da = Dancing, Cy = Cycling, Tr = Transportation, StS = Sit-to-Stand, StL = Sit-to-Lie, LtS = Lie-to-Stand, Tu = Turning, ShT = Sharp Turning).
| Ref | Sensors | Target activities | Labeling | Accuracy |
|---|---|---|---|---|
| Stikic et al. 2011 | accelerometer | Si, St, Wa, Tr, some ADLs | Partial (label propagation) | 76% |
| Siirtola et al. 2012 | accelerometer | Idle (Si/St), Wa, Ru, Cy, Tr | Full | 95% |
| Anguita et al. 2013 | accelerometer | Si, St, Wa, WU, WD | Full | 89% |
| Pei et al. 2013 | accelerometer, gyroscope | Si, St, Wa, Wa (fast), Tu, ShT | Full | 92% |
| Bayat et al. 2014 | accelerometer | Wa (fast), Wa (slow), Ru, WD, WU, Da | Full | 91% |
| Cleland et al. 2014 | accelerometer | St, Wa, Ru, Tr (bus) | Partial (user prompt) | 85% |
| Bhattacharya et al. 2014 | accelerometer, gyroscope | Idle (Si/St), Wa, Tr (bus), Tr (tram), Tr (train) | Partial | 79% |
| Reyes-Ortiz et al. 2016 | accelerometer, gyroscope | Ly, Si, St, Wa, WU, WD, StS, StL, LtS | Full | 96% |
| Ronao et al. 2016 | accelerometer, gyroscope | Ly, Si, St, Wa, WU, WD | Full | 94% |
| Hong et al. 2016 | accelerometer | Ly, Si, St, Wa, Cy, Be, Fa | Partial (label propagation) | 83% |
| Hassan et al. 2018 | accelerometer, gyroscope | Ly, Si, St, Wa, WU, WD, StS, StL, LtS | Full | 89% |
| Cao et al. 2018 | accelerometer, gyroscope | Si, St, Wa, WU, WD | Full | 94% |
| San-Segundo et al. 2018 | accelerometer | Si, St, Wa, WU, WD, Cy | Full | 91% |
Figure 1. The smartphone app collects raw data samples from the on-board accelerometer and the GPS. The GPS data is combined with the step count to generate weak labels using the heuristic function. Extracted features are sent to the server along with the labels in order to train the model. Parameters of the classifier are sent back to the app in Predictive Model Markup Language (PMML) format. Finally, classification can be performed locally on the smartphone by instantiating the appropriate classifier from the received parameters.
Figure 2. Fuzzification of probability values for the heuristic function. The step count heuristic (a) is modeled using Gaussian membership functions with mean and standard deviation based on common steps/minute rates for walking and running. Similarly, (b) shows the trapezoidal membership functions used to estimate probability based on measured speed in m/s, for walking, running, cycling and using a means of transportation.
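The fuzzification described in Figure 2 can be sketched as below. The membership-function shapes (Gaussian for steps/minute, trapezoidal for speed) follow the caption, but the specific parameter values are illustrative assumptions, not the paper's actual constants.

```python
import math

def gaussian_membership(x, mu, sigma):
    """Gaussian membership degree in [0, 1], peaking at x == mu."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def trapezoidal_membership(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on the plateau [b, c],
    linear ramps in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Illustrative step-count heuristic: ~110 steps/min typical for walking.
p_walk_steps = gaussian_membership(110, mu=110, sigma=15)
# Illustrative speed heuristic for walking (m/s): plateau roughly 0.7-2.0 m/s.
p_walk_speed = trapezoidal_membership(1.4, 0.3, 0.7, 2.0, 3.0)
```

The per-activity degrees obtained this way can then be combined (e.g., taking the activity with the highest joint degree) to produce the weak label.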
Figure 3. Comparison of the ground truth (a) with the heuristically generated annotation (b).
The set of features used in the experiment.
| Domain | Signal | Features |
|---|---|---|
| Time | 3D Magnitude of Acceleration | Mean, Variance, Min, Max, Range, Skewness, Kurtosis |
| Time | ⋯ | Absolute value of Mean, Variance, Range |
| Frequency | 3D Magnitude of Acceleration | Number of peaks in PSD, Location of highest peak |
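The time-domain features in the table above can be computed from the acceleration-magnitude signal as in this sketch (the frequency-domain PSD features are omitted here for brevity; skewness and kurtosis are taken as the standardized third and fourth central moments):

```python
import math

def time_domain_features(signal):
    """Time-domain features of the 3D acceleration magnitude, per the feature table."""
    n = len(signal)
    mean = sum(signal) / n
    variance = sum((x - mean) ** 2 for x in signal) / n  # population variance
    mn, mx = min(signal), max(signal)
    std = math.sqrt(variance)
    if std == 0:
        skewness = kurtosis = 0.0
    else:
        # Standardized 3rd and 4th central moments.
        skewness = sum((x - mean) ** 3 for x in signal) / (n * std ** 3)
        kurtosis = sum((x - mean) ** 4 for x in signal) / (n * std ** 4)
    return {"mean": mean, "variance": variance, "min": mn, "max": mx,
            "range": mx - mn, "skewness": skewness, "kurtosis": kurtosis}
```

For a symmetric window such as `[1.0, 2.0, 3.0, 4.0]` this yields a skewness of zero, as expected.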
Figure 4. Screenshot of the labeling app showing the buttons to label activities, with the 'Walking' activity selected as currently ongoing. The error button allows the user to flag whenever an error occurs in the labeling, so that incorrect annotations can be excluded from the final evaluation.
Structure of the dataset. For each data point, the first column contains the manually annotated label, while the second column contains the automatically generated (weak) label.
| Label | Weak Label | Feature 1 | ⋯ | Feature n |
|---|---|---|---|---|
| TRANSPORTATION | SITTING | ⋯ | ⋯ | ⋯ |
| TRANSPORTATION | TRANSPORTATION | ⋯ | ⋯ | ⋯ |
| ⋯ | ⋯ | ⋯ | ⋯ | ⋯ |
| WALKING | WALKING | ⋯ | ⋯ | ⋯ |
Number of samples with manual labeling composing the ground truth.
| Class | Samples | Hours:Minutes:Seconds |
|---|---|---|
| Sitting | 395004 | 56:01:51 |
| Walking | 169908 | 26:09:26 |
| Running | 8250 | 01:20:09 |
| Cycling | 19132 | 02:40:35 |
| Transportation | 75244 | 11:47:21 |
Figure 5. Accuracy of the heuristic function over the 38 days composing the dataset. Label precision was calculated as the number of correct labels divided by the total number of weak labels. The percentage of missing labels was calculated as the number of missing labels divided by the total number of manually annotated labels.
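The two per-day metrics of Figure 5 can be sketched as follows; `None` is assumed here as the sentinel for a missing weak label:

```python
def label_metrics(manual_labels, weak_labels, missing=None):
    """Per Figure 5: precision = correct weak labels / total weak labels;
    missing rate = missing weak labels / total manually annotated labels."""
    assert len(manual_labels) == len(weak_labels)
    present = [(m, w) for m, w in zip(manual_labels, weak_labels) if w is not missing]
    correct = sum(1 for m, w in present if m == w)
    precision = correct / len(present) if present else 0.0
    missing_rate = (len(manual_labels) - len(present)) / len(manual_labels)
    return precision, missing_rate
```

For example, with manual labels `["WALK", "SIT", "RUN", "SIT"]` and weak labels `["WALK", "SIT", None, "WALK"]`, precision is 2/3 and the missing rate is 25%.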
Mean values for precision, recall and f-score obtained using 10-fold cross validation for all classifiers.
| Algorithm | Precision | Recall | F-Score |
|---|---|---|---|
| Nearest Centroid | 0.6816 | 0.6334 | 0.6418 |
| DT | 0.8249 | 0.7878 | 0.7979 |
| Random Forests | 0.8666 | 0.8299 | 0.8394 |
| kNN | 0.8355 | 0.8079 | 0.81405 |
| Multi-class SVM | 0.7630 | 0.7414 | 0.7424 |
| NN 18 × 12 × 6 | 0.8394 | 0.8127 | 0.8188 |
| NN 18 × 36 × 12 × 6 | 0.8585 | 0.8345 | 0.8410 |
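Among the classifiers compared above, Nearest Centroid is simple enough to sketch in full; its sensitivity to label noise is intuitive, since mislabeled samples shift the class centroids directly. This is a generic implementation, not the authors' exact configuration:

```python
import math
from collections import defaultdict

class NearestCentroid:
    """Assigns each sample the class whose training centroid is closest (Euclidean)."""

    def fit(self, X, y):
        sums = defaultdict(list)
        counts = defaultdict(int)
        for features, label in zip(X, y):
            if not sums[label]:
                sums[label] = [0.0] * len(features)
            for i, v in enumerate(features):
                sums[label][i] += v
            counts[label] += 1
        # Centroid = per-class mean of the feature vectors.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, X):
        def dist(a, b):
            return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
        return [min(self.centroids, key=lambda c: dist(f, self.centroids[c]))
                for f in X]

clf = NearestCentroid().fit(
    [[0, 0], [1, 1], [10, 10], [11, 11]],
    ["sitting", "sitting", "walking", "walking"])
```

Flipping a fraction of the training labels before `fit` and re-measuring the f-score reproduces, in miniature, the label-noise degradation reported in the abstract.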
Confusion matrix obtained using the NN that measured the highest f-score (rows: actual class; columns: predicted class).

| | Sitting | Walking | Running | Cycling | Transportation |
|---|---|---|---|---|---|
| Sitting | ⋯ | 0.03365 | 0.00481 | 0.0625 | 0.01442 |
| Walking | 0.03967 | ⋯ | 0.04802 | 0.09395 | 0.02505 |
| Running | 0.00785 | 0.02356 | ⋯ | 0.00524 | 0.02356 |
| Cycling | 0.01734 | 0.07514 | 0.01156 | ⋯ | 0.04046 |
| Transportation | 0.02588 | 0.03512 | 0.01109 | 0.14787 | ⋯ |