| Literature DB >> 25928060 |
Abstract
This paper presents the results of research on the use of smartphone sensors (namely, GPS and accelerometers), geospatial information (points of interest, such as bus stops and train stations) and machine learning (ML) to sense mobility contexts. Our goal is to develop techniques to continuously and automatically detect a smartphone user's mobility activities, including walking, running, driving and using a bus or train, in real time or near-real time (<5 s). We investigated a wide range of supervised learning techniques for classification, including decision trees (DT), support vector machines (SVM), naive Bayes classifiers (NB), Bayesian networks (BN), logistic regression (LR), artificial neural networks (ANN) and several instance-based classifiers (KStar, LWL and IBk). Applying ten-fold cross-validation, the best performers in terms of correct classification rate (i.e., recall) were DT (96.5%), BN (90.9%), LWL (95.5%) and KStar (95.6%). In particular, the DT algorithm RandomForest exhibited the best overall performance. After a feature selection process for a subset of algorithms, performance improved slightly. Furthermore, after tuning the parameters of RandomForest, performance improved to above 97.5%. Lastly, we measured the computational complexity of the classifiers, in terms of central processing unit (CPU) time needed for classification, to provide a rough comparison of the algorithms' battery usage requirements. The classifiers can be ranked from lowest to highest complexity (i.e., computational cost) as follows: SVM, ANN, LR, BN, DT, NB, IBk, LWL and KStar. The instance-based classifiers take considerably more computational time than the non-instance-based classifiers, and the slowest non-instance-based classifier (NB) required about five times as much CPU time as the fastest classifier (SVM).
The above results suggest that DT algorithms are excellent candidates for detecting mobility contexts in smartphones, both in terms of performance and computational complexity.
Year: 2015 PMID: 25928060 PMCID: PMC4481999 DOI: 10.3390/s150509962
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Overview of the supervised learning process, as used in this study.
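The train-and-evaluate loop outlined in Figure 1 can be sketched roughly as below, using scikit-learn analogues of some of the algorithms named in the abstract. The data here are synthetic stand-ins for the paper's feature vectors, and the specific classifier choices and parameters are illustrative, not the study's exact setup.

```python
# Minimal sketch of the evaluation loop: ten-fold cross-validation over
# several classifier families, scored by the correct classification rate.
# Synthetic data; classifier choices are illustrative analogues only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))                      # 8 features, as in the paper
y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0)  # 4 synthetic classes

classifiers = {
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "NaiveBayes": GaussianNB(),
    "Logistic": LogisticRegression(max_iter=1000),
    "IBk(k=5)": KNeighborsClassifier(n_neighbors=5),
}
scores = {name: cross_val_score(clf, X, y, cv=10).mean()
          for name, clf in classifiers.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:>12}: {score:.3f}")
```

With real labeled sensor windows in place of the synthetic arrays, the same loop reproduces a ranking like the performance table below.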
Test runs comprising the dataset. W = walking; R = running; S = static; MS = moving slowly; RT = riding train; RB = riding bus; D = driving.
| Subject | Samples | Duration (min:s) | Classes |
| Subject 1 | 582 | 10:03 | W, S |
| Subject 1 | 292 | 11:32 | W, S |
| Subject 2 | 364 | 6:21 | W, S |
| Subject 2 | 636 | 15:09 | W, S, MS, RT |
| Subject 2 | 269 | 10:48 | W, S |
| Subject 2 | 654 | 11:08 | W, S, RT |
| Subject 2 | 583 | 9:46 | W, S, MS |
| Subject 1 | 1266 | 21:42 | W, S, MS, RB |
| Subject 1 | 278 | 4:39 | W, S, R |
| Subject 2 | 176 | 2:56 | W, S, R |
| Subject 1 | 293 | 4:53 | W, S, R |
| Subject 1 | 765 | 16:38 | W, S, RB |
| Subject 2 | 1017 | 17:11 | W, S |
| Subject 1 | 593 | 10:41 | W, S, D |
| Subject 1 | 730 | 12:34 | W, S, D |
Figure 2. Screenshot of the CommutingContext application.
Distribution of the mobility classes.
| Class | Samples | Share |
| static | 1144 | 13.5% |
| moving slowly | 297 | 3.5% |
| walking | 3443 | 40.5% |
| running | 532 | 6.3% |
| driving a car | 1135 | 13.4% |
| riding a bus | 1482 | 17.4% |
| riding a train | 465 | 5.5% |
Description of the features.
| Feature | Source | Description |
| speed | GPS | Speed from GPS, converted to km/h. |
| speedChange | GPS | Difference in speed from previous GPS measurement. |
| accelVariance | accelerometer | Variance in the norm of the acceleration. |
| headingChange | GPS | Sum of absolute value of heading changes over the moving window. |
| accuracy | GPS | Accuracy rating from GPS, assumed to be meters of horizontal error. |
| trainDistance | GPS/GIS | Distance to the nearest train station in meters. |
| busDistance | GPS/GIS | Distance to the nearest bus stop in meters. |
| 1HzPeak | accelerometer | Strength of the 1-Hz peak in the FFT of the accelVariance signal. |
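Features like those in the table can be computed from raw GPS and accelerometer samples along the following lines. The window length, sampling rate, and function names are illustrative assumptions, not the paper's exact parameters, and the spectral feature here is computed on the acceleration norm as a simplification of the table's 1HzPeak.

```python
# Illustrative feature extraction over a moving window of sensor samples.
import numpy as np

FS = 32          # assumed accelerometer sampling rate (Hz)
WINDOW = 5 * FS  # assumed 5-second moving window

def accel_variance(ax, ay, az):
    """Variance of the acceleration norm over the window."""
    norm = np.sqrt(np.asarray(ax)**2 + np.asarray(ay)**2 + np.asarray(az)**2)
    return float(np.var(norm))

def one_hz_peak(ax, ay, az):
    """Strength of the 1-Hz component in the spectrum of the norm signal."""
    norm = np.sqrt(np.asarray(ax)**2 + np.asarray(ay)**2 + np.asarray(az)**2)
    spectrum = np.abs(np.fft.rfft(norm - norm.mean()))
    freqs = np.fft.rfftfreq(len(norm), d=1.0 / FS)
    return float(spectrum[np.argmin(np.abs(freqs - 1.0))])

def heading_change(headings_deg):
    """Sum of absolute heading changes (degrees) over the window."""
    d = np.diff(np.asarray(headings_deg, dtype=float))
    d = (d + 180) % 360 - 180        # wrap differences to [-180, 180)
    return float(np.abs(d).sum())

def speed_kmh(speed_ms):
    """GPS speed in m/s converted to km/h."""
    return speed_ms * 3.6
```

The heading wrap-around step matters in practice: a change from 350° to 10° should count as 20°, not 340°.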
Comparison of algorithm performance.
| Algorithm | Recall (%) | Std. dev. |
| Decision trees (DT) | | |
| RandomForest | 96.51 | 0.59 |
| NBTree | 95.40 | 0.71 |
| J48graft | 94.93 | 0.77 |
| J48 | 94.83 | 0.81 |
| RandomTree | 94.45 | 0.93 |
| LMT | 93.25 | 1.29 |
| FT | 92.66 | 1.02 |
| BFTree | 93.77 | 0.83 |
| REPTree | 93.58 | 1.06 |
| LADTree | 85.37 | 1.29 |
| SimpleCart | 94.00 | 0.83 |
| Artificial neural networks (ANN) | | |
| MultilayerPerceptron | 87.15 | 1.80 |
| Bayesian classifiers | | |
| NaiveBayes | 81.47 | 1.15 |
| BayesNet | 90.89 | 0.93 |
| Logistic regression (LR) | | |
| Logistic | 83.41 | 1.05 |
| Support vector machines (SVM) | | |
| SMO | 80.24 | 1.02 |
| Instance-based classifiers | | |
| KStar | 95.55 | 0.70 |
| IBk(kNN = 5) | 80.32 | 1.25 |
| LWL | 95.45 | 0.14 |
| Baseline | | |
| ZeroR | 40.52 | 0.06 |
Feature selection analysis. LR, logistic regression; NB, naive Bayes.
| Single features | | | | | | |
| speed | 72.9 | 73.8 | 71.4 | 67.7 | 71.5 | 55.5 |
| accelVariance | 64.2 | 70.7 | 62.2 | 57.4 | 55.6 | 57.3 |
| trainDistance | 61.6 | 69.0 | 56.1 | 51.7 | 53.9 | 50.7 |
| speedChange | 50.7 | 50.7 | 44.8 | 40.5 | 44.0 | 40.5 |
| busDistance | 49.7 | 56.9 | 45.6 | 43.4 | 43.3 | 42.9 |
| accuracy | 47.1 | 47.0 | 46.7 | 43.9 | 43.9 | 44.0 |
| 1HzPeak | 46.3 | 51.6 | 49.8 | 45.0 | 43.6 | 40.5 |
| headingChange | 41.3 | 41.1 | 49.1 | 40.5 | 39.7 | 40.5 |
| One feature removed | | | | | | |
| speedChange | 97.7 | 95.9 | 90.1 | 87.1 | 79.7 | 85.0 |
| headingChange | 97.1 | 95.5 | 87.8 | 85.5 | 81.6 | 81.9 |
| accelVariance | 96.5 | 94.3 | 85.6 | 81.6 | 81.2 | 74.3 |
| accuracy | 96.2 | 94.3 | 85.9 | 83.6 | 79.5 | 78.9 |
| speed | 96.1 | 94.2 | 85.2 | 79.8 | 74.1 | 75.5 |
| busDistance | 96.1 | 94.8 | 86.5 | 82.7 | 82.0 | 78.7 |
| 1HzPeak (all time-domain features) | 96.5 | 94.8 | 87.1 | 83.4 | 81.5 | 80.2 |
| trainDistance | 94.3 | 92.9 | 83.1 | 82.1 | 78.8 | 79.4 |
| A second feature removed | | | | | | |
| speedChange | 96.7 | 94.9 | 86.9 | 83.5 | 81.4 | 80.2 |
| headingChange | 96.5 | 94.8 | 86.9 | 83.4 | 81.4 | 80.4 |
| accelVariance | 96.1 | 93.5 | 83.8 | 78.3 | 78.7 | 73.5 |
| accuracy | 95.6 | 93.6 | 85.1 | 81.8 | 78.0 | 77.6 |
| busDistance | 95.1 | 93.8 | 84.6 | 80.8 | 79.5 | 77.0 |
| speed | 94.6 | 92.1 | 83.0 | 77.6 | 72.9 | 73.1 |
| trainDistance | 93.3 | 92.0 | 82.4 | 81.1 | 76.7 | 77.8 |
| Feature subsets | | | | | | |
| {av, s, t, b, acc, 1hp} | 97.6 | 96.0 | 90.2 | 87.1 | 79.5 | 85.0 |
| {av, s, t, b, acc} | 96.5 | 94.9 | 86.6 | 83.5 | 81.2 | 80.2 |
| {av, s, t, b, 1hp} | 96.3 | 94.6 | 85.5 | 83.5 | 78.6 | 79.1 |
| {av, t, b, acc, 1hp} | 96.1 | 94.5 | 82.5 | 79.9 | 70.1 | 75.7 |
| {av, s, t, b} | 93.8 | 93.8 | 84.5 | 81.8 | 77.5 | 77.6 |
| {av, s, h, acc, sc, hc, 1hp} (no GIS) | 93.8 | 92.4 | 84.8 | 80.6 | 79.4 | 76.3 |
| {av, s, t} | 92.4 | 92.4 | 81.0 | 77.4 | 76.3 | 72.0 |
| {b, t, 1hp} | 89.6 | 88.4 | 67.4 | 62.3 | 55.2 | 57.4 |
| {av, s, b} | 89.4 | 89.4 | 80.8 | 79.0 | 73.4 | 75.0 |
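A subset comparison like the one in the table can be run by training the same classifier on different feature columns and comparing cross-validated recall. The feature-to-column mapping and the synthetic data below are illustrative.

```python
# Sketch of comparing feature subsets with a fixed classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

FEATURES = ["accelVariance", "speed", "trainDistance", "busDistance",
            "accuracy", "1HzPeak", "speedChange", "headingChange"]

rng = np.random.default_rng(1)
X = rng.normal(size=(200, len(FEATURES)))          # synthetic feature matrix
y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0)  # synthetic labels

def subset_score(names):
    """Mean ten-fold CV score using only the named feature columns."""
    idx = [FEATURES.index(n) for n in names]
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    return cross_val_score(clf, X[:, idx], y, cv=10).mean()

print(subset_score(["accelVariance", "speed"]))
print(subset_score(["busDistance", "trainDistance", "1HzPeak"]))
```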
Figure 3. Performance of RandomForest as a function of parameter settings.
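In the spirit of Figure 3, RandomForest's main parameters (number of trees, number of features considered per split) can be tuned with a small grid search. The grid values and data below are illustrative, not the paper's.

```python
# Sketch of RandomForest parameter tuning via cross-validated grid search.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))                      # synthetic features
y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0)  # synthetic labels

grid = {"n_estimators": [10, 50, 100], "max_features": [2, 4, "sqrt"]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```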
Comparison of computational time.
| Algorithm | CPU time | Std. dev. |
| Decision trees (DT) | | |
| RandomForest | 7.96 | 7.84 |
| NBTree | 19.3 | 7.05 |
| J48graft | 3.74 | 6.70 |
| J48 | 1.25 | 4.25 |
| RandomTree | 1.40 | 4.49 |
| LMT | 117 | 6.96 |
| FT | 762 | 54.5 |
| BFTree | 1.72 | 4.90 |
| REPTree | 1.09 | 4.00 |
| LADTree | 1.09 | 4.00 |
| SimpleCart | 1.25 | 4.25 |
| Artificial neural networks (ANN) | | |
| MultilayerPerceptron | 3.28 | 6.39 |
| Bayesian classifiers | | |
| NaiveBayes | 13.26 | 6.79 |
| BayesNet | 6.55 | 8.05 |
| Logistic regression (LR) | | |
| Logistic | 4.21 | 6.96 |
| Support vector machines (SVM) | | |
| SMO | 2.65 | 5.90 |
| Instance-based classifiers | | |
| KStar | 7.21 × 10^4 | 2.74 × 10^4 |
| IBk (kNN = 5) | 751 | 16.9 |
| LWL | 8.60 × 10^5 | 7.89 × 10^4 |
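A per-classifier measurement of classification CPU time can be set up roughly as follows, using `time.process_time()`. Absolute numbers depend entirely on the machine and data size; the classifiers and data below are a synthetic sketch, but the same pattern shows why lazy, instance-based learners pay their cost at prediction time.

```python
# Sketch of measuring CPU time spent on classification (training excluded).
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 8))
y = (X[:, 0] > 0).astype(int)
X_test = rng.normal(size=(500, 8))

def classify_cpu_time(clf):
    """CPU seconds spent predicting X_test; training cost is excluded."""
    clf.fit(X, y)
    t0 = time.process_time()
    clf.predict(X_test)
    return time.process_time() - t0

for clf in (RandomForestClassifier(n_estimators=50, random_state=0),
            KNeighborsClassifier(n_neighbors=5)):
    print(type(clf).__name__, f"{classify_cpu_time(clf):.4f} s")
```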
Confusion matrix for the RandomForest algorithm applied to the dataset.
| | static | moving slowly | walking | running | driving | bus | train |
|---|---|---|---|---|---|---|---|
| static | – | 0.572 | 0.145 | 0.000 | 0.032 | 0.383 | 0.006 |
| moving slowly | 3.024 | – | 1.792 | 0.079 | 0.420 | 0.236 | 0.638 |
| walking | 2.929 | 7.946 | – | 0.000 | 0.875 | 0.000 | 0.404 |
| running | 0.000 | 0.323 | 0.000 | – | 1.742 | 0.022 | 0.043 |
| driving | 0.297 | 0.499 | 0.115 | 0.803 | – | 0.007 | 1.174 |
| bus | 6.184 | 0.959 | 0.000 | 0.019 | 0.019 | – | 0.000 |
| train | 0.009 | 1.445 | 0.000 | 0.326 | 2.326 | 0.000 | – |
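A percentage confusion matrix of this kind can be produced from cross-validated predictions; the data and class labels here are synthetic placeholders for the paper's dataset.

```python
# Sketch of building a percentage confusion matrix from CV predictions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 8))
y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0)  # synthetic classes

pred = cross_val_predict(RandomForestClassifier(random_state=0), X, y, cv=10)
cm = confusion_matrix(y, pred)
pct = 100.0 * cm / cm.sum()          # each cell as a percent of all samples
print(np.round(pct, 3))
```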
Figure 4. Plot of accelVariance vs. speed for all mobility classes.
Figure 5. The same plot for only the “walking” and “running” classes.