Wei Zhao, Anne L Adolph, Maurice R Puyau, Firoz A Vohra, Nancy F Butte, Issa F Zakeri.
Abstract
The goal of this study was to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) for classifying physical activity data acquired from preschool-aged children with an accelerometer. In this study, 69 children aged 3-5 years were asked to participate in a supervised protocol of physical activities while wearing a triaxial accelerometer. Accelerometer counts, steps, and position were obtained from the device. We applied K-means clustering to determine the number of natural groupings present in the data. We used MLR and SVM to classify the six activity types. Using direct observation as the criterion method, the 10-fold cross-validation (CV) error rate was used to compare MLR and SVM classifiers, with and without sleep. Altogether, 58 classification models based on combinations of the accelerometer output variables were developed. In general, the SVM classifiers had a smaller 10-fold CV error rate than their MLR counterparts. Including sleep, an SVM classifier provided the best performance, with a 10-fold CV error rate of 24.70%. Without sleep, an SVM classifier based on triaxial accelerometer counts, vector magnitude, steps, position, and 1- and 2-min lag and lead values achieved a 10-fold CV error rate of 20.16% and an overall classification error rate of 15.56%. SVM outperforms the classical MLR classifier in categorizing physical activities in preschool-aged children. Using accelerometer data, SVM can classify physical activities typical of preschool-aged children with an acceptable classification error rate.
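The 10-fold CV error rate used to compare the classifiers can be illustrated with a minimal sketch. It uses only the Python standard library and, as an assumption made purely for brevity, substitutes a nearest-centroid classifier for the MLR and SVM models of the study. The function names, the fold assignment, and the toy data are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (not MLR or SVM): mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroid(model, row):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], row)))


def cv_error_rate(X, y, k=10, seed=0):
    """Percentage of held-out observations misclassified across the k folds."""
    errors = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict_centroid(model, X[i]) != y[i] for i in held_out)
    return 100.0 * errors / len(X)


# Toy, made-up data: two well-separated activity classes.
X = [[float(i % 5), 0.0] for i in range(50)] + [[100.0 + i % 5, 50.0] for i in range(50)]
y = [2] * 50 + [5] * 50
print(cv_error_rate(X, y))
```

Each observation is held out exactly once, so the error rate is the share of all observations misclassified by a model that never saw them during fitting.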
Keywords: Accelerometers; activity monitoring; classification; multinomial logistic regression classifiers; support vector machines classifiers
Year: 2013 PMID: 24303099 PMCID: PMC3831935 DOI: 10.1002/phy2.6
Source DB: PubMed Journal: Physiol Rep ISSN: 2051-817X
Figure 1. K-means clustering plot for accelerometer counts (act_X, act_Y, act_Z), steps, and position
Physical activity categories
| Activity category | Description | Position | Number of observations | Original categories |
|---|---|---|---|---|
| 1 | Sleep | Lying | 2618 | Sleep |
| 2 | Rest | Reclining | 3035 | Watching TV |
| 3 | Quiet play | Sitting | 1747 | Coloring, video game, puzzle |
| 4 | Low active play | Standing | 1244 | Kitchen/toys |
| 5 | Moderately active play | Standing | 2569 | Ball toss, active video game, dance, aerobics |
| 6 | Very active play | Standing | 237 | Running in place |
The classification error rates of the models*
| MLR model (with sleep period) | 10-fold CV error rate (%) | SVM model (with sleep period) | 10-fold CV error rate (%) | SVM model (without sleep period) | 10-fold CV error rate (%) | Overall classification error rate (%) |
|---|---|---|---|---|---|---|
| PCO-16 | 28.88 | PCA-18 | 24.90 | PCA-18 | 20.16 | 15.56 |
| PCO-20 | 29.97 | PCO-16 | 25.43 | PCO-16 | 20.33 | 15.69 |
| PCO-17 | 30.26 | PCO-18 | 24.70 | PCO-18 | 20.33 | 15.69 |
| PCO-15 | 32.14 | PCA-16 | 25.58 | PCA-16 | 20.46 | 15.79 |
| PCO-18 | 26.80 | PCA-20 | 27.52 | PCA-20 | 22.01 | 16.98 |
| PCO-19 | 32.81 | PCO-20 | 26.97 | PCO-20 | 22.03 | 17.00 |
A detailed explanation of the structure of the models used in this study can be found in the Appendix. The input feature position was treated either as a categorical variable (PCA) or a continuous variable (PCO). The input features of the models are given in the following:
Model Structure 15: act_X + act_Y + act_Z + steps + lag/lead 1-min + position
Model Structure 16: act_X + act_Y + act_Z + steps + lag/lead 1- and 2-min + position
Model Structure 17: act_X + act_Y + act_Z + vm + steps + lag/lead 1-min + position
Model Structure 18: act_X + act_Y + act_Z + vm + steps + lag/lead 1- and 2-min + position
Model Structure 19: vm + steps + lag/lead 1-min + position
Model Structure 20: vm + steps + lag/lead 1- and 2-min + position
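The lag/lead terms in these structures can be illustrated with a short sketch that, for each minute, appends the values of the same variable 1 (and optionally 2) minutes earlier and later. The function name `add_lag_lead` and the toy step counts are hypothetical. How the study handled the first and last minutes of a recording is not stated here, so the use of `None` at the edges is an assumption.

```python
def add_lag_lead(series, offsets=(1, 2)):
    """For each minute t, return [x[t], x[t-1], x[t+1], x[t-2], x[t+2], ...].

    Values that would fall outside the recording are set to None
    (an assumption; the source does not describe edge handling).
    """
    n = len(series)
    rows = []
    for t in range(n):
        row = [series[t]]
        for k in offsets:
            row.append(series[t - k] if t - k >= 0 else None)  # lag of k minutes
            row.append(series[t + k] if t + k < n else None)   # lead of k minutes
        rows.append(row)
    return rows


# Toy, made-up minute-by-minute step counts.
steps = [0, 5, 12, 30, 8]
for row in add_lag_lead(steps):
    print(row)
```

With `offsets=(1,)` the sketch corresponds to the "lag/lead 1-min" structures, and with the default `offsets=(1, 2)` to the "lag/lead 1- and 2-min" structures.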
Classification accuracy
| Activity | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| θ | 91.44 | 65.66 | 74.07 | 68.49 | 93.73 | 98.73 |
| π | 31.55 | 17.30 | 33.03 | 24.92 | 1.91 | 0 |
The confusion matrix
| Predicted class (rows) / Activity category (columns) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | **0** | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | | 300 | 20 | 3 | 0 |
| 3 | 0 | 398 | | 397 | 52 | 0 |
| 4 | 0 | 44 | 288 | | 161 | 0 |
| 5 | 0 | 2 | 14 | 59 | | 31 |
| 6 | 0 | 0 | 0 | 0 | 4 | |
The (1,1)-entry of this matrix is zero, because activity category = 1 is sleep and we only applied the classifier to the data without sleep. There are actually 2618 observations in the sleep period, and those observations are considered to be correctly classified.
The bold values are the number of correctly-classified observations.
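The relation between a confusion matrix and an overall classification error rate, including the convention that sleep observations excluded from the classifier are counted as correct, can be sketched as follows. The function name `overall_error_rate`, its `extra_correct` parameter, and the 3-class toy matrix are hypothetical. The numbers are invented for illustration and are not the study's data.

```python
def overall_error_rate(confusion, extra_correct=0):
    """Overall classification error rate in percent.

    confusion[i][j] is the number of observations with predicted class i
    and observed class j. extra_correct counts observations that were not
    passed to the classifier but are treated as correctly classified.
    """
    total = sum(sum(row) for row in confusion) + extra_correct
    correct = sum(confusion[i][i] for i in range(len(confusion))) + extra_correct
    return 100.0 * (total - correct) / total


# Toy, made-up 3-class confusion matrix (rows: predicted, columns: observed).
toy = [
    [80, 15, 0],
    [10, 70, 5],
    [0, 5, 15],
]
print(overall_error_rate(toy))                    # classifier-only error rate
print(overall_error_rate(toy, extra_correct=50))  # with 50 extra observations counted as correct
```

Counting the excluded observations as correct enlarges the denominator without adding errors, which is why an overall error rate computed this way is lower than the error rate on the classified data alone.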
The model structures
| Model structure | Description |
|---|---|
| Model Structure 1 | activity.category ∼ act_X + act_Y + act_Z |
| Model Structure 2 | activity.category ∼ act_X + act_Y + act_Z + vm |
| Model Structure 3 | activity.category ∼ act_X + act_Y + act_Z + steps |
| Model Structure 4 | activity.category ∼ act_X + act_Y + act_Z + position |
| Model Structure 5 | activity.category ∼ act_X + act_Y + act_Z + vm + steps |
| Model Structure 6 | activity.category ∼ act_X + act_Y + act_Z + vm + position |
| Model Structure 7 | activity.category ∼ act_X + act_Y + act_Z + steps + position |
| Model Structure 8 | activity.category ∼ act_X + act_Y + act_Z + vm + steps + position |
| Model Structure 9 | activity.category ∼ vm + steps |
| Model Structure 10 | activity.category ∼ vm |
| Model Structure 11 | activity.category ∼ steps |
| Model Structure 12 | activity.category ∼ vm + steps + position |
| Model Structure 13 | activity.category ∼ vm + position |
| Model Structure 14 | activity.category ∼ steps + position |
| Model Structure 15 | activity.category ∼ act_X + act_Y + act_Z + steps + position + lag/lead 1-[act_X + act_Y + act_Z + steps] |
| Model Structure 16 | activity.category ∼ act_X + act_Y + act_Z + steps + position + lag/lead 1/2-[act_X + act_Y + act_Z + steps] |
| Model Structure 17 | activity.category ∼ act_X + act_Y + act_Z + vm + steps + position + lag/lead 1-[act_X + act_Y + act_Z + vm + steps] |
| Model Structure 18 | activity.category ∼ act_X + act_Y + act_Z + vm + steps + position + lag/lead 1/2-[act_X + act_Y + act_Z + vm + steps] |
| Model Structure 19 | activity.category ∼ vm + steps + position + lag/lead 1-[vm + steps] |
| Model Structure 20 | activity.category ∼ vm + steps + position + lag/lead 1/2-[vm + steps] |
The model structures were developed in a step-wise manner. First, we included the triaxial accelerometer outputs (act_X, act_Y, act_Z) from the device in the model structure (model structure 1). Then, we gradually included other features (vm, steps, and position) in the subsequent model structures (model structures 2–8). Since vm is a summary of the triaxial accelerometer outputs, we built a model structure based only on vm (model structure 10). Steps are another important feature, and thus we built a model structure based only on steps (model structure 11). Next, we combined vm and/or steps with each other and with position to develop model structures 9 and 12–14. Finally, we added the lag and lead values of the input features to the best-performing models (model structures 15–20).
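How a model structure selects its input features can be sketched as below. Two points are assumptions rather than statements from the source: vm is computed as the Euclidean norm of the three axis counts, which is the usual definition of vector magnitude but is not spelled out here, and position is shown as a numeric code, as in the continuous (PCO) variants. The dictionary `MODEL_STRUCTURES` covers only a few structures from the table, and the epoch values are made up.

```python
import math


def vector_magnitude(act_x, act_y, act_z):
    """Assumed definition of vm: Euclidean norm of the triaxial counts."""
    return math.sqrt(act_x ** 2 + act_y ** 2 + act_z ** 2)


# A few of the tabulated model structures, written as ordered feature names.
MODEL_STRUCTURES = {
    1: ("act_X", "act_Y", "act_Z"),
    5: ("act_X", "act_Y", "act_Z", "vm", "steps"),
    8: ("act_X", "act_Y", "act_Z", "vm", "steps", "position"),
    9: ("vm", "steps"),
    12: ("vm", "steps", "position"),
}


def build_features(epoch, structure):
    """Return the feature vector of one epoch for the given model structure."""
    record = dict(epoch)
    record["vm"] = vector_magnitude(epoch["act_X"], epoch["act_Y"], epoch["act_Z"])
    return [record[name] for name in MODEL_STRUCTURES[structure]]


# Toy, made-up one-minute epoch; the position code 2 is hypothetical.
epoch = {"act_X": 3.0, "act_Y": 4.0, "act_Z": 12.0, "steps": 7, "position": 2}
print(build_features(epoch, 12))
```

Keeping the structures as plain tuples of feature names makes it straightforward to fit the same classifier on each combination and compare the resulting error rates.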
The performance of the model
| Rank | Model | 10-fold CV error rate (%) | Rank | Model | 10-fold CV error rate (%) |
|---|---|---|---|---|---|
| 1 | | | 30 | S-PCA-12 | 34.57 |
| 2 | | | 31 | S-PCO-6 | 35.91 |
| 2 | | | 32 | S-PCA-6 | 36.06 |
| 4 | | | 33 | L-PCO-8 | 36.31 |
| 5 | | | 34 | S-PCO-4 | 36.56 |
| 6 | | | 35 | S-PCA-13 | 36.56 |
| 7 | S-PCO-18 | 24.70 | 36 | S-PCA-4 | 36.60 |
| 8 | S-PCA-18 | 24.90 | 37 | S-PCO-13 | 36.78 |
| 9 | S-PCO-16 | 25.43 | 38 | L-PCO-7 | 36.98 |
| 10 | S-PCA-16 | 25.58 | 39 | S-PCA-14 | 37.14 |
| 11 | L-PCO-18 | 26.80 | 40 | S-PCO-14 | 37.21 |
| 12 | S-PCO-20 | 26.97 | 41 | L-PCO-12 | 37.43 |
| 13 | S-PCA-20 | 27.52 | 42 | S-5 | 39.11 |
| 14 | S-PCO-17 | 28.09 | 43 | S-3 | 39.49 |
| 15 | S-PCA-17 | 28.73 | 44 | L-PCO-6 | 39.50 |
| 16 | L-PCO-16 | 28.88 | 45 | L-5 | 39.89 |
| 17 | S-PCO-15 | 28.95 | 46 | L-PCO-4 | 39.90 |
| 18 | S-PCA-15 | 29.30 | 47 | S-9 | 39.99 |
| 19 | S-PCA-19 | 29.42 | 48 | L-PCO-13 | 41.12 |
| 20 | S-PCO-19 | 29.42 | 49 | S-2 | 41.32 |
| 21 | L-PCO-20 | 29.97 | 50 | L-3 | 41.78 |
| 22 | L-PCO-17 | 30.26 | 51 | S-1 | 41.73 |
| 23 | L-PCO-15 | 32.14 | 52 | L-9 | 42.01 |
| 24 | L-PCO-19 | 32.81 | 53 | L-PCO-14 | 42.01 |
| 25 | S-PCA-8 | 33.73 | 54 | L-2 | 43.00 |
| 26 | S-PCO-8 | 33.84 | 55 | L-1 | 45.00 |
| 27 | S-PCO-7 | 34.15 | 56 | L-10 | 45.71 |
| 28 | S-PCA-7 | 34.18 | 57 | S-11 | 46.2 |
| 29 | S-PCO-12 | 34.55 | 58 | L-11 | 46.47 |