Petra Jones, Evgeny M Mirkes, Tom Yates, Charlotte L Edwardson, Mike Catt, Melanie J Davies, Kamlesh Khunti, Alex V Rowlands.
Abstract
Few methods for classifying physical activity from accelerometer data have been tested using an independent dataset for cross-validation, and even fewer using multiple independent datasets. The aim of this study was to evaluate whether unsupervised machine learning was a viable approach for the development of a reusable clustering model that was generalisable to independent datasets. We used two labelled adult laboratory datasets to generate a k-means clustering model. To assess its generalised application, we applied the stored clustering model to three independent labelled datasets: two laboratory and one free-living. Based on the development labelled data, the ten clusters were collapsed into four activity categories: sedentary, standing/mixed/slow ambulatory, brisk ambulatory, and running. The percentages of each activity type contained in these categories were 89%, 83%, 78%, and 96%, respectively. In the laboratory independent datasets, the consistency of activity types within the clusters dropped, but remained above 70% for the sedentary clusters, and 85% for the running and ambulatory clusters. Acceleration features were similar within each cluster across samples. The clusters created reflected activity types known to be associated with health and were reasonably robust when applied to diverse independent datasets. This suggests that an unsupervised approach is potentially useful for analysing free-living accelerometer data.
Keywords: accelerometer; clustering; machine learning; physical activity; unsupervised; walking; wrist-worn
Year: 2019 PMID: 31627310 PMCID: PMC6832944 DOI: 10.3390/s19204504
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
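The abstract describes storing a k-means model built on the development data and then applying it unchanged to independent datasets. A minimal sketch of that application step is given below, assuming z-score scaling with development-set parameters and nearest-centroid assignment by Euclidean distance. The function names, scaling choice, and all numbers are hypothetical illustrations, not the study's actual pipeline.

```python
import math

def standardise(row, means, sds):
    """Scale one feature vector with the development-set means and SDs."""
    return [(x - m) / s for x, m, s in zip(row, means, sds)]

def assign_cluster(row, centroids):
    """Return the index of the nearest stored centroid (Euclidean distance)."""
    distances = [math.dist(row, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical stored model: scaling parameters and two centroids in a
# two-feature space (illustrative numbers, not taken from the study).
dev_means = [0.5, 10.0]
dev_sds = [0.25, 5.0]
centroids = [[-1.0, -1.0], [1.0, 1.0]]

# Hypothetical epochs from an independent dataset; they are scaled with the
# development parameters, never re-fitted on the new data.
new_epochs = [[0.2, 4.0], [0.9, 17.0]]
labels = [assign_cluster(standardise(e, dev_means, dev_sds), centroids)
          for e in new_epochs]
```

Keeping the centroids and scaling parameters fixed is what makes the model reusable: the independent data never influence where the cluster boundaries lie.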
Characteristics of development and independent datasets.
| Dataset | Sample | Participants | Sampling Rate (Hz) | Monitor | Age (y) | Height (cm) | Mass (kg) | Handedness | Monitor Location |
|---|---|---|---|---|---|---|---|---|---|
| Dev 1 | Lab: Adult | 60 (62%) | 80 | GENEA | 40–65 | 176.2 | 80.6 (11.6) | 55R, 5L | Both wrists |
| Dev 2 | Lab: Adult | 30 (73%) | 100 | GENEActiv | 20–40 | 169.4 (0.1) | 69.2 (15.3) | 27R, 3L | Non-dom. |
| Ind 1 | Lab: Child | 41 (59%) | 80 | GENEA | 9–14 | 150.2 | 43.0 (11.2) | 37R, 2L, 2A | Both wrists |
| Ind 2 | Lab: Adult | 23 (70%) | 100 | GENEActiv | 19–42 | 172.7 | 73.7 (13.0) | 18R, 5L | Non-dom. |
| Ind 3 | Free-Living: Adult (3 h) | 6 (33%) | 100 | GENEActiv | 20–29 | 171.5 | 73.0 (17.1) | 6R | Non-dom. |
| Ind 3 | Free-Living: Adult (24 h) | 8 (62.5%) | 100 | GENEActiv | 20–29 | 166.8 | 65.4 (11.3) | 8R | Non-dom. |
Handedness L = left, R = right, A = ambidextrous. Non-dom. = non-dominant wrist. Lab = laboratory dataset. Dev = development dataset, Ind = independent dataset.
Figure 1. Orientation of the GENEA/GENEActiv axes when worn on the non-dominant wrist with the hand (a) level and (b) hanging vertically.
Loading (Pearson’s correlation) on acceleration features included in the model (development dataset).
| Feature Type | Acceleration Feature | Loading | Feature Type | Acceleration Feature | Loading |
|---|---|---|---|---|---|
| Frequency | Dominant Frequency | −0.271 | Angle | X—Minimum | −0.389 |
| Magnitude | X—Minimum | −0.285 | | X—Median | −0.176 |
| | X—Maximum | 0.233 | | X—Mean | −0.175 |
| | X—Standard Deviation | 0.195 | | X—Maximum | 0.220 |
| | Y—Minimum | −0.286 | | X—Standard Deviation | 0.394 |
| | Y—Standard Deviation | 0.182 | | Y—Minimum | −0.241 |
| | Z—Minimum | −0.194 | | Y—Maximum | 0.169 |
| | Z—75th Percentile | 0.209 | | Y—Standard Deviation | 0.329 |
| | Z—Maximum | 0.358 | | Z—Minimum | −0.183 |
| | Z—Standard Deviation | 0.440 | | Z—Median | 0.124 |
| | Z—Variance | 0.262 | | Z—Mean | 0.125 |
| | | | | Z—Maximum | 0.340 |
| | | | | Z—Standard Deviation | 0.526 |
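
The angle features in this table are summary statistics of each axis's orientation relative to the horizontal plane. A rough sketch of how such features could be derived from one window of raw tri-axial samples follows. The arctangent formula is a common convention for wrist accelerometers and is assumed here rather than taken from the source; the window contents and function names are hypothetical.

```python
import math
import statistics

def axis_angle_deg(a, b, c):
    """Angle of axis `a` relative to the horizontal plane, in degrees.

    Assumed convention: atan(a / sqrt(b^2 + c^2)).
    """
    return math.degrees(math.atan2(a, math.hypot(b, c)))

def summarise(values):
    """Summary statistics of the kind listed in the loading table."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }

# Hypothetical window of (x, y, z) accelerations in g.
window = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

x_angles = [axis_angle_deg(x, y, z) for x, y, z in window]
x_angle_stats = summarise(x_angles)
x_magnitude_stats = summarise([x for x, _, _ in window])
```

The same two helpers can be reused for the Y and Z axes by rotating the argument order passed to `axis_angle_deg`.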
Purity matrix (percentage of each class found within each cluster (A–J)) for the development dataset (two combined adult datasets). (Key statistics highlighted in bold)
| Ambulatory | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Sedentary | Mixed | Slow | Brisk | Running | ||||||
| A | B | C | D | E | F | G | H | I | J | Class |
| 9.1 | 3.9 | 6.9 | 4.4 | 12.2 | 15.7 | 11.6 | 7.2 | 24.2 | 4.8 | % of total time |
| | | | | | 1.4 | 2.0 | 0.2 | 0.1 | 0.0 | Lying |
| | | | | | 12.2 | 2.6 | 2.4 | 0.7 | 0.1 | Seated |
| 0.4 | 0.0 | 0.6 | 0.3 | 0.0 | | | 2.4 | 1.1 | 0.0 | Standing |
| 8.1 | 0.1 | 0.8 | 0.0 | 1.4 | | | | 15.6 | 2.8 | Household |
| 2.6 | 0.1 | 0.3 | 0.0 | 0.6 | 0.0 | | | 4.2 | 6.3 | Indoor walking |
| 0.0 | 0.0 | 0.6 | 0.0 | 0.1 | 16.0 | 1.8 | 3.5 | | 0.3 | Treadmill walking |
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.4 | 2.5 | 9.5 | | 2.2 | Brisk outdoor walk |
| 1.4 | 0.0 | 0.1 | 0.0 | 0.1 | 17.1 | 1.8 | 4.0 | | 0.2 | Stairs |
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.8 | 0.0 | 1.2 | 1.2 | | Running |
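
A purity matrix of this kind gives, for each labelled activity class, the percentage of its epochs that fall into each cluster, so each class row sums to roughly 100%. The sketch below shows one way to compute it from paired class and cluster labels; the function name and the example labels are hypothetical.

```python
from collections import Counter, defaultdict

def purity_matrix(classes, clusters):
    """Percentage of each class's epochs that fall in each cluster."""
    counts = defaultdict(Counter)
    for cls, clu in zip(classes, clusters):
        counts[cls][clu] += 1
    cluster_ids = sorted(set(clusters))
    matrix = {}
    for cls, row in counts.items():
        total = sum(row.values())
        # Counter returns 0 for clusters in which this class never appears.
        matrix[cls] = {c: 100.0 * row[c] / total for c in cluster_ids}
    return matrix

# Hypothetical labelled epochs (activity class, assigned cluster).
classes = ["Seated", "Seated", "Seated", "Seated", "Running", "Running"]
clusters = ["A", "A", "A", "F", "J", "J"]
pm = purity_matrix(classes, clusters)
```

In this toy example 75% of the seated epochs land in cluster A and 25% in cluster F, while all running epochs land in cluster J.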
Purity matrices for independent datasets 1 to 3 (percentage of each class found within each cluster (A–J)).
| Ambulatory | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Sedentary | Mixed | Slow | Brisk | Running | ||||||
| Independent Sample 1: Child Laboratory | ||||||||||
| A | B | C | D | E | F | G | H | I | J | Class |
| 6.5 | 13.1 | 19.9 | 0.2 | 5.5 | 12.3 | 5.7 | 2.5 | 14.7 | 19.7 | % of total time |
| 6.7 | 24.0 | 29.3 | 0.5 | 11.4 | 15.6 | 5.0 | 3.5 | 3.1 | 0.9 | Lying |
| 31.1 | 15.1 | 38.9 | 0.0 | 0.5 | 8.7 | 0.3 | 3.7 | 1.5 | 0.3 | Seated |
| 0.0 | 0.0 | 4.6 | 0.0 | 0.0 | 18.2 | 0.2 | 12.6 | 57.7 | 6.7 | Treadmill walking |
| 0.0 | 0.0 | 3.9 | 0.0 | 0.0 | 0.3 | 0.1 | 4.5 | 3.9 | 87.3 | Running |
| Independent Sample 2: Adult Laboratory | ||||||||||
| 20.1 | 7.6 | 7.5 | 3.4 | 17.0 | 3.8 | 24.3 | 10.1 | 4.7 | 1.7 | % of total time |
| 25.5 | 10.2 | 8.7 | 4.6 | 22.5 | 4.8 | 1.9 | 16.0 | 5.2 | 0.5 | Seated |
| 14.3 | 0.0 | 13.0 | 0.2 | 2.4 | 2.2 | 53.4 | 12.3 | 1.3 | 0.9 | Standing |
| 4.2 | 0.2 | 2.1 | 0.1 | 1.2 | 0.4 | 16.9 | 66.4 | 2.1 | 6.4 | Household |
| 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 0.2 | 62.1 | 25.7 | 7.8 | 4.0 | Indoor walking |
| Independent Sample 3a: Free-Living | ||||||||||
| 20.2 | 2.2 | 6.2 | 2.4 | 15.9 | 8.6 | 27.2 | 6.8 | 7.4 | 3.1 | % of total time |
| 19.7 | 3.6 | 7.5 | 3.9 | 23.7 | 12.4 | 14.8 | 2.1 | 11.4 | 0.8 | Sedentary |
| 26.2 | 0.2 | 5.5 | 0.3 | 7.7 | 4.0 | 38.1 | 10.4 | 1.9 | 5.8 | Standing |
| 14.4 | 0.1 | 3.3 | 0.1 | 1.6 | 2.3 | 52.6 | 17.3 | 1.5 | 6.8 | Stepping |
Independent dataset 3 (free-living). Percentage of total daily time found in each cluster (A–J).
| Ambulatory | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sedentary | Mixed | Slow | Brisk | Running | |||||||
| Independent Sample 3b: Free-Living ( | |||||||||||
| | A | B | C | D | E | F | G | H | I | J | Class |
| Mean | 19.2 | 13.1 | 13.9 | 6.1 | 9.7 | 9.2 | 16.8 | 3.5 | 5.8 | 2.7 | % of total time |
| SD | 11.5 | 7.7 | 9.7 | 2.5 | 4.7 | 5.6 | 4.0 | 1.6 | 1.8 | 1.4 | |
Comparison of total daily minutes spent in activity type categories with the activPAL data.
| | Clusters | ActivPAL Sedentary | Clusters | ActivPAL Stand/Step | Clusters | ActivPAL |
|---|---|---|---|---|---|---|
| Mean (min) | 1024.90 * | 1117.54 | 293.19 | 280.98 | 121.17 * | 41.49 |
| SD | 67.38 | 64.86 | 63.59 | 65.22 | 38.23 | 32.19 |
| Bias | −92.6 | | 12.2 | | 79.7 | |
| 95% LoA | 98.1 | | 132.1 | | 80.2 | |
* Significantly different from activPAL (p < 0.05).
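
The bias values in this table equal the difference between the cluster-based and activPAL means (for example, 1024.90 − 1117.54 ≈ −92.6 min). A minimal sketch of a Bland-Altman style calculation is shown below. It assumes the conventional definition in which the 95% limits of agreement are the bias ± 1.96 × the SD of the paired differences, and that the single LoA figure reported per comparison is that half-width. The per-participant minutes are hypothetical.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return (bias, half-width of the 95% limits of agreement).

    Assumes paired measurements and limits of bias +/- 1.96 * SD(differences).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, half_width

# Hypothetical daily sedentary minutes for four participants.
cluster_min = [1000.0, 1050.0, 980.0, 1070.0]
activpal_min = [1100.0, 1120.0, 1090.0, 1150.0]

bias, half_width = bland_altman(cluster_min, activpal_min)
lower, upper = bias - half_width, bias + half_width
```

A negative bias here means the cluster-based estimate is lower than the activPAL estimate on average, which matches the sign convention of the sedentary column.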
Figure 2. Characteristics of the acceleration features with the highest loadings by cluster and by sample, time domain: (a) maximum acceleration in the Z-axis and (b) standard deviation of acceleration in the Z-axis.
Figure 3. Characteristics of the acceleration features with the highest loadings by cluster and by sample, accelerometer orientation features: (a) minimum angle of the X-axis acceleration relative to the horizontal plane and (b) maximum angle of the Z-axis acceleration relative to the horizontal plane.
Figure 4. Characteristics of the features with the highest loadings by cluster and by sample, standard deviation of the accelerometer orientation metrics: (a) X-axis acceleration relative to the horizontal plane, (b) Y-axis acceleration relative to the horizontal plane, and (c) Z-axis acceleration relative to the horizontal plane.