Robert Chew, Jonathan Thornburg, Darby Jack, Cara Smith, Qiang Yang, Steven Chillrud.
Abstract
Exposure assessment studies are the primary means for understanding links between exposure to chemical and physical agents and adverse health effects. Recently, researchers have proposed using wearable monitors during exposure assessment studies to obtain higher-fidelity readings of the exposures that subjects actually experience. However, limited research has been conducted to link a wearer's actions to periods of exposure, a necessary step for estimating inhaled dosage. To aid researchers in these settings, we developed a machine learning model for identifying periods of bicycling activity using passively collected data from the RTI MicroPEM wearable exposure monitor, a lightweight device capable of continuously sampling both air pollution levels and accelerometry parameters. Our best-performing model identifies biking activity with a mean leave-one-session-out (LOSO) cross-validation F1 score of 0.832 (unweighted) and 0.979 (weighted). Accelerometer-derived features contributed greatly to model performance, as did temporal smoothing of the predicted activities. Additionally, we found that competitive activity recognition is possible even at relatively low sampling rates, suggesting suitability for exposure assessment studies where continuous data collection for long periods (without recharge) is needed to capture realistic daily routines and exposures.
Keywords: air pollution; exposure assessment; human activity recognition; machine learning; wearable sensors
Year: 2019 PMID: 31652820 PMCID: PMC6864797 DOI: 10.3390/s19214613
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Study summary statistics across all 21 sessions.
| Statistics Per Session | Mean | Median | Min | Max |
|---|---|---|---|---|
| Records | 992 | 989 | 916 | 1087 |
| Percent Biking | 7.4% | 6.7% | 3.1% | 11.8% |
| X-axis acceleration (m/s²) | 0.84 | 0.95 | -0.95 | 2.26 |
| Y-axis acceleration (m/s²) | 0.06 | 0.02 | -1.81 | 1.33 |
| Z-axis acceleration (m/s²) | 0.02 | 0.03 | -1.37 | 1.82 |
| Temperature (°C) | 24.8 | 24.4 | 9.0 | 33.5 |
| Relative Humidity (%) | 50.0 | 50.2 | 17.9 | 94.5 |
| PM2.5 (µg/m³) | 22.4 | 3.0 | 0.0 | 1640.0 |
Final hyperparameter specifications for the models used in this study.
| Model | Specification |
|---|---|
| Mode Smoothed | Window length: 11 |
| Stacked Ensemble | Classifier type: Gradient boosted trees classifier |
| Gradient Boosted Trees | Number of boosting stages: 80 |
| K-Nearest Neighbors | K: 5 |
| Logistic Regression | Regularization: L1-norm |
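
The "Mode Smoothed" entry above can be read as replacing each predicted label with the most common label in a window of 11 neighbouring predictions. The following is a minimal sketch of that idea, not the study's implementation: the function name `mode_smooth`, the centred window, the truncation at the sequence edges, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def mode_smooth(labels, window=11):
    """Replace each label with the mode of a centred window around it.

    Assumptions (not stated in the source): the window is centred,
    it is truncated at the start and end of the sequence, and ties
    keep the original label when it is among the most common ones.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i, current in enumerate(labels):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        top = max(counts.values())
        if counts[current] == top:
            # The original label is already a mode of the window.
            smoothed.append(current)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


# An isolated "biking" prediction (1) inside a run of "not biking" (0)
# is removed by the smoother.
noisy = [0] * 7 + [1] + [0] * 7
print(mode_smooth(noisy, window=11))
```

A smoother of this kind suppresses short, isolated flips in the predicted activity, which matches the abstract's observation that temporal smoothing improved performance.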
Model evaluation metrics for each model. The numbers reported are the mean and standard deviation across the 21 LOSO CV hold-out folds [mean (std)] for both weighted and unweighted metric variants. The best results per metric are presented in bold.
| Model | F1 (raw) | F1 (weighted) | Precision (raw) | Precision (weighted) | Recall (raw) | Recall (weighted) |
|---|---|---|---|---|---|---|
| Mode Smoothed | **0.832** | **0.979** | | | 0.801 (0.166) | |
| Stacked Ensemble | 0.767 (0.115) | 0.970 (0.012) | 0.791 (0.115) | 0.971 (0.011) | 0.767 (0.115) | 0.970 (0.012) |
| Gradient Boosted Trees | 0.746 (0.151) | 0.969 (0.011) | 0.810 (0.103) | 0.970 (0.010) | 0.714 (0.193) | 0.970 (0.010) |
| K-Nearest Neighbors | 0.744 (0.115) | 0.965 (0.017) | 0.780 (0.137) | 0.967 (0.015) | 0.732 (0.138) | 0.965 (0.018) |
| Logistic Regression | 0.656 (0.141) | 0.936 (0.038) | 0.542 (0.182) | 0.961 (0.015) | | 0.923 (0.051) |
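
The record does not define the "raw" and "weighted" metric variants. The sketch below assumes that "raw" is the score for the biking class alone and that "weighted" is the average of the per-class scores weighted by class support; both readings, and the helper names `f1_for_label`, `weighted_f1`, and `summarize`, are assumptions for illustration. Under that reading, the large gap between the two variants follows from biking making up only a small share of each session.

```python
from statistics import mean, stdev


def f1_for_label(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by class support."""
    n = len(y_true)
    return sum(
        f1_for_label(y_true, y_pred, label) * y_true.count(label) / n
        for label in set(y_true)
    )


def summarize(fold_scores):
    """Mean and sample standard deviation across hold-out folds.

    Assumption: the source does not say whether the reported std is
    the sample or the population standard deviation.
    """
    return mean(fold_scores), stdev(fold_scores)


# One hypothetical hold-out session: 1 = biking, 0 = not biking.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(f1_for_label(y_true, y_pred, 1))  # score for the biking class only
print(weighted_f1(y_true, y_pred))      # dominated by the majority class
```

In a LOSO setting, scores like these would be computed once per held-out session and then passed to `summarize` to produce a mean (std) entry.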
Figure 1. Model predictions and “ground truth” activities over time for the best-performing LOSO fold (a) and the worst-performing LOSO fold (b). Green areas indicate a correct prediction and red areas indicate a misclassification. Each time unit on the X-axis represents a sliding window of six 30-second readings, with 50% overlap, that was used in the analysis.
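
The windowing described in the caption (six 30-second readings per window, 50% overlap) corresponds to a window that spans 180 seconds and advances by three readings at a time. A minimal sketch follows; the function name `sliding_windows` and the choice to drop a trailing partial window are assumptions, since the record does not say how incomplete windows were handled.

```python
def sliding_windows(readings, size=6, overlap=0.5):
    """Split a sequence of readings into fixed-size overlapping windows.

    With size=6 and overlap=0.5 the window advances by 3 readings.
    Assumption: a trailing window with fewer than `size` readings is
    dropped rather than padded.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(size * (1 - overlap)))
    last_start = len(readings) - size
    return [readings[i:i + size] for i in range(0, last_start + 1, step)]


# Twelve consecutive 30-second readings produce three windows.
for window in sliding_windows(list(range(12))):
    print(window)
```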
Figure 2. Normalized feature importance for the gradient boosted tree classifier. The naming convention for the abbreviations on the figure’s X-axis consists of (1) the measurement type recorded by the MicroPEM and (2) the summary statistic computed over the observations contained within each sliding window. For example, “temp_kurt” is shorthand for the kurtosis of the temperature readings contained within the sliding window. The six MicroPEM measurements used in this study are: X-axis acceleration, Y-axis acceleration, Z-axis acceleration, temperature, relative humidity, and RH-corrected nephelometer readings. The nine summary statistics calculated are: mean, median, max, min, standard deviation, skew, kurtosis, 20th percentile, and 80th percentile.
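
Six measurements combined with nine summary statistics give 54 features per sliding window. The sketch below shows one way such features could be computed and named. Only “temp_kurt” is taken from the caption; every other abbreviation (for example `p20`, `std`) is hypothetical, and the estimator choices (population standard deviation, moment-based skew, excess kurtosis, linearly interpolated percentiles) are assumptions rather than the study's definitions.

```python
import math
from statistics import mean, median, pstdev


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def central_moment(values, k):
    mu = mean(values)
    return sum((v - mu) ** k for v in values) / len(values)


def skewness(values):
    m2 = central_moment(values, 2)
    # Assumption: a constant window is given a skew of 0.
    return central_moment(values, 3) / m2 ** 1.5 if m2 > 0 else 0.0


def excess_kurtosis(values):
    m2 = central_moment(values, 2)
    # Assumption: a constant window is given a kurtosis of 0.
    return central_moment(values, 4) / m2 ** 2 - 3 if m2 > 0 else 0.0


# Statistic suffixes are hypothetical except "kurt" (see "temp_kurt").
STATS = {
    "mean": mean,
    "median": median,
    "max": max,
    "min": min,
    "std": pstdev,
    "skew": skewness,
    "kurt": excess_kurtosis,
    "p20": lambda v: percentile(v, 0.20),
    "p80": lambda v: percentile(v, 0.80),
}


def window_features(window):
    """Map {measurement: readings} to {"<measurement>_<stat>": value}."""
    return {
        f"{name}_{stat}": fn(values)
        for name, values in window.items()
        for stat, fn in STATS.items()
    }


# One hypothetical window of six temperature readings.
features = window_features({"temp": [1, 2, 3, 4, 5, 6]})
print(features["temp_kurt"])
```

Passing all six measurements for a window in the same dictionary would yield the full 54-feature vector.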