Farida Sabry1, Tamer Eltaras1, Wadha Labda1, Fatima Hamza1, Khawla Alzoubi2, Qutaibah Malluhi1.
Abstract
With ongoing advances in sensor technology and the miniaturization of electronic chips, more applications are being researched and developed for wearable devices. Hydration monitoring is among the problems that have recently been studied. Athletes, battlefield soldiers, workers in extreme weather conditions, people with adipsia who have no sensation of thirst, and elderly people who have lost the ability to talk are among the main target users for this application. In this paper, we address the use of machine learning for hydration monitoring using data from wearable sensors: accelerometer, magnetometer, gyroscope, galvanic skin response sensor, photoplethysmography sensor, temperature, and barometric pressure sensor. These data, together with new features constructed to reflect the activity level, were integrated with personal features to predict the last drinking time of a person and alert the user when it exceeds a certain threshold. The results of applying different models are compared to select a model suited to on-device deployment. The extra trees model achieved the lowest error on unseen data; random forest came next with less training time, followed by the deep neural network with a small model size, which is preferred for wearable devices with limited memory. Embedded on-device testing is still needed to confirm the results and to test power consumption.
Keywords: dehydration detection; electro-dermal activity; hydration monitoring; machine learning; on-device; photoplethysmography; skin response; wearable devices
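The alerting step described in the abstract can be illustrated with a minimal sketch. It assumes the model outputs a predicted number of hours since the last drink; the threshold value and the function name `should_alert` are hypothetical, since the record does not state which threshold was used.

```python
# Hypothetical threshold in hours; the source does not give the value used.
ALERT_THRESHOLD_HOURS = 4.0


def should_alert(predicted_hours_since_drink: float,
                 threshold_hours: float = ALERT_THRESHOLD_HOURS) -> bool:
    """Return True when the predicted time since the last drink exceeds the threshold."""
    if predicted_hours_since_drink < 0:
        raise ValueError("predicted time since last drink cannot be negative")
    return predicted_hours_since_drink > threshold_hours


if __name__ == "__main__":
    # Example predictions (made-up values, not taken from the dataset).
    for hours in (1.5, 4.0, 6.2):
        print(f"{hours:.1f} h since last drink -> alert: {should_alert(hours)}")
```

Keeping the threshold as a parameter lets it be tuned per user, for example a lower value for someone working in extreme heat.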
Year: 2022 PMID: 35271034 PMCID: PMC8914724 DOI: 10.3390/s22051887
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Pipeline for on-device personalized private learning for dehydration detection.
Figure 2. Accelerometer, magnetometer, and gyroscope sample magnitude data before preprocessing.
Accelerometer specification.
| Specification |
|---|
| STMicro LSM303DLHC |
| +/−16 g |
| 3 Channels ( |
| 512 Hz |
| None |
Magnetometer specification.
| Specification |
|---|
| STMicro LSM303DLHC |
| +/−1.9 Ga |
| 3 Channels ( |
| 512 Hz |
| None |
Gyroscope specification.
| Specification |
|---|
| Invensense MPU9150 |
| +/−500 deg/s |
| 3 Channels ( |
| 512 Hz |
| None |
Figure 3. Galvanic skin response/electrodermal activity sample data.
GSR specification.
| Specification |
|---|
| 1 Channel (GSR) |
| 128 Hz |
| 16 bits, signed |
| kOhms |
| None |
PPG specification.
| Specification |
|---|
| 1 Channel (PPG) |
| 128 Hz |
| 16 bits, signed |
| mV |
| None |
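The specifications above list 512 Hz for the inertial sensors and 128 Hz for the GSR and PPG channels, so the streams differ by a factor of four. One common way to bring them onto a shared time base is block averaging, sketched below. This is an illustration only: the record does not say how, or whether, the authors resampled, and `downsample_by_mean` is a hypothetical helper.

```python
from statistics import fmean
from typing import List, Sequence

IMU_RATE_HZ = 512  # accelerometer, magnetometer, gyroscope (from the spec tables)
BIO_RATE_HZ = 128  # GSR and PPG (from the spec tables)


def downsample_by_mean(samples: Sequence[float], factor: int) -> List[float]:
    """Average consecutive blocks of `factor` samples.

    A trailing partial block is dropped so every output value
    summarizes the same number of input samples.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_blocks = len(samples) // factor
    return [fmean(samples[i * factor:(i + 1) * factor]) for i in range(n_blocks)]


if __name__ == "__main__":
    factor = IMU_RATE_HZ // BIO_RATE_HZ  # 4 inertial samples per GSR/PPG sample
    imu_magnitudes = [9.8, 9.9, 10.1, 10.0, 11.2, 11.0, 10.8, 11.0]  # made-up values
    print(downsample_by_mean(imu_magnitudes, factor))
```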
Figure 4. PPG sample data.
Figure 5. Preprocessing of the PPG signal and the extraction of features.
Descriptive statistics for the collected dataset.
| Subject | Total Recorded Duration (min) | Maximum Fasting Duration (h) | Mean HR (bpm) | Accelerometer Magnitude | Magnetometer Magnitude | Gyroscope Magnitude |
|---|---|---|---|---|---|---|
| S1 | 1768 | 15.3 | 80 | (5.9, 11.9, 27.7) | (0.3, 2.3, 64.7) | (1.5, 25.3, 574.2) |
| S2 | 636 | 13.2 | 67 | (6.4, 10.8, 27.7) | (0.6, 2.7, 71.4) | (1.6, 26.8, 616.3) |
| S3 | 182 | 15 | 86 | (8.1, 11.3, 12.1) | (0.3, 0.9, 1.6) | (2.7, 5.2, 76.9) |
| S4 | 573 | 12.1 | 83 | (8, 11.6, 27.7) | (0.4, 1.8, 53.9) | (1.5, 20.1, 458.1) |
| S5 | 82 | 11 | 81 | (5.8, 11.3, 12.1) | (0.4, 0.9, 1.3) | (1.7, 25.5, 78.6) |
| S6 | 23 | 11.2 | 84 | (9.5, 11.4, 12.3) | (0.4, 0.6, 0.7) | (3.4, 23.9, 57.7) |
| S7 | 24 | 1.4 | 61 | (10.8, 11.9, 12.3) | (0.4, 0.5, 0.8) | (1.7, 4.3, 17.7) |
| S8 | 21 | 1.4 | 81 | (6.9, 8.4, 11) | (0.6, 1, 1.1) | (1.7, 13, 33.2) |
| S9 | 23 | 1.4 | 84 | (10.9, 11.8, 12) | (0.4, 0.6, 0.7) | (1.6, 2.4, 9.6) |
| S10 | 22 | 1.4 | 118 | (10.1, 10.9, 11.3) | (0.6, 0.9, 1) | (8.1, 20.1, 66.3) |
| S11 | 32 | 1.5 | 85 | (10.5, 11.7, 12.3) | (0.4, 0.5, 0.6) | (1.7, 6.4, 27.6) |
List of features.
| Feature | Description | Mean | VIF |
|---|---|---|---|
| PPG_A13_CAL | Raw PPG sensor calibrated values | 2925 | 5.97 |
| bpm | Estimated pulse rate (beats per minute) | 78 | 5.57 |
| ibi | Inter-beat interval (IBI) estimated from the PPG signal | 772 | 5.53 |
| breathingrate | Estimated breathing rate from the PPG signal | 0.22 | 1.08 |
| RMSSD | Root mean square of successive differences between estimated heartbeats | 75.2 | 1.08 |
| GSR_Skin_Resistance_CAL | Raw calibrated GSR resistance (kOhms) | 9923 | 1.21 |
| GSR_Skin_Conductance_CAL | Raw calibrated GSR conductance (μS) | 5 | 1.31 |
| Accel_mag | Magnitude of the accelerometer, Equation ( | 11.5 | 5.19 |
| Mag_mag | Magnitude of the magnetometer, Equation ( | 2 | 7.81 |
| Gyro_mag | Magnitude of the gyroscope, Equation ( | 22.3 | 7.87 |
| cumAccel | Cumulative accelerometer change, Equation ( | 2970 | 6.34 |
| cumMag | Cumulative magnetometer change, Equation ( | 3401 | 7.08 |
| cumGyro | Cumulative gyroscope change, Equation ( | 46,318 | 6.02 |
| Temperature_BMP280_CAL | Surrounding temperature | 34.9 | 1.53 |
| Pressure_BMP280_CAL | Atmospheric pressure | 99.7 | 1.6 |
| age | Age of the subject | 30 | 1.14 |
| height | Height of the subject (cm) | 159 | 11.23 |
| weight | Weight of the subject (kg) | 63 | 12.11 |
| gender | Gender of the subject (male/female) | N/A | 1.8 |
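Several of the features listed above are simple transformations of raw sensor values, and a short sketch may make them concrete. The equations referenced in the table are not reproduced in this record, so the definitions below are assumptions: the magnitude is taken as the Euclidean norm of the three axes, and the cumulative change as a running sum of absolute differences between consecutive magnitudes. RMSSD follows its standard definition, and conductance is the reciprocal of resistance. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector3 = Tuple[float, float, float]


def magnitude(sample: Vector3) -> float:
    """Euclidean norm of a 3-axis sample (assumed form of Accel_mag, Mag_mag, Gyro_mag)."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def cumulative_change(magnitudes: Sequence[float]) -> List[float]:
    """Running sum of |m[i] - m[i-1]| (assumed form of cumAccel, cumMag, cumGyro)."""
    out: List[float] = [0.0] if magnitudes else []
    total = 0.0
    for prev, cur in zip(magnitudes, magnitudes[1:]):
        total += abs(cur - prev)
        out.append(total)
    return out


def rmssd(ibi_ms: Sequence[float]) -> float:
    """Root mean square of successive differences between inter-beat intervals."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    if not diffs:
        raise ValueError("need at least two inter-beat intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def conductance_us(resistance_kohm: float) -> float:
    """Convert skin resistance in kOhms to conductance in microsiemens (G = 1/R)."""
    if resistance_kohm <= 0:
        raise ValueError("resistance must be positive")
    return 1000.0 / resistance_kohm


if __name__ == "__main__":
    accel = [(0.0, 0.0, 9.8), (1.0, 2.0, 9.5), (0.5, 0.5, 10.2)]  # made-up samples
    mags = [magnitude(s) for s in accel]
    print(mags, cumulative_change(mags))
    print(rmssd([800.0, 810.0, 790.0]), conductance_us(200.0))
```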
Figure 6. Features’ correlation matrix (please refer to Table 2 for the meaning of the features).
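The correlation matrix and the VIF column of the feature table both describe how strongly features overlap. As a hedged illustration, the sketch below computes a Pearson correlation and the variance inflation factor, VIF = 1 / (1 − R²), for the special case of only two predictors, where R² equals the squared Pearson correlation. The general case regresses each feature on all the others, which this sketch does not do, and the height and weight values are invented.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs: Sequence[float], ys: Sequence[float]) -> float:
    """VIF of either predictor when the model has exactly these two predictors."""
    r_squared = pearson(xs, ys) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)


if __name__ == "__main__":
    height_cm = [150.0, 158.0, 162.0, 170.0, 175.0]  # invented values
    weight_kg = [52.0, 60.0, 61.0, 72.0, 74.0]       # invented values
    print(pearson(height_cm, weight_kg), vif_two_predictors(height_cm, weight_kg))
```

A VIF well above 1 signals that a feature is largely predictable from the others, which is consistent with height and weight having the largest VIF values in the feature table.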
Figure 7. Features’ importance with random forest.
Comparison of the machine learning techniques for FEAT1.
| Model | MAE | RMSE | Training Time (s) | Size (MB) |
|---|---|---|---|---|
| Baseline | 2.78 | 4.43 | 0.00 | 0.00 |
| Linear Regression | 2.74 | 3.45 | 0.00 | 0.0007 |
| Lasso | 2.75 | 3.45 | 0.08 | 0.005 |
| Ridge Regression | 2.74 | 3.45 | 0.00 | 0.0007 |
| ElasticNet Regression | 3.02 | 3.64 | 0.00 | 0.0007 |
| SVR | 1.72 | 3.06 | 0.21 | 0.321 |
| ANN | 2.64 | 3.54 | 8.29 | 0.061 |
| Gradient Boosting | 1.99 | 2.57 | 0.14 | 0.028 |
| DNN | 1.51 | 2.50 | 10.52 | |
| Random Forest | 0.41 | 0.98 | | 9.58 |
| Extra Trees | | | 16.41 | 30.4 |
Figure 8. Comparison of machine learning techniques in terms of the MAE and RMSE for FEAT1.
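The two error metrics used in the comparison can be written out in a few lines. The sketch below computes MAE and RMSE for predicted hours since the last drink, along with a mean-predicting baseline. How the Baseline row was actually defined is not stated in this record, so the mean predictor is an assumption, and the sample values are invented.

```python
import math
from typing import List, Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error; penalizes large errors more than MAE does."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mean_baseline(y_train: Sequence[float], n_predictions: int) -> List[float]:
    """Assumed baseline: always predict the mean of the training targets."""
    if not y_train:
        raise ValueError("need at least one training target")
    mean = sum(y_train) / len(y_train)
    return [mean] * n_predictions


if __name__ == "__main__":
    y_train = [0.5, 2.0, 6.0, 11.5]  # invented hours since last drink
    y_test = [1.0, 2.0, 6.0]
    model_pred = [2.0, 2.0, 4.0]     # invented model output
    base_pred = mean_baseline(y_train, len(y_test))
    print("model   ", mae(y_test, model_pred), rmse(y_test, model_pred))
    print("baseline", mae(y_test, base_pred), rmse(y_test, base_pred))
```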
Comparison of the machine learning techniques for FEAT2.
| Model | MAE | RMSE | Training Time (s) | Size (MB) |
|---|---|---|---|---|
| Baseline | 2.91 | 4.64 | 0.00 | 0.00 |
| Linear Regression | 3.08 | 3.78 | 0.00 | 0.00 |
| Lasso | 3.41 | 4.04 | 0.11 | 0.01 |
| Ridge Regression | 3.08 | 3.78 | 0.01 | 0.00 |
| ElasticNet Regression | 3.10 | 3.79 | 0.01 | 0.00 |
| SVR | 2.81 | 4.50 | 0.27 | 0.23 |
| ANN | 3.08 | 3.78 | 15.11 | 0.06 |
| Gradient Boosting | 2.14 | 2.74 | 0.09 | 0.03 |
| DNN | 1.14 | 1.90 | 18.40 | |
| Random Forest | 0.36 | 0.84 | | 9.15 |
| Extra Trees | | | 9.38 | 28.94 |
Transfer learning results at different test sizes (0.3 and 0.7 of the subject’s samples).
| Model | MAE (0.3/0.7) | RMSE (0.3/0.7) | Training Time (0.3/0.7) | Model Size (0.3/0.7) |
|---|---|---|---|---|
| DNN | 2.41/3.12 | 3.29/4.39 | 6.17/4.8 | 0.31/0.31 |
| Random Forest | 0.58/0.7 | 0.90/1.2 | | |
| Extra Trees | | | 0.50/0.17 | 2.64/1.14 |
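The two test sizes in the table correspond to holding out 30% or 70% of one subject's samples. A minimal sketch of such a split follows, under the assumption that the held-out fraction is used only for evaluation and the remainder is available for adapting a previously trained model. Whether the authors split randomly or chronologically is not stated here; this sketch shuffles with a fixed seed, and `split_subject_samples` is a hypothetical helper.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_subject_samples(samples: Sequence[T],
                          test_size: float,
                          seed: int = 0) -> Tuple[List[T], List[T]]:
    """Split one subject's samples into (adaptation, test) portions.

    `test_size` is the fraction held out for evaluation, e.g. 0.3 or 0.7.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    subject_samples = list(range(10))  # stand-ins for feature rows
    for size in (0.3, 0.7):
        adapt, test = split_subject_samples(subject_samples, size)
        print(f"test_size={size}: {len(adapt)} adaptation, {len(test)} test samples")
```

A larger test size leaves fewer samples for adaptation, which is one plausible reason the errors in the table are higher at 0.7 than at 0.3.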
Comparison of this study and related studies for dehydration monitoring.
| Study | Problem | Features | Techniques | Dataset | Model Size | Training Time |
|---|---|---|---|---|---|---|
| [ | Classification into three classes: rest before exercise, post-exercise, and after hydration | RR interval, RMSSD, and SDRR of ECG | SVM and K-means | 30 min ECG (10 min for each class) for 16 athletes (total = 480 min) | No | No |
| [ | Classification into hydrated/mild dehydrated | 9 features from EDA and PPG | LDA, QDA, logistic regression, SVM, fine and medium Gaussian kernel, K-NN, decision trees, and ensemble of K-NNs | 8 min EDA and PPG for each of 17 subjects (total = 136 min) | No | No |
| [ | Classification into hydrated/dehydrated | 9 statistical features from the GSR signal | Logistic regression, SVM, decision trees, K-NN, LDA, Naive Bayes | 2 h of EDA signal for each of 5 subjects (total = 600 min) | No | No |
| [ | Classification into hydrated/dehydrated | Combinations of 6 statistical features extracted from the GSR signal | Logistic regression, random forest, K-NN, naive Bayes, decision trees, LDA, AdaBoost classifier, and QDA | EDA signals for 5 subjects, but they did not mention the recorded time | No | No |
| [ | Classification into well-hydrated, hydrated, dehydrated, and very dehydrated | 12 features extracted from EDA and 2 features from the activity recognition model | Random forest, decision trees, naive Bayes, BayesNet, and multilayer perceptron | 24 h for 5 d of EDA signals, as well as activity labeling for 5 subjects (total = 36,000 min) | No | No |
| This study | Regression for the number of hours since last drinking | FEAT1(19) and FEAT2(12) as described in | Linear, lasso, ridge, and ElasticNet regression, SVR, ANN, gradient boosting, DNN, random forest, and Extra Trees | Total of 3386 min for 11 subjects | Yes | Yes |
Figure 9. Shapley values for the random forest model using SHAP to show the impact of each feature on the model output (please refer to Table 2 for the meaning of the features).