Anna Shcherbina, C Mikael Mattsson, Daryl Waggott, Heidi Salisbury, Jeffrey W Christle, Trevor Hastie, Matthew T Wheeler, Euan A Ashley.
Abstract
The ability to measure physical activity through wrist-worn devices provides an opportunity for cardiovascular medicine. However, the accuracy of commercial devices is largely unknown. The aim of this work is to assess the accuracy of seven commercially available wrist-worn devices in estimating heart rate (HR) and energy expenditure (EE) and to propose a wearable sensor evaluation framework. We evaluated the Apple Watch, Basis Peak, Fitbit Surge, Microsoft Band, Mio Alpha 2, PulseOn, and Samsung Gear S2. Participants wore devices while being simultaneously assessed with continuous telemetry and indirect calorimetry while sitting, walking, running, and cycling. Sixty volunteers (29 male, 31 female, age 38 ± 11 years) of diverse age, height, weight, skin tone, and fitness level were selected. Error in HR and EE was computed for each subject/device/activity combination. Devices reported the lowest error for cycling and the highest for walking. Device error was higher for males, greater body mass index, darker skin tone, and walking. Six of the devices achieved a median error for HR below 5% during cycling. No device achieved an error in EE below 20%. The Apple Watch achieved the lowest overall error in both HR and EE, while the Samsung Gear S2 reported the highest. In conclusion, most wrist-worn devices adequately measure HR in laboratory-based activities, but poorly estimate EE, suggesting caution in the use of EE measurements as part of health improvement programs. We propose reference standards for the validation of consumer health devices (http://precision.stanford.edu/).
Keywords: activity monitors; energy expenditure; fitness trackers; heart rate; mobile health; validation
Year: 2017 PMID: 28538708 PMCID: PMC5491979 DOI: 10.3390/jpm7020003
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Participant characteristics. Values are means, with the range (min–max) and standard deviation (sd) in parentheses. Skin tone was rated on the Fitzpatrick scale. VO2max (maximal oxygen uptake) was either measured during an incremental test to exhaustion or estimated from submaximal cycling using the Åstrand nomogram.
| | Men (n = 29) | Women (n = 31) |
|---|---|---|
| Age (years) | 40 (21–64, sd = 11.48) | 37 (23–57, sd = 9.77) |
| Body mass (kg) | 80.1 (53.9–130.6, sd = 13.25) | 61.7 (47.8–89.2, sd = 12.91) |
| Height (cm) | 179.0 (159.1–190.0, sd = 7.81) | 165.9 (154.4–184.2, sd = 7.90) |
| Body mass index (kg/m²) | 24.9 (20.7–39.3, sd = 3.46) | 22.4 (17.2–28.8, sd = 3.31) |
| Skin tone (scale 1–6) | 3.7 (1–5, sd = 1.39) | 3.7 (1–6, sd = 1.25) |
| Wrist circumference (cm) | 17.3 (16.0–21.0, sd = 1.11) | 15.4 (13.5–17.5, sd = 1.30) |
| VO2max (ml/kg/min) | 52.8 (38.2–66.6, sd = 8.48) | 45.3 (31.7–56.5, sd = 7.62) |
Figure 1. Study protocol. (A) Schematic view of the protocol. Participants transition through two intensities of three modalities of exercise as shown. Walking is on a treadmill. Cycling is on a stationary bike. Activities are interspersed with brief (1 min) periods of rest “R”, and transitions between activities are indicated by “T”; (B) Data from one participant wearing four devices. Data for the error analysis are derived from the last minute of each stage. Overall, error is within an acceptable range with the exception of the walking phase for one device (green line).
Figure 2. Aggregate relative error (RE) in heart rate (HR) and energy expenditure (EE) for the cycling and walking tasks, the two tasks in the protocol with the overall lowest and highest median device error, respectively. Error is calculated as abs(Gold Standard − Device)/(Gold Standard). The lower boundary of each boxplot indicates the 25% quantile of the data, the middle notch indicates the median, and the upper boundary indicates the 75% quantile. Whiskers include all data points that fall within 1.5 times the interquartile range (IQR) of the 25% and 75% quantile values. Data points that lie further than 1.5 IQR from the upper and lower hinge values are treated as outliers, indicated by black circles. Vertical dashed green lines indicate the 5% error threshold, while vertical dashed yellow lines indicate the 10% error threshold. Median HR error is below the 5% threshold for all but one device for the cycling task, and below the 10% threshold for all devices on the walking task. EE error rates significantly exceed the 10% threshold for all devices on both the cycling and walking tasks.
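The error definition and boxplot conventions in the Figure 2 caption can be illustrated with a minimal Python sketch. The function names, the sample readings, and the choice of the "inclusive" quartile method are assumptions made for illustration; the caption does not say which quantile convention or software produced the figure.

```python
from statistics import quantiles


def relative_error(gold, device):
    """Absolute relative error: abs(gold - device) / gold."""
    if gold == 0:
        raise ValueError("gold standard value must be non-zero")
    return abs(gold - device) / gold


def boxplot_summary(errors):
    """Quartiles, whisker ends, and outliers under the 1.5 * IQR rule."""
    # Assumption: "inclusive" quartiles; other conventions shift the hinges slightly.
    q1, median, q3 = quantiles(errors, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [e for e in errors if low_fence <= e <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme points still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(e for e in errors if e < low_fence or e > high_fence),
    }


# Hypothetical heart rate readings in bpm, not data from the study.
gold = [100, 120, 140, 160, 150]
device = [98, 123, 139, 100, 152]

errors = [relative_error(g, d) for g, d in zip(gold, device)]
summary = boxplot_summary(errors)
print(summary)
```

In this made-up sample, the fourth reading (100 bpm against a reference of 160 bpm) lies far above the upper fence and would be drawn as an outlier circle rather than extending the whisker.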
Figure 3. Median device error across activities. We defined an acceptable error range as <5% (dark blue). Light blue, white and yellow shading indicates error outside of this range. (A) Median HR beats-per-minute (bpm) error as a percent of the gold standard measurement; (B) Median EE (kcal) error as a percent of the gold standard measurement. Note the scaling of the legend color is identical in both panels. Overall, heart rate error was within the acceptable error range for the majority of task/device combinations, but EE error exceeded the allowed threshold for all tasks and devices.
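The per-cell summary behind a heatmap like Figure 3 can be sketched as a median over each device/activity group, followed by a comparison against the <5% acceptable range. The record layout, device labels, and values below are hypothetical and only show the shape of the computation.

```python
from collections import defaultdict
from statistics import median

ACCEPTABLE = 0.05  # the <5% range the caption defines as acceptable


def median_error_table(records):
    """Group (device, activity, relative_error) records and take the median of each group."""
    grouped = defaultdict(list)
    for device, activity, err in records:
        grouped[(device, activity)].append(err)
    return {key: median(values) for key, values in grouped.items()}


def within_acceptable(table, threshold=ACCEPTABLE):
    """Flag each device/activity cell whose median error is below the threshold."""
    return {key: med < threshold for key, med in table.items()}


# Hypothetical per-subject relative errors, not data from the study.
records = [
    ("device_a", "cycling", 0.02), ("device_a", "cycling", 0.03), ("device_a", "cycling", 0.04),
    ("device_a", "walking", 0.04), ("device_a", "walking", 0.08), ("device_a", "walking", 0.09),
    ("device_b", "cycling", 0.06), ("device_b", "cycling", 0.01), ("device_b", "cycling", 0.02),
]

table = median_error_table(records)
flags = within_acceptable(table)
for key in sorted(table):
    print(key, table[key], "acceptable" if flags[key] else "outside range")
```

Using the median rather than the mean keeps a single badly wrong reading, such as the 0.06 value for `device_b`, from pushing a cell outside the acceptable range.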
Figure 4. Principal component analysis of device error in (A) HR and (B) EE. Device errors across all activities (sitting, walking, running, cycling) were aggregated across subjects, excluding any subjects with missing data. The projections of the scaled error values on principal components 1 and 2 (PC1, PC2) are illustrated in the scatter plot, color-coded by device. Ellipses indicate the extent of the first and second principal components that encompass 95% of the subject error values for each device. Smaller ellipse area indicates lower variance among device error values, and data points near 0 along the PC1 and PC2 axes indicate low error. The Apple Watch had the most favorable overall error profile while the PulseOn had the least favorable overall error profile.
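The projection step described in the Figure 4 caption can be sketched with NumPy as follows. This is a generic PCA via singular value decomposition, not the authors' code: the matrix layout (one row per subject/device pair, one column per activity), the function name `pca_project`, and the random input are all assumptions. The sketch also assumes every column has non-zero variance, since it divides by the column standard deviation.

```python
import numpy as np


def pca_project(errors, n_components=2):
    """Project rows of `errors` onto the leading principal components.

    errors: array of shape (n_rows, n_features); rows containing NaN are dropped.
    Returns (scores, explained_variance_ratio) for the first n_components.
    """
    x = np.asarray(errors, dtype=float)
    x = x[~np.isnan(x).any(axis=1)]           # exclude rows with missing data
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)  # center and scale each column
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical error matrix: 20 rows, 4 activity columns, one row with a missing value.
rng = np.random.default_rng(0)
errors = rng.normal(0.05, 0.02, size=(20, 4))
errors[3, 1] = np.nan

scores, explained = pca_project(errors)
print(scores.shape, explained)
```

Plotting the two columns of `scores` against each other, colored by device, would give a scatter plot of the kind the caption describes; the 95% ellipses are a separate step that is not shown here.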