Sylvia Cho1, Chunhua Weng1, Michael G Kahn2, Karthik Natarajan1.
Abstract
BACKGROUND: There is a growing interest in using person-generated wearable device data for biomedical research, but there are also concerns regarding the quality of data such as missing or incorrect data. This emphasizes the importance of assessing data quality before conducting research. In order to perform data quality assessments, it is essential to define what data quality means for person-generated wearable device data by identifying the data quality dimensions.
Keywords: data accuracy; data quality; fitness trackers; patient-generated health data; qualitative research; wearable device
Year: 2021 PMID: 34941540 PMCID: PMC8738984 DOI: 10.2196/31618
Source DB: PubMed Journal: JMIR Mhealth Uhealth ISSN: 2291-5222 Impact factor: 4.773
Figure 1. An overview of study processes and results. HIDQF: harmonized intrinsic data quality framework.
Background of respondents (N=20).
| Characteristic | Value, n (%) |
| --- | --- |
| | |
| None | 1 (5) |
| 1 to 3 | 12 (60) |
| 3 to 5 | 2 (10) |
| 5 to 10 | 2 (10) |
| 10 or more | 3 (15) |
| | |
| Device validation or reliability studies | 11 (55) |
| Modeling to predict health state | 10 (50) |
| Modeling to inform treatment decisions | 2 (10) |
| Tracking behavioral changes | 8 (40) |
| Other | 3 (15) |
| | |
| Consumer-grade wearable | 8 (40) |
| Research-grade wearable | 3 (15) |
| Used both with similar frequency | 9 (45) |
| | |
| Fitbit (Charge HR, Alta HR, Ultra, etc) | 19 (95) |
| Garmin (Vivofit, Vivosmart, Fenix, etc) | 6 (30) |
| Withings (Go, Pulse, or BP cuff) | 4 (20) |
| Jawbone (UP) | 2 (10) |
| Apple Watch | 1 (5) |
| Accelerometer (ActiGraph, GENEactiv, etc) | 6 (30) |
| Other (Huawei, Samsung gear, Misfit, etc) | 14 (70) |
Figure 2. Importance ratings on dimensions from harmonized intrinsic data quality framework.
Figure 3. Importance versus reliability of ratings on data quality dimensions in the harmonized intrinsic data quality framework.
List of data quality dimensions suggested based on findings from literature review and survey.
| Dimensions suggested to experts | Corresponding content from the literature review | Corresponding content from survey responses | Importance rating (only for HIDQF^a) |
| --- | --- | --- | --- |
| | | | |
| Value conformance | Different devices may use a different measurement unit. | “Data set not conforming to data dictionary will be hard to fix” | 4.32 |
| Relational conformance | —^b | “Without relational conformance you can't link one wearable device to another or to health outcomes” | 4.11 |
| Computational conformance | Companies do not always reveal whether or when they update their device algorithms or whether or when the users install the provided software updates. Lack of standardization: (for multi-device studies) different devices may use different algorithms, a different definition for the same parameter, different sampling rate. | “I don’t know a way to proceed with the data analyses if the computational conformance isn’t met with satisfaction. it suggests that the data collected cannot be trusted.” | 4.11 |
| Completeness | Missing data due to various reasons: device malfunction, connectivity issues, nonadherence to the device, quality of skin contact of the device. | “Missing data is a large issue for our research, especially because we are trying to identify patterns or subsequences of activity. Missing data has to either be interpolated or treated as a zero value, and either of these methods can have a large negative effect on the results of our pattern mining techniques.” | 4.16 |
| Breadth completeness | — | “Lack of availability of HRV^c” | |
| | | | |
| Uniqueness plausibility | — | — | 3.56 |
| Atemporal plausibility | — | “Large spikes or drops in activity that are highly inconsistent with their surrounding measured values” | 3.72 |
| Temporal plausibility | Companies do not always reveal whether or when they update their device algorithms, or whether or when the users install the provided software updates. | “Devices might cause problem with recording different time zone or time during traveling: Subjects may travel between different time zones during study period. Some devices don't recognize a different time zone and the recorded data has weird time pattern that is hard to understand” | 4.11 |
| Temporal data granularity | Fitbit only provides access to day-level data unless the minute-level or second-level data is requested and approved. | “Access to minute level data.” | — |
| Accuracy | Poor data accuracy caused by device malfunction, unknown limitations of proprietary algorithms, user error in device use. | “Other activities generating step counts (eg, motorcycle ride, vibration)”; “Inaccurate sleep and wake time recognition” | — |
| Interpretability | — | “Trying to nail down exactly what a participant was doing when data was being collected offsite.” | — |

^a HIDQF: harmonized intrinsic data quality framework.
^b No available data.
^c HRV: heart rate variability.
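One survey response on atemporal plausibility describes "large spikes or drops in activity that are highly inconsistent with their surrounding measured values." The Python sketch below shows one way such values might be screened; it is an illustration, not a method described in the study. The function name `flag_spikes`, the neighbourhood size, and both thresholds are assumptions, and the sample series is made up.

```python
from statistics import median


def flag_spikes(values, window=2, k=3.0, min_scale=10.0):
    """Return indices of values that deviate strongly from their neighbours.

    A value is flagged when its distance from the median of up to `window`
    neighbours on each side exceeds `k` times the neighbours' median
    absolute deviation (MAD). `min_scale` keeps the tolerance from
    collapsing to zero on flat stretches. All parameters are illustrative
    assumptions, not values taken from the study.
    """
    flagged = []
    for i, value in enumerate(values):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        if not neighbours:
            continue  # a single-point series has nothing to compare against
        centre = median(neighbours)
        mad = median([abs(n - centre) for n in neighbours])
        tolerance = k * max(mad, min_scale)
        if abs(value - centre) > tolerance:
            flagged.append(i)
    return flagged


# Made-up steps-per-minute series with one spike (900) and one drop (0).
steps_per_minute = [60, 62, 58, 900, 61, 59, 0, 60]
print(flag_spikes(steps_per_minute))  # [3, 6]
```

Flagged indices are candidates for review rather than confirmed errors: a drop to zero may simply reflect non-wear or sedentary time, which is why the table treats completeness and plausibility as separate dimensions.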
Figure 4. Data quality dimensions for assessing person-generated wearable device data for research purposes. HIDQF: harmonized intrinsic data quality framework.
Data quality dimensions for person-generated wearable device data identified by domain experts.
| Type and dimension | Definition^a | Example |
| --- | --- | --- |
| | | |
| | | |
| Value conformance | Data values conform to internal formatting constraints, allowable values, or ranges. | Unit of distance is “miles.” “Sleep stages” only has values “deep,” “light,” “rem,” and “wake,” which conform to the data dictionary. |
| Relational conformance | Assuming there are multiple tables or files, recorded data elements agree with structural constraints imposed by the physical database structures that store data values. | Participant ID number links to other tables as required. The wearable device identifier is appropriately linked for all observations. |
| Computational conformance | Computations used to create derived values from existing variables yield the intended results either within a data set or between data sets. | Sleep duration conforms to the difference between start time and end time of sleep. |
| Completeness: Are data values present? | Missing data is determined based on the presence of data. Typically, absence of data is expected if the device is not worn, but this could sometimes be difficult to know retrospectively. | There is no NA (Not Available) in the step count data. |
| | | |
| Uniqueness plausibility | Objects do not appear multiple times in settings where they should not be duplicated or cannot be distinguished within a database or when compared with an external reference. | A single participant only has one participant ID number. |
| Atemporal plausibility | Observed data values, distributions, or densities agree with local or “common” knowledge or from comparisons with external sources that are deemed to be trusted or relative gold standards. | Step count and distance values are positive. Trends of step counts and distance agree with each other. Step counts do not show a sudden spike during sleep or during sedentary time. The range of heart rate values is biologically plausible. Heart rate is higher when active compared with when sedentary. |
| Temporal plausibility | Time-varying variables change values as expected based on known temporal properties or across one or more external comparators or gold standards. | Start time of sleep occurs before end time of sleep. Aggregate step count is higher during daytime than nighttime. |
| | | |
| | | |
| Breadth completeness | All data types required for intended use exist. | Heart rate data are essential for studies analyzing the relationship between physical activity and heart rate. |
| Density completeness | Data set contains a specified number of data values or occurs regularly over a certain period. | Heart rate should be measured at least once a day. Sleep data should be recorded every day consecutively for a 6-week period to be considered complete. |
| Temporal data granularity: Does the device collect data granular enough for intended use? | Granularity of time stamps is sufficient for the task at hand. | Data values are recorded every second, which is appropriate for marathon research studies (the exact start and end time of the marathon for each runner is important for marathon-related studies). |

^a Definitions were adopted and adapted from the studies by Weiskopf et al [25] and Kahn et al [14].
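Several of the examples in the table above read naturally as rule-based checks. The Python sketch below illustrates how a few of them (value conformance, computational conformance, temporal plausibility, completeness, and density completeness) might be expressed; it is not the authors' implementation. The record layout (`stage`, `start`, `end`, `duration_min`), the function names, and the one-minute tolerance are hypothetical. Only the allowed sleep stage values come from the table.

```python
from datetime import date, datetime, timedelta

# Allowed values taken from the table's "sleep stages" example.
ALLOWED_SLEEP_STAGES = {"deep", "light", "rem", "wake"}


def check_sleep_record(record, tolerance_min=1.0):
    """Return the data quality dimensions a sleep record violates.

    `record` is a dict with hypothetical keys: stage, start, end, duration_min.
    """
    issues = []
    # Value conformance: the stage must be one of the dictionary values.
    if record["stage"] not in ALLOWED_SLEEP_STAGES:
        issues.append("value conformance")
    start, end = record["start"], record["end"]
    if start >= end:
        # Temporal plausibility: sleep must start before it ends.
        issues.append("temporal plausibility")
    else:
        # Computational conformance: reported duration matches end - start.
        derived_min = (end - start).total_seconds() / 60
        if abs(derived_min - record["duration_min"]) > tolerance_min:
            issues.append("computational conformance")
    return issues


def count_missing(values):
    """Completeness: count entries recorded as None (NA)."""
    return sum(v is None for v in values)


def missing_days(days_with_data, first_day, n_days):
    """Density completeness: expected days that have no data at all."""
    expected = (first_day + timedelta(days=i) for i in range(n_days))
    return [d for d in expected if d not in days_with_data]


# Made-up record: an unknown stage label and an end time before the start.
bad = {
    "stage": "awake",
    "start": datetime(2021, 1, 2, 7, 0),
    "end": datetime(2021, 1, 1, 23, 0),
    "duration_min": 480,
}
print(check_sleep_record(bad))  # ['value conformance', 'temporal plausibility']
```

For the table's 6-week sleep example, `missing_days(days, first_day, 42)` would list every day in the study period without a sleep record. Whether a missing day reflects non-wear or a device fault still has to be judged separately, as the completeness definition notes.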