Hendrika G van Lier, Marcel E Pieterse, Ainara Garde, Marloes G Postel, Hein A de Haan, Miriam M R Vollenbroek-Hutten, Jan Maarten Schraagen, Matthijs L Noordzij.
Abstract
Wearable physiological measurement devices for ambulatory research with novel sensing technology are introduced with ever-increasing frequency, requiring fast, standardized, and rigorous validation of the physiological signals measured by these devices and their derived parameters. At present, there is a lack of consensus on a standardized protocol or framework with which to test the validity of this new technology, leading to the use of various (often unfit) methods. This study introduces a comprehensive validity assessment protocol for physiological signals (electrodermal activity and cardiovascular activity) and investigates the validity of the E4 wearable (an example of such a new device) on the three levels proposed by the protocol: (1) the signal level, with a cross-correlation; (2) the parameter level, with Bland–Altman plots; and (3) the event level, with the detection of physiological changes due to external stressor levels via event difference plots. The results of the protocol show that the E4 wearable is valid for heart rate, RMSSD, and SD at the parameter and event levels, and for the total amplitude of skin conductance responses at the event level when studying strong sustained stressors. These findings are in line with the prior literature and demonstrate the applicability of the protocol. The validity assessment protocol proposed in this study provides a comprehensive, standardized, and feasible method for assessment of the quality of physiological data coming from new wearable (sensor) technology aimed at ambulatory research.
Keywords: Electrodermal activity; Habituation effect; Heart rate; Photoplethysmogram; Skin conductance; Stress
Year: 2020 PMID: 31290128 PMCID: PMC7148282 DOI: 10.3758/s13428-019-01263-9
Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X
Fig. 1 Overview of the complete protocol and the decisions that can be made from it concerning the validity of a device
Fig. 2 Electrocardiograph (ECG) and photoplethysmograph (PPG) signals of a participant within the present study
Fig. 3 E4 wristband (Empatica), reproduced with permission
Fig. 4 Experiment design. General Stress Response indicates the sing-a-song-stress test, and Event-Based Habituation Responses shows the noise task. SQ = stress questionnaire
Fig. 5 Histogram of the optimal cross-correlation found for a participant between −8 and +8 lags in time. The cross-correlation for each participant presented in the figure was determined on the basis of his or her optimal lag
Fig. 6 Cross-correlation plots of a participant with a high cross-correlation (.66)
Fig. 7 Cross-correlation plots of a participant with a low cross-correlation (.19)
Fig. 8 Bland–Altman plots for electrodermal activity, on the left, and cardiovascular activity (CVA), on the right. Each dot represents one participant. The difference between and the average of the two measures are represented on the y-axis and the x-axis, respectively. The green lines represent the a priori chosen acceptable boundaries, whereas the red lines (also marked with upper bound and lower bound) represent the actually found 95% confidence interval limits. At the bottom of each figure, the percentage of values within the proposed boundaries is given. HR, heart rate; RMSSD, root mean square of successive differences; RR/PP, the durations between successive RR (RD) or PP (E4) peaks; S-AMPL, amplitude of the skin conductance responses; SCL, skin conductance level; SCR, skin conductance responses
Fig. 9 Line plot for the sing-a-song-stress task (SSST), measured with the reference device (RD) and the E4 wearable, with total amplitude of SCRs as the parameter of interest. Each thin line represents a participant, and the mean and its error bars are plotted in red. The y-scale is given as a square root, to show the difference in the lower regions and not have too much emphasis on the higher values. For the SSST, the first experimental task is a neutral baseline, the second is preparing to sing, the third is singing, and the last is recovery (another baseline) directly after singing
Fig. 10 Error bar plot for the differences for each participant during the sing-a-song-stress task (SSST) and for the mean and standard error during each task (red line) for all participants, with total amplitude of SCRs as the parameter of interest (blue lines). The black line is the zero axis, and the green lines are the a priori defined boundaries (size of the reference effect). For the SSST, the first experimental task was a neutral baseline, the second was preparing to sing, the third was singing, and the last was the baseline directly after singing
Fig. 11 Error bar plots for each participant (thin blue lines) during the noise task, retrieved with the reference device (RD) and the E4 wearable. The mean and SE during each task for all participants, with the total amplitude of SCRs as the parameter of interest, are indicated with red lines. The y-scale is given as a logarithm, to show the difference in the lower regions and not have too much emphasis placed on the higher values. For the noise task, the same baseline is given as for the SSST, and every number represents a beep
Fig. 12 Error bar plot for the differences for each participant (blue lines) during the noise task. The means and SEs are also shown (red line) during each task for all participants, with the total amplitude of SCRs as the parameter of interest. The black line is the zero axis, and the green lines are the a priori defined boundaries (size of reference effect). The noise task has the same baseline given as for the SSST, and every number represents a beep
Fig. 13 Error bar plot for each participant (blue lines) during the sing-a-song-stress task, retrieved with the reference device (RD) and the E4 wearable, with heart rate as parameter of interest. The overall means and SEs are also shown (red lines). The first experimental task was a neutral baseline; the second, preparing to sing; the third, singing; and the last, a baseline directly after singing
Fig. 14 Error bar plot for the differences for each participant (blue lines) during the SSST. The means and SEs are also shown (red line) during each task for all participants, with heart rate as the parameter of interest. The black line is the zero axis, and the green lines are the a priori defined boundaries (size of the reference effect). For the SSST, the first experimental task was a neutral baseline; the second, preparing to sing; the third, singing; and the last, a baseline directly after singing
Fig. 15 Error bar plot for each participant (blue lines) during the noise task, retrieved with the reference device (RD) and the E4 wearable. The overall means and SEs are also shown (red lines), displaying little variation
Fig. 16 Error bar plot for the differences for each participant (blue lines) during the noise task. The means and SEs are also shown (red line) for all participants, with heart rate as the parameter of interest. The black line is the zero axis, and the green lines are the a priori defined boundaries (size of the reference effect, here very close to zero). The noise task has the same baseline given as for the SSST, and every number represents a beep
For EDA, common parameters (Boucsein) are:
- skin conductance level (SCL)
- number of skin conductance responses (SCRs)
- amplitude of the skin conductance responses (S-AMPL)

For CVA, two types of parameters (or features) are extracted, namely from the frequency domain and from the time domain (for more information, see Berntson, Quigley, & Lozano).

From the frequency domain:
- (normalized) low frequency
- (normalized) high frequency
- ratio between low and high frequencies

From the time domain:
- mean RR interval (the RR interval is the time between two measured heartbeats) or heart rate (HR)
- standard deviation (SD) of the RR interval
- root mean square of successive differences (RMSSD) of the RR interval

There are multiple variations on the RMSSD available, such as the standard deviation of the beat-to-beat or NN interval (SDNN) or the standard deviation of successive differences (SDSD).
The EDA signal of the RD was downsampled to 16 Hz, since sampling above 16 Hz does not add to the data quality. The wearable data were upsampled from 4 Hz to 16 Hz to match the sampling rate of the RD data. Both signals were normalized to make them more comparable without losing relevant information. Additionally, the data were detrended to make them stationary, which is a prerequisite for a time series analysis such as the cross-correlation function. Time lags between −8 and +8 were considered, because an SCR can have a duration of multiple seconds. The sampling frequency is 16 Hz, meaning that a time lag of 16 represents 1 s and a time lag of 8 represents 0.5 s. To gain an overview of all optimal cross-correlations found, a histogram is made that shows the optimal cross-correlation for each participant.
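The preprocessing and lag search described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: upsampling by linear interpolation, a least-squares linear detrend, z-score normalization, and all function names are illustrative choices that the text does not specify.

```python
import math

def upsample_linear(x, factor=4):
    """Upsample by linear interpolation (assumed method), e.g. 4 Hz -> 16 Hz."""
    out = []
    for a, b in zip(x, x[1:]):
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(x[-1])
    return out

def detrend_and_normalize(x):
    """Remove a least-squares linear trend, then z-score the residuals."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = (sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x))
             / sum((i - t_mean) ** 2 for i in range(n)))
    resid = [v - (x_mean + slope * (i - t_mean)) for i, v in enumerate(x)]
    sd = math.sqrt(sum(r * r for r in resid) / n)
    return [r / sd for r in resid]

def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

def best_lag(ref, dev, max_lag=8):
    """Return (lag, r) with the highest correlation over -max_lag..+max_lag.

    A positive lag means the device signal trails the reference signal.
    """
    n = min(len(ref), len(dev))
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = ref[:n - lag], dev[lag:n]
        else:
            a, b = ref[-lag:n], dev[:n + lag]
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best
```

With both signals at 16 Hz, a returned lag converts to seconds as `lag / 16`. Collecting the `r` value of each participant's best lag gives the data for the histogram.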
Same as for the EDA cross-correlation, Step 1 (see Box 2). Since this study aims to validate the wearable signals against an RD, the phasic activity from classical trough-to-peak (TTP) analysis was reported (the threshold for an SCR amplitude was set at .01 μS) (Boucsein).

Three parameters from the EDA data are evaluated with a Bland–Altman plot:
- Mean skin conductance level (SCL). The skin conductance level was based on the whole signal (start baseline to baseline after the noise task). The mean was calculated by averaging over the complete signal. Biologically plausible values for SCL are between 0 and 16 μS (Braithwaite, Watson, Jones, & Rowe).

The Bland–Altman analysis assumes that the differences between the wearable and the RD are normally distributed. Therefore, the normality of the differences needs to be assessed visually. If the data do not appear normal, appropriate transformations (e.g., log transformations) can be used, as suggested by Boucsein.

Plot the mean of the two measurements as the abscissa (x-axis) and the difference between them as the ordinate (y-axis). Additionally, plot the two proposed boundaries and the 95% CI of the differences in a different color. Calculate the amount of data outside the CI.
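The quantities behind such a Bland–Altman plot can be computed as in the sketch below. It assumes the conventional 95% limits of mean difference ± 1.96 × SD, which the text does not spell out. The function name, the dictionary keys, and the example values are hypothetical.

```python
import math

def bland_altman(device, reference, boundary):
    """Summarize agreement between paired per-participant values."""
    diffs = [d - r for d, r in zip(device, reference)]        # y-axis values
    means = [(d + r) / 2 for d, r in zip(device, reference)]  # x-axis values
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((x - bias) ** 2 for x in diffs) / (n - 1))
    # Assumed convention: 95% limits at bias +/- 1.96 * SD of the differences.
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    within = sum(abs(x) <= boundary for x in diffs)
    outside = sum(x < lower or x > upper for x in diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "sd": sd,
        "limits": (lower, upper),
        "pct_within_boundary": 100 * within / n,
        "pct_outside_limits": 100 * outside / n,
    }

# Hypothetical SCL values in microsiemens for four participants.
result = bland_altman([2.1, 5.0, 7.9, 3.2], [2.0, 5.5, 6.0, 3.0], boundary=1.6)
print(result["pct_within_boundary"], result["limits"])
```

The `boundary` argument is the a priori acceptable difference (for example, 1.6 μS for SCL), so `pct_within_boundary` corresponds to the percentage reported at the bottom of each panel in Fig. 8.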
Downsample the RD data and upsample the wearable data from 64 Hz to a frequency of 200 Hz. The raw ECG and PPG signals were filtered with a combination of low-pass and high-pass filters between 5 and 15 Hz (Pan & Tompkins, 1985). For each segment of data, the peaks of normal R-waves were detected using a filter-bank-based algorithm developed by Pan and Tompkins. The peaks of the P-waves were detected by finding the local optima. The durations between successive peak locations were calculated to produce RR/PP intervals. RR/PP intervals shorter than 0.33 s or longer than 1.5 s were deleted from the time series.

Three parameters from the time domain were retrieved from the RR or PP interval data. Biologically plausible values for RMSSD are between 0 and a maximum of .71 s (O’Neal et al.).

Same as for EDA (see Steps 4 and 5 in Box 3).
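Once peak times are available, the interval cleaning and the time-domain parameters can be sketched as follows. This is an illustrative sketch, not the authors' code: peak detection itself is assumed to have been done already, the function names are hypothetical, and successive differences are taken across the cleaned series even where an interval was deleted, which is a simplification.

```python
import math

def intervals_from_peaks(peak_times_s, lo=0.33, hi=1.5):
    """Inter-beat intervals in seconds, dropping those outside [lo, hi]."""
    ibis = [b - a for a, b in zip(peak_times_s, peak_times_s[1:])]
    return [x for x in ibis if lo <= x <= hi]

def time_domain(ibis):
    """Mean heart rate (bpm), SD (s), and RMSSD (s) of inter-beat intervals."""
    n = len(ibis)
    mean_ibi = sum(ibis) / n
    sd = math.sqrt(sum((x - mean_ibi) ** 2 for x in ibis) / (n - 1))
    succ = [b - a for a, b in zip(ibis, ibis[1:])]
    rmssd = math.sqrt(sum(d * d for d in succ) / len(succ))
    return {"mean_hr_bpm": 60 / mean_ibi, "sd_s": sd, "rmssd_s": rmssd}

# Hypothetical peak times in seconds; the 2.0 s gap is dropped as implausible.
peaks = [0.0, 0.8, 1.7, 3.7, 4.5, 5.4]
print(time_domain(intervals_from_peaks(peaks)))
```

The same functions apply to RR intervals from the ECG and PP intervals from the PPG, which keeps the two devices comparable at the parameter level.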
Same as for the EDA Bland–Altman plot (Box 3, Steps 1 and 2). Total amplitude was chosen as a stress parameter, since it combines both the number of SCRs and the amplitude of the SCRs. Same as for the Bland–Altman plot (see Step 4 in Box 4).

In order to visualize the effect and the agreement between the two devices, multiple plots are made:
1) A line plot with one line for each person, with the mean and the standard error (SE).
2) A line plot of the differences between the wearable and the RD, with one line for each person, with the mean and the SE.
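The numbers behind an event difference plot can be sketched as below. Several points are assumptions made for illustration: total amplitude is taken as the sum of SCR amplitudes at or above the .01 μS threshold, the agreement check is "mean difference plus one SE stays inside the boundary", and every function name and data value is hypothetical.

```python
import math

def total_amplitude(scr_amplitudes, threshold=0.01):
    """Sum of SCR amplitudes (in microsiemens) at or above the threshold."""
    return sum(a for a in scr_amplitudes if a >= threshold)

def event_differences(device, reference):
    """Per-participant, per-task differences (wearable minus RD)."""
    return {p: [d - r for d, r in zip(device[p], reference[p])] for p in device}

def mean_and_se(values):
    """Mean and standard error of the mean."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return m, sd / math.sqrt(n)

def summarize(diffs, boundary):
    """For each task: (mean difference, SE, inside the a priori boundary?)."""
    n_tasks = len(next(iter(diffs.values())))
    rows = []
    for t in range(n_tasks):
        m, se = mean_and_se([d[t] for d in diffs.values()])
        rows.append((m, se, abs(m) + se <= boundary))  # assumed criterion
    return rows

# Hypothetical total amplitudes for baseline, preparation, singing, recovery.
wearable = {"p1": [0.1, 0.5, 0.9, 0.2], "p2": [0.2, 0.4, 1.1, 0.1]}
rd = {"p1": [0.1, 0.6, 1.0, 0.2], "p2": [0.2, 0.5, 1.0, 0.1]}
for row in summarize(event_differences(wearable, rd), boundary=0.3):
    print(row)
```

The per-participant lists in `event_differences` correspond to the thin lines of the difference plot, and each row of `summarize` to the mean and error bar drawn against the boundary lines.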
Same as for the Bland–Altman plot for CVA (see Box 4, Steps 1 and 2). Instantaneous HR was chosen as a stress parameter, since it is an often-used measure (Schubert et al.). Same as for the EDA parameter (see Box 5, Steps 4 and 5).
Measures with the maximum SD of the differences, given the boundary listed below and the sample size of the earlier analysis
| Measure | Max SD difference | Boundary (10% of plausible range) | Sample size |
|---|---|---|---|
| SCL | 0.56 μS | 1.6 μS | 53 |
| NoSCRs | 0.88/min | 2.5/min | 55 |
| TAmp | 0.21 μS | 0.6 μS | 53 |
| meanHR | 1.75 bpm | 5 bpm | 53 |
| sdHR | 0.021 s | 0.06 s | 47 |
| rmssdHR | 0.024 s | 0.07 s | 53 |
The sample size determined for the specific power analysis is given in the last table.
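As a worked check of the boundary column, the sketch below takes 10% of a plausible range, using the two ranges quoted in this section (SCL: 0 to 16 μS; RMSSD: 0 to .71 s). The function name is illustrative.

```python
def boundary(lo, hi, fraction=0.10):
    """Acceptable difference as a fraction of the plausible range."""
    return fraction * (hi - lo)

print(round(boundary(0, 16), 2))    # SCL boundary in microsiemens
print(round(boundary(0, 0.71), 2))  # RMSSD boundary in seconds
```

These reproduce the 1.6 μS and 0.07 s entries in the table above.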