Christoph M Kanzler1, Mike D Rinderknecht1, Anne Schwarz2,3, Ilse Lamers4,5, Cynthia Gagnon6, Jeremia P O Held2,3, Peter Feys4, Andreas R Luft2,3, Roger Gassert1, Olivier Lambercy1.
Abstract
Digital health metrics promise to advance the understanding of impaired body functions, for example in neurological disorders. However, their clinical integration is challenged by an insufficient validation of the many existing and often abstract metrics. Here, we propose a data-driven framework to select and validate a clinically relevant core set of digital health metrics extracted from a technology-aided assessment. As an exemplary use-case, the framework is applied to the Virtual Peg Insertion Test (VPIT), a technology-aided assessment of upper limb sensorimotor impairments. The framework builds on a use-case-specific pathophysiological motivation of metrics, models demographic confounds, and evaluates the most important clinimetric properties (discriminant validity, structural validity, reliability, measurement error, learning effects). Applied to 77 metrics of the VPIT collected from 120 neurologically intact and 89 affected individuals, the framework allowed selecting 10 clinically relevant core metrics. These assessed the severity of multiple sensorimotor impairments in a valid, reliable, and informative manner. These metrics provided added clinical value by detecting impairments in neurological subjects that did not show any deficits according to conventional scales, and by covering sensorimotor impairments of the arm and hand with a single assessment. The proposed framework provides a transparent, step-by-step selection procedure based on clinically relevant evidence. This creates an interesting alternative to established selection algorithms that optimize mathematical loss functions and are not always intuitive to retrace. This could help address the insufficient clinical integration of digital health metrics. For the VPIT, it allowed establishing validated core metrics, paving the way for their integration into neurorehabilitation trials.
Keywords: Diagnostic markers; Multiple sclerosis; Neurological disorders; Predictive markers; Prognostic markers
Year: 2020 PMID: 32529042 PMCID: PMC7260375 DOI: 10.1038/s41746-020-0286-7
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Fig. 1 Overview of the metric selection framework and the Virtual Peg Insertion Test (VPIT).
a The framework allows selecting a core set of validated digital health metrics through a transparent step-by-step selection procedure. Model quality criteria C1 and C2; ROC receiver operating characteristics, AUC area under curve, ICC intra-class correlation, SRD% smallest real difference; η strength of learning effects. b The framework was applied to data recorded with the VPIT, a sensor-based upper limb sensorimotor assessment requiring the coordination of arm and hand movements as well as grip forces.
Results for the data-driven selection of kinematic metrics.
| Movement characteristic | Sensor-based metric | Validity: AUC | Reliability: ICC | Error: SRD% | Learning: η |
|---|---|---|---|---|---|
| Mov. smoothness TP | Jerk TP | 0.80 | 0.69 | 23.10 | −4.41 |
| | Log jerk TPᵃ | 0.78 | 0.74 | 26.11 | −4.82 |
| | SPARC TPᵇ | 0.84 | 0.83 | 23.78 | −7.16 |
| | Num. of velocity peaks TPᵇ | 0.82 | 0.79 | 21.30 | −6.36 |
| | Distance to max. velocity TPᵇ | 0.44 | 0.74 | 33.64 | 2.42 |
| | Time to max. velocity TPᵇ | 0.45 | 0.78 | 28.70 | 3.93 |
| Mov. smoothness RT | Jerk RT | 0.84 | 0.68 | 20.83 | −4.70 |
| | Log jerk RTᵃ | 0.73 | 0.75 | 25.33 | −6.08 |
| | SPARC RTᵃ | 0.71 | 0.76 | 28.93 | −1.57 |
| | Num. velocity peaks RTᵃ,ᵇ | 0.76 | 0.70 | 23.27 | −3.28 |
| | Distance to max. velocity RT | 0.43 | 0.65 | 41.39 | 3.67 |
| | Time to max. velocity RT | 0.48 | 0.73 | 33.99 | 2.43 |
| Mov. efficiency TP | Path length ratio TPᵃ | 0.89 | 0.76 | 24.24 | −2.17 |
| | Throughput TPᵇ | 0.92 | 0.81 | 24.07 | −12.18 |
| Mov. efficiency RT | Path length ratio RTᵃ | 0.83 | 0.79 | 17.30 | −3.61 |
| | Throughput RT | 0.90 | 0.78 | 27.43 | −13.21 |
| Mov. curvature TP | Trajectory error mean TP | 0.55 | 0.86 | 17.14 | −0.60 |
| | Trajectory error max. TP | 0.57 | 0.86 | 15.84 | −0.37 |
| | Initial mov. angle TP | 0.67 | 0.90 | 13.56 | −1.50 |
| | Initial mov. angle TP | 0.67 | 0.90 | 13.29 | −1.52 |
| | Initial mov. angle TP | 0.61 | 0.88 | 14.37 | −2.06 |
| Mov. curvature RT | Trajectory error mean RT | 0.56 | 0.84 | 20.00 | 1.24 |
| | Trajectory error max. RT | 0.55 | 0.84 | 18.58 | 1.22 |
| | Initial mov. angle RT | 0.51 | 0.75 | 33.90 | 3.18 |
| | Initial mov. angle RT | 0.51 | 0.71 | 28.65 | 2.92 |
| | Initial mov. angle RT | 0.60 | 0.79 | 23.99 | 1.53 |
| Mov. speed TP | Velocity mean TP | 0.83 | 0.88 | 20.61 | −9.99 |
| | Velocity max. TP | 0.83 | 0.87 | 18.57 | −9.14 |
| Mov. speed RT | Velocity mean RT | 0.75 | 0.87 | 19.01 | −7.60 |
| | Velocity max. RTᵃ | 0.76 | 0.86 | 19.41 | −6.27 |
| Endpoint error peg approach | Position error peg approach | 0.86 | 0.64 | 29.54 | −4.66 |
| | Jerk peg approachᵃ | 0.74 | 0.72 | 27.65 | −2.94 |
| | Log jerk peg approach | 0.69 | 0.75 | 30.20 | −8.36 |
| | SPARC peg approach | 0.78 | 0.64 | 46.55 | −10.29 |
| Endpoint error hole approach | Position error hole approach | 0.94 | 0.76 | 31.29 | −5.36 |
| | Jerk hole approach | 0.57 | 0.68 | 30.63 | −4.84 |
| | Log jerk hole approach | 0.66 | 0.83 | 23.25 | −6.53 |
| | SPARC hole approachᵃ | 0.86 | 0.81 | 24.81 | −5.72 |
| Haptic collisions TP | Haptic collisions mean TP | 0.61 | 0.85 | 24.55 | −3.99 |
| | Haptic collisions max. TP | 0.63 | 0.84 | 20.54 | −1.08 |
| Haptic collisions RT | Haptic collisions mean RT | 0.61 | 0.72 | 25.32 | −0.07 |
| | Haptic collisions max. RTᵇ | 0.46 | 0.79 | 27.02 | 4.37 |
| Number of movements | Number of mov. onsets | 0.22 | 0.22 | 61.34 | −0.82 |
| | Number of mov. ends | 0.09 | 0.29 | 57.01 | 0.00 |
| Object drops | Number of dropped pegs | 0.65 | 0.50 | 41.11 | −3.20 |
The area under the curve (AUC, optimum at 1), intraclass correlation coefficient (ICC, optimum at 1), the smallest real difference (SRD%, optimum at 0), and η value (optimum at 0, worst at −∞) were used to describe discriminative validity, test–retest reliability, measurement error, and learning effects, respectively.
mov movement, TP transport, RT return, SPARC spectral arc length, num number.
ᵃ Metric fulfilled all evaluation criteria (AUC > 0.7, ICC > 0.7, SRD% < 30.3, η > −6.35).
ᵇ Insufficient model quality according to selection step 1.
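As an illustration of the discriminant-validity measure reported above, the AUC can be read as the probability that a randomly chosen affected subject has a higher metric value than a randomly chosen neurologically intact subject. The following is a minimal sketch under that convention only; the function name and the example values are hypothetical and are not data from the study.

```python
def auc(intact, affected):
    """Pairwise AUC: share of (affected, intact) pairs in which the
    affected value is larger; ties count as half a win."""
    wins = 0.0
    for a in affected:
        for i in intact:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(affected) * len(intact))


# Hypothetical metric values (not study data).
intact = [1.0, 1.2, 0.9, 1.1]
affected = [1.5, 1.3, 1.0, 2.0]

print(auc(intact, affected))  # 13.5 winning pairs out of 16 -> 0.84375
```

Under this convention, a value of 0.5 corresponds to no separation between the groups, and a value below 0.5 means that affected subjects tend to have lower values than intact subjects.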
Results for the data-driven selection of kinetic metrics.
| Movement characteristic | Sensor-based metric | Validity: AUC | Reliability: ICC | Error: SRD% | Learning: η |
|---|---|---|---|---|---|
| GF scaling TP | GF mean TP | 0.40 | 0.84 | 14.46 | 0.39 |
| | GF max. TP | 0.40 | 0.86 | 15.19 | 0.07 |
| | GF rate mean TP | 0.25 | 0.87 | 12.14 | 2.07 |
| | GF rate max. TP | 0.25 | 0.79 | 20.53 | 3.93 |
| GF scaling RT | GF mean RT | 0.49 | 0.76 | 27.62 | 0.17 |
| | GF max. RT | 0.45 | 0.66 | 37.61 | 2.80 |
| | GF rate mean RT | 0.07 | 0.82 | 27.79 | 5.87 |
| | GF rate max. RT | 0.29 | 0.48 | 34.05 | 7.19 |
| GF scaling peg approach | GF mean peg approach | 0.45 | 0.83 | 18.09 | 1.10 |
| | GF max. peg approach | 0.39 | 0.84 | 19.40 | −0.72 |
| | GF rate mean peg approach | 0.18 | 0.88 | 14.76 | 3.54 |
| | GF rate max. peg approach | 0.32 | 0.84 | 19.52 | 0.74 |
| GF scaling hole approach | GF mean hole approach | 0.36 | 0.81 | 15.34 | 0.76 |
| | GF max. hole approach | 0.37 | 0.82 | 16.43 | 0.50 |
| | GF rate mean hole approach | 0.15 | 0.82 | 14.18 | 2.73 |
| | GF rate max. hole approach | 0.28 | 0.77 | 21.41 | 1.82 |
| GF coord. TP | GF rate num. peaks TPᵃ | 0.74 | 0.81 | 20.59 | −6.11 |
| | GF rate SPARC TPᵃ | 0.74 | 0.82 | 22.48 | −5.71 |
| GF coord. RT | GF rate num. peaks RT | 0.60 | 0.83 | 20.17 | −4.16 |
| | GF rate SPARC RT | 0.64 | 0.78 | 23.81 | −6.35 |
| GF coord. peg approach | GF rate num. peaks peg approach | 0.90 | 0.78 | 25.60 | −12.25 |
| | GF rate SPARC peg approach | 0.90 | 0.83 | 22.99 | −8.19 |
| GF coord. hole approach | GF rate num. peaks hole approachᵃ | 0.91 | 0.81 | 24.29 | −6.14 |
| | GF rate SPARC hole approachᵃ | 0.84 | 0.82 | 26.38 | −5.94 |
| GF coord. buildup | GF rate num. peaks buildupᵇ | 0.15 | 0.44 | 57.70 | 0.77 |
| | GF rate SPARC buildupᵇ | 0.56 | 0.79 | 28.62 | −3.22 |
| | GF buildup duration | 0.70 | 0.82 | 21.36 | −6.97 |
| GF coord. release | GF rate num. peaks releaseᵇ | 0.44 | 0.48 | 56.80 | 1.78 |
| | GF rate SPARC release | 0.91 | 0.86 | 18.63 | −6.78 |
| | GF release duration | 0.67 | 0.81 | 21.63 | −2.78 |
| Overall disability | Task completion time | 0.91 | 0.78 | 26.16 | −11.34 |
| – | Simulated Gaussian noiseᵇ | 0.37 | −0.07 | 117.04 | 0.25 |
The area under the curve (AUC, optimum at 1), intraclass correlation coefficient (ICC, optimum at 1), the smallest real difference (SRD%, optimum at 0), and η value (optimum at 0, worst at −∞) were used to describe discriminative validity, test–retest reliability, measurement error, and learning effects, respectively. The task completion time and the simulated Gaussian noise metrics were evaluated in addition to the kinetic metrics.
GF grip force, TP transport, RT return, SPARC spectral arc length, num number.
ᵃ Metric fulfilled all evaluation criteria (AUC > 0.7, ICC > 0.7, SRD% < 30.3, η > −6.35).
ᵇ Insufficient model quality according to selection step 1.
Demographics and clinical characteristics of the study population.
| Characteristics | Unit | Neurologically intact | Stroke | Multiple sclerosis | ARSACS |
|---|---|---|---|---|---|
| Number of subjects | n | 120 | 53 | 28 | 8 |
| Age | years | 51.1 [34.6, 65.6] | 59.0 [52.0, 69.0] | 54.5 [39.0, 63.0] | 37.0 [30.0, 48.5] |
| Gender | m/f | 60/60 | 37/16 | 12/16 | 4/4 |
| FMA-UE | 0–66 | – | 57 [49, 65] | – | |
| ARAT | 0–57 | – | – | 52.0 [46.5, 56.0] | – |
| NHPT | s | – | – | 43.5 [33.1, 58.7] |
Values reported as median [25th, 75th percentile].
ARSACS autosomal recessive spastic ataxia of Charlevoix-Saguenay, FMA-UE Fugl-Meyer assessment for the upper extremity, ARAT action research arm test, NHPT nine hole peg test.
Fig. 2 Data-driven selection and validation of metrics: example of task completion time.
a The influence of age, sex, tested body side, handedness, and stereo vision deficits on each digital health metric was removed using data from neurologically intact subjects and mixed effect models (model quality criteria C1 and C2). Models were fitted in a Box–Cox-transformed space and back-transformed for visualization. Metrics with low model quality (C1 > 15% or C2 > 25%) were removed. b The ability of a metric to discriminate between neurologically intact and affected subjects (discriminant validity) was evaluated using the area under the curve value (AUC). Metrics with AUC < 0.7 were removed. c Test–retest reliability was evaluated using the intra-class correlation coefficient (ICC), indicating the ability of a metric to discriminate between subjects across testing days. Metrics with ICC < 0.7 were removed. Additionally, metrics with strong learning effects (η < −6.35) were removed. The long horizontal red line indicates the median, whereas the short ones represent the 25th and 75th percentiles. d Measurement error was defined using the smallest real difference (SRD%), indicating a range of values for which the assessment cannot discriminate between measurement error and physiological changes. The distribution of the intra-subject variability was visualized, as it strongly influences the SRD. Metrics with SRD% > 30.3 were removed.
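The threshold-based steps b to d can be sketched as a simple filter. This is a minimal illustration only: it assumes that more negative η values indicate stronger learning effects (optimum at 0, worst at −∞, as stated in the table notes), treats values exactly at a threshold as passing (an assumption), and omits the model-quality step (C1, C2). The class and function names are hypothetical; the four example rows are taken from the kinematic results table.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    name: str
    auc: float      # discriminant validity
    icc: float      # test-retest reliability
    srd_pct: float  # measurement error (SRD%)
    eta: float      # learning effect


# Thresholds as given in the caption and table notes.
AUC_MIN, ICC_MIN, SRD_MAX, ETA_MIN = 0.7, 0.7, 30.3, -6.35


def passes_selection(m: MetricResult) -> bool:
    """Keep a metric only if it clears all four thresholds."""
    return (m.auc >= AUC_MIN and m.icc >= ICC_MIN
            and m.srd_pct <= SRD_MAX and m.eta >= ETA_MIN)


results = [
    MetricResult("Log jerk TP", 0.78, 0.74, 26.11, -4.82),
    MetricResult("Jerk TP", 0.80, 0.69, 23.10, -4.41),            # ICC too low
    MetricResult("Velocity mean RT", 0.75, 0.87, 19.01, -7.60),   # learning too strong
    MetricResult("Position error hole approach", 0.94, 0.76, 31.29, -5.36),  # SRD% too high
]

selected = [m.name for m in results if passes_selection(m)]
print(selected)  # ['Log jerk TP']
```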
Fig. 3 Partial correlation analysis.
The objective was to remove redundant information. Therefore, partial Spearman correlations were calculated between all combinations of metrics while controlling for the potential influence of all other metrics. Pairs of metrics were considered for removal if the correlation was equal to or above 0.5. The process was done in an iterative manner, and the first (a) and the last (b) iterations are presented.
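To illustrate the idea of a partial Spearman correlation, the sketch below handles the simplest case of two metrics and a single controlling metric, using the standard first-order formula. This is a simplification: the analysis described above controls for all other metrics at once. Function names and data are hypothetical, and comparing the absolute value against 0.5 is an assumption.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            result[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Correlation of x and y after removing the rank-linear influence of z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical values of three metrics for six subjects (not study data).
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
z = [3, 1, 2, 6, 4, 5]

p = partial_spearman(x, y, z)
redundant = abs(p) >= 0.5  # pair would be considered for removal
print(round(p, 3), redundant)
```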
Structural validity: exploratory factor analysis.
| Expected interpretation | Sensor-based metric | F1 | F2 | F3 | F4 | F5 |
|---|---|---|---|---|---|---|
| Movement smoothness transport | Log jerk transport | 0.09 | 0.73ᵃ | 0.21 | −0.19 | −0.05 |
| Movement smoothness return | Log jerk return | −0.08 | 0.86ᵃ | −0.11 | 0.02 | 0.02 |
| | SPARC return | 0.10 | 0.59ᵃ | −0.10 | 0.23 | −0.03 |
| Movement efficiency transport | Path length ratio transport | 0.83ᵃ | 0.08 | −0.17 | 0.06 | 0.11 |
| Movement efficiency return | Path length ratio return | 0.79ᵃ | −0.06 | 0.08 | −0.14 | 0.04 |
| Movement speed return | Velocity max. return | −0.02 | 0.01 | 0.16 | 0.90ᵃ | 0.01 |
| Endpoint error peg approach | Jerk peg approach | 0.72ᵃ | −0.04 | 0.12 | 0.07 | −0.14 |
| GF coord. transport | GF num. peaks transport | 0.00 | −0.06 | 0.93ᵃ | 0.11 | −0.03 |
| | GF rate SPARC transport | −0.08 | 0.19 | 0.62ᵃ | 0.00 | 0.11 |
| GF coord. hole approach | GF rate SPARC hole approach | 0.11 | −0.02 | 0.02 | 0.01 | 0.94ᵃ |
Loadings of metrics on underlying latent factors extracted with exploratory factor analysis. The interpretation of each metric was physiologically motivated initially. Larger absolute loadings indicate a stronger contribution to a factor.
F1–5 data-driven latent factors, GF grip force, coord coordination, num number, SPARC spectral arc length.
ᵃ Indicates strong loadings (i.e., absolute loading of at least 0.5).
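The strong-loading rule can be sketched as follows: each metric is linked to every factor on which its absolute loading is at least 0.5. The loadings are copied from five rows of the table above; the variable and function names are hypothetical, and the sketch only reads the reported loadings, it does not perform the factor analysis itself.

```python
# Loadings on F1..F5, copied from the factor analysis table.
loadings = {
    "Log jerk transport":          [0.09, 0.73, 0.21, -0.19, -0.05],
    "Path length ratio transport": [0.83, 0.08, -0.17, 0.06, 0.11],
    "Velocity max. return":        [-0.02, 0.01, 0.16, 0.90, 0.01],
    "GF num. peaks transport":     [0.00, -0.06, 0.93, 0.11, -0.03],
    "GF rate SPARC hole approach": [0.11, -0.02, 0.02, 0.01, 0.94],
}

STRONG_LOADING = 0.5  # threshold stated in the table note


def strong_factors(row):
    """Return the labels of all factors with |loading| >= threshold."""
    return [f"F{k + 1}" for k, value in enumerate(row)
            if abs(value) >= STRONG_LOADING]


assignment = {name: strong_factors(row) for name, row in loadings.items()}
for name, factors in assignment.items():
    print(f"{name}: {factors}")
```

In these rows every metric loads strongly on exactly one factor, which is what allows each factor to be given a single interpretation.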
Fig. 4 Sensitivity of metrics to disability severity in stroke subjects.
Subjects were grouped according to the clinical disability level. The vertical axis indicates task performance based on the distance to the reference population. The population median is visualized through the black horizontal line, the interquartile range (IQR) through the boxes, and the min and max value within 1.5 IQR of the lower and upper quartiles, respectively, through the whiskers. Data points above the 95th percentile of neurologically intact subjects (triangles) show abnormal behavior (black dots). Solid and dashed horizontal black lines above the box plots indicate results of the omnibus and post-hoc statistical tests, respectively. * indicates p < 0.05 and ** p < 0.001. n refers to the number of subjects in that group and N to the number of data points. Only subjects with available clinical scores were included. For the jerk peg approach, one outlier was not visualized to maintain a meaningful representation. FMA-UE Fugl-Meyer upper extremity, SPARC spectral arc length.
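The abnormality criterion in this caption can be sketched as a cutoff at the 95th percentile of the reference population. The sketch assumes a nearest-rank percentile definition, which may differ from the one used for the figure; function names and values are hypothetical and are not study data.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the data lie at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def flag_abnormal(reference, observed, pct=95):
    """True for every observed value above the reference percentile."""
    cutoff = percentile_nearest_rank(reference, pct)
    return [value > cutoff for value in observed]


# Hypothetical reference population (20 intact subjects) and three patients.
reference = [float(v) for v in range(1, 21)]
observed = [18.0, 19.5, 25.0]

flags = flag_abnormal(reference, observed)
print(flags)  # cutoff is 19.0 -> [False, True, True]
```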
Fig. 5 Sensitivity of metrics to disability severity in MS subjects.
See Fig. 4 for a detailed description. ARAT action research arm test.
Fig. 6 Sensitivity of metrics to disability severity in ARSACS subjects.
See Fig. 4 for a detailed description.