L Giancardo1, A Sánchez-Ferro1,2,3,4,5, T Arroyo-Gallego1,6, I Butterworth1, C S Mendoza1, P Montero7, M Matarazzo2,3,4,5, J A Obeso2,3,4, M L Gray1,8, R San José Estépar9.
Abstract
Parkinson's disease (PD) is a slowly progressing neurodegenerative disease with early manifestation of motor signs. Objective measurements of motor signs are of vital importance for diagnosing, monitoring and developing disease-modifying therapies, particularly for the early stages of the disease when putative neuroprotective treatments could stop neurodegeneration. Current medical practice has limited tools to routinely monitor PD motor signs with enough frequency and without undue burden for patients and the healthcare system. In this paper, we present data indicating that the routine interaction with computer keyboards can be used to detect motor signs in the early stages of PD. We explore a solution that measures the key hold times (the time required to press and release a key) during the normal use of a computer without any change in hardware and converts them to a PD motor index. This is achieved by the automatic discovery of patterns in the time series of key hold times using an ensemble regression algorithm. This new approach discriminated early PD groups from controls with an AUC = 0.81 (n = 42/43; mean age = 59.0/60.1; women = 43%/60%; PD/controls). The performance was comparable to or better than two other quantitative motor performance tests used clinically: alternating finger tapping (AUC = 0.75) and single key tapping (AUC = 0.61).
Year: 2016 PMID: 27703257 PMCID: PMC5050498 DOI: 10.1038/srep34468
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Pipeline of the algorithm to generate the neuroQWERTY score (nQi) from the hold time (HT) series.
(1) The HT events are split into non-overlapping 90-second windows to create the B sets. (2) From each independent B set, a 7-element feature vector, x, is computed: 3 features that represent HT variance and 4 features that represent a histogram of HT values. Any B sets with fewer than 30 HT values were ignored. (3) For each feature vector, x, a single numerical score, nQi, is generated using an ensemble regression approach. Each unit in the ensemble regression includes a linear Support Vector Regression step trained on the Unified Parkinson’s Disease Rating Scale part III (UPDRS-III), the clinical score for evaluating PD motor symptoms. The Support Vector parameter estimation was done using a separate data set. A cross validation strategy with two data sets (de novo PD and early PD) was employed (see Fig. 4). (4) For the analyses herein, an average nQi score was computed for each subject tested. More information can be found in the Methods.
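Steps (1) and (2) of the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the caption does not name the three variance features or the histogram bin edges, so the spread descriptors (variance, interquartile range, outlier fraction) and the `edges` values below are assumptions, as are the function names.

```python
from statistics import pvariance, quantiles

WINDOW_S = 90.0   # window length stated in the caption
MIN_EVENTS = 30   # windows with fewer hold times are ignored


def split_windows(events, window_s=WINDOW_S, min_events=MIN_EVENTS):
    """events: iterable of (press_time_s, hold_time_s) pairs.

    Returns one list of hold times per non-overlapping window,
    dropping windows that hold fewer than `min_events` values.
    """
    windows = {}
    for press_t, ht in events:
        windows.setdefault(int(press_t // window_s), []).append(ht)
    return [windows[k] for k in sorted(windows) if len(windows[k]) >= min_events]


def feature_vector(hts, edges=(0.0, 0.125, 0.25, 0.375, 0.5)):
    """Return 7 features: 3 spread descriptors + 4 normalized histogram bins.

    Both the choice of spread descriptors and the bin edges (seconds)
    are hypothetical placeholders.
    """
    q1, _, q3 = quantiles(hts, n=4, method="inclusive")
    iqr = q3 - q1
    outlier_fraction = sum(
        1 for h in hts if h < q1 - 1.5 * iqr or h > q3 + 1.5 * iqr
    ) / len(hts)
    spread = [pvariance(hts), iqr, outlier_fraction]

    hist = [0, 0, 0, 0]
    for h in hts:
        for i in range(4):
            if edges[i] <= h < edges[i + 1]:
                hist[i] += 1
                break
    return spread + [count / len(hts) for count in hist]
```

Each vector returned by `feature_vector` would then be passed to the regression step (3), which is not reproduced here.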
Figure 2. Discriminative performance of nQi.
For each subject, an average nQi score was computed (as illustrated in Fig. 1) from the hold time series measured during the typing task. Box plots show the first and third quartiles and the median; the ends of the whiskers represent the lowest (or highest) value still within 1.5 times the interquartile range. (a) Group level comparison between PD and controls on the combined dataset of all 43 controls and 42 PD subjects. The control group is significantly different from the PD group (p = 0.001). (b) Group level comparison between controls, de-novo PD subjects (recently diagnosed with PD and never having taken PD medications; average time since diagnosis 1.6 years) and early PD subjects (average time since diagnosis 3.9 years; on PD medication, but no medication for the 18 hours before the typing test). Both PD sub-groups are significantly different from the control group (de-novo/controls p = 0.022, early PD/controls p = 0.003). The statistical significance of the discriminative performance is computed with a logistic regression model including sex, age, years of education and typing skills as co-variates (see suppl. material Fig. S.3).
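The box plot convention described in the caption can be made concrete with a short sketch. The quartile interpolation method is an assumption (plotting libraries differ slightly), and `box_stats` is a hypothetical helper name.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme observed values inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }
```

Note that the whiskers stop at actual data points, not at the fence values themselves.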
Figure 3. Comparison of receiver operating characteristic (ROC) curves showing the classification performance of nQi (main contribution of this paper), alternating finger tapping and single key tapping on the combined dataset of 42 PD subjects and 43 controls.
The shaded areas represent the 95% confidence intervals. The legend shows the area under the ROC curve (AUC) and its 95% confidence interval (see Table 1 for more details). The nQi score shows the best performance in comparison with alternating finger tapping (p < 0.001) and single key tapping (p < 0.001). Alternating finger tapping and single key tapping are two quantitative measurements commonly used to evaluate motor impairment in PD studies. In our cohort, the former showed better performance than the latter (p = 0.008). The p-values were computed with DeLong’s test for correlated ROC curves, which tests the null hypothesis that the AUCs of two ROC curves are the same.
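For readers less familiar with the AUC, it can be computed directly as the probability that a randomly chosen PD subject receives a higher score than a randomly chosen control. The sketch below illustrates that definition only; it does not implement the confidence intervals or DeLong's test, and the function name is hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive scores higher; ties contribute one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

Because the statistic is normalized by the number of pairs, replicating one group leaves the value unchanged, which is why the AUC tolerates unequal group sizes.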
Summary of statistical tests for performance of nQi (main contribution of this paper), alternating finger tapping, single key tapping and typing speed on the combined dataset of 42 PD subjects and 43 controls.
| | Parkinson’s | Controls | Statistical significance (unadjusted) | Statistical significance (adjusted) |
|---|---|---|---|---|
| Subjects # | 42 | 43 | | |
| Avg. UPDRS-III (std) | 20.6 (7.7) | 1.9 (1.8) | ***(p < 0.001) | N/A |
| Avg. nQi (std) | 0.130 (0.085) | 0.060 (0.057) | ***(p < 0.001) | ***(p = 0.001) |
| Avg. Alternating Finger Tapping (std) | 95.37 (22.01) | 128.37 (28.85) | ***(p < 0.001) | ***(p < 0.001) |
| Avg. Single Key Tapping (std) | 162.88 (24.09) | 170.85 (16.45) | not sig. (p = 0.08) | *(p = 0.035) |
The unadjusted statistical significance is computed with two-sided Mann-Whitney U test.
a. An additional co-variate, typing speed, was added for nQi. nQi and alternating finger tapping are the two tests that show a consistent statistically significant difference between PD subjects and controls. In the adjusted model for nQi, none of the co-variates reached statistical significance (see supplementary materials). For completeness, we also show the UPDRS-III scores. In our datasets, only subjects with confirmed clinical PD or lack thereof were included. Therefore, UPDRS-III, which is based on clinical evaluations, can discriminate PD subjects from controls perfectly.
b. The adjusted significance tests were computed with logistic regression models including sex, age and years of education as co-variates.
nQi discrimination performance with different cut-off points in the combined dataset.
| Misclassification cost (cost per FN/cost per FP) | Estimated cut-off point | Sensitivity | Specificity | Accuracy | TP | FN | TN | FP |
|---|---|---|---|---|---|---|---|---|
| 1/1 | 0.078 | 0.71 | 0.84 | 0.78 | 30 | 12 | 36 | 7 |
| 2/1 | 0.075 | 0.74 | 0.79 | 0.76 | 31 | 11 | 34 | 9 |
| 1/2 | 0.105 | 0.55 | 0.95 | 0.75 | 23 | 19 | 41 | 2 |
TP: true positives, FN: false negatives, TN: true negatives, FP: false positives. The cut-off points were estimated automatically by maximizing the generalized Youden Index (ref. 35) under three different misclassification cost assumptions: the costs of FN and FP are equal, the FN cost is twice the FP cost, and the FP cost is twice the FN cost.
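The sensitivity, specificity and accuracy columns follow directly from the confusion counts, and the cut-off search can be sketched in a few lines. The weighting used in `best_cutoff` is one common formulation of a cost-weighted Youden index and is an assumption here, as is the use of the sample group sizes as prevalence; the source does not give the exact formula. Subjects are classified as PD when their score is at or above the cut-off.

```python
def metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def best_cutoff(pd_scores, control_scores, cost_fn=1.0, cost_fp=1.0):
    """Cut-off maximizing Se + r * Sp - 1.

    r = (n_controls / n_pd) * (cost_fp / cost_fn) is an assumed weighting;
    it is equivalent to minimizing the expected misclassification cost
    in the sample.
    """
    r = (len(control_scores) / len(pd_scores)) * (cost_fp / cost_fn)
    best = None
    for c in sorted(set(pd_scores) | set(control_scores)):
        se = sum(s >= c for s in pd_scores) / len(pd_scores)
        sp = sum(s < c for s in control_scores) / len(control_scores)
        j = se + r * sp - 1
        if best is None or j > best[0]:
            best = (j, c)
    return best[1]
```

For example, `metrics(30, 12, 36, 7)` reproduces the first table row after rounding to two decimals. Raising `cost_fn` pushes the cut-off down (favoring sensitivity), which matches the direction of the 2/1 row relative to the 1/1 row.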
The combined dataset comprises two independent datasets: the De-novo dataset and the Early-PD dataset.
| | Parkinson’s | Controls | Statistical significance |
|---|---|---|---|
| Combined dataset, subjects # | 42 | 43 | |
| Avg. Disease onset, years (std) | 2.58 (1.67) | | |
| Women # (%) | 18 (43%) | 26 (60%) | not sig. (p = 0.11) |
| Men # (%) | 24 (57%) | 17 (40%) | not sig. (p = 0.11) |
| Avg. Age (std) | 59.0 (9.8) | 60.1 (10.2) | not sig. (p = 0.53) |
| Avg. Years of Education (std) | 15.2 (4.1) | 15.3 (5.2) | not sig. (p = 0.98) |
| Avg. Typing Speed (std) | 97.91 (43.48) | 112.3 (58.7) | not sig. (p = 0.35) |
| De-novo dataset, subjects # | 24 | 30 | |
| Avg. Disease onset, years (std) | 1.60 (1.22) | | |
| Women # (%) | 10 (42%) | 16 (53%) | not sig. (p = 0.40) |
| Men # (%) | 14 (58%) | 14 (47%) | not sig. (p = 0.40) |
| Avg. Age (std) | 61.4 (10.5) | 61.8 (10.5) | not sig. (p = 0.68) |
| Avg. Years of Education (std) | 15.5 (3.8) | 14.9 (5.1) | not sig. (p = 0.55) |
| Avg. Typing Speed (std) | 97.2 (42.5) | 110.3 (59.5) | not sig. (p = 0.51) |
| Early-PD dataset, subjects # | 18 | 13 | |
| Avg. Disease onset, years (std) | 3.89 (1.23) | | |
| Women # (%) | 8 (44%) | 10 (77%) | not sig. (p = 0.08) |
| Men # (%) | 10 (56%) | 3 (23%) | not sig. (p = 0.08) |
| Avg. Age (std) | 55.9 (8.0) | 56.1 (8.6) | not sig. (p = 0.95) |
| Avg. Years of Education (std) | 14.83 (4.6) | 16.2 (5.4) | not sig. (p = 0.37) |
| Avg. Typing Speed (std) | 98.9 (45.9) | 117.0 (59.2) | not sig. (p = 0.48) |
The typing speed is computed from the dataset as the average number of keys pressed per minute. With the exception of PD-specific scores (disease onset), the attributes of the control and PD subjects are statistically similar (using the two-sided Mann-Whitney U test), suggesting the populations are reasonably well matched. (Gender may be a confound for the Early-PD dataset taken by itself.)
Cross validation performance for nQi in the Early-PD and De-novo datasets.
| Test dataset | Train dataset | Area Under the ROC curve (AUC) |
|---|---|---|
| Early-PD | De-novo | 0.92 |
| De-novo | Early-PD | 0.77 |
| Combined | | 0.81 |
The nQi scores for the combined dataset are generated by combining the outputs of the cross-validation on the Early-PD and De-novo datasets, without any further training. The AUC scores allow a reliable comparison of performance even when the numbers of PD subjects and controls are not fully balanced, as in the Early-PD and De-novo datasets evaluated independently.
Figure 4. Cross validation strategy.
The nQi scores for the Early-PD dataset are generated by training our ensemble regression model (see Methods) on the De-novo dataset, while the nQi scores for the De-novo dataset are generated by training on the Early-PD dataset. The nQi scores for the combined dataset are generated by combining the previous outputs without any further training.
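The swap between the two datasets described above can be sketched as follows. A one-variable least-squares line stands in for the ensemble regression model purely for illustration; `fit_line`, `predict` and `swap_validate` are hypothetical names, and each dataset is assumed to be a pair of equal-length lists (feature values, UPDRS-III targets).

```python
def fit_line(xs, ys):
    """Ordinary least squares y = a * x + b (placeholder for the real model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def predict(model, xs):
    a, b = model
    return [a * x + b for x in xs]


def swap_validate(de_novo, early_pd):
    """Score each dataset with a model trained only on the other one."""
    model_from_de_novo = fit_line(*de_novo)
    model_from_early_pd = fit_line(*early_pd)
    scores_early_pd = predict(model_from_de_novo, early_pd[0])
    scores_de_novo = predict(model_from_early_pd, de_novo[0])
    # The combined scores are a simple pooling, with no further training.
    combined = scores_de_novo + scores_early_pd
    return scores_de_novo, scores_early_pd, combined
```

No subject's score ever comes from a model that saw that subject during training, which is the property the strategy is designed to guarantee.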