Rachid Riad1,2,3,4,5,6,7, Marine Lunven1,2,3,4, Hadrien Titeux5,6,7, Xuan-Nga Cao5,6,7, Jennifer Hamet Bagnou1,2,3,4, Laurie Lemoine1,2,3,4, Justine Montillot1,2,3,4, Agnes Sliwinski1,2,3,4, Katia Youssov3,4, Laurent Cleret de Langavant1,2,3,4, Emmanuel Dupoux5,6,7, Anne-Catherine Bachoud-Lévi8,9,10,11.
Abstract
OBJECTIVES: Using brief samples of speech recordings, we aimed to predict, through machine learning, clinical performance in Huntington's disease (HD), an inherited neurodegenerative disease (NDD).
Keywords: Huntington’s disease; Machine learning; Speech
Year: 2022 PMID: 35567614 PMCID: PMC9363375 DOI: 10.1007/s00415-022-11148-1
Source DB: PubMed Journal: J Neurol ISSN: 0340-5354 Impact factor: 6.682
Fig. 1 Extraction of individual clinical scores from the speech samples. (Top panel) Examples of portions of the speech signal, with the various types of vocalizations and their segmentation. The same speech features were extracted separately from the forward and backward counting tasks, yielding 60 features (30 × 2). (Bottom panel) Illustration of the method development, the machine learning training, and the evaluation of the predictions of the clinical scores. N CAG number of CAG repeats on the Huntingtin gene, DBS Disease Burden Score, TFC Total Functional Capacity, TMS Total Motor Score, SDMT Symbol Digit Modality Test, UHDRS IS UHDRS Independence Scale, MAE Mean Absolute Error, ICC Intraclass Correlation Coefficient, cUHDRS composite UHDRS
Demographics and clinical performance of the participants in the cohorts under study at baseline
| | MIG-HD | BIOHD/REPAIRHD | Total |
|---|---|---|---|
| Number of participants | 36 | 67 | 103 |
| Premanifest/manifest | 0/36 | 16/51 | 16/87 |
| Number of visits per patient | 1.4 (0.5) [1–2] | 1.1 (0.3) [1–2] | 1.2 (0.4) [1–2] |
| Gender | 23F/13M | 40F/27M | 63F/40M |
| Age at first visit | 47.0 (9.1) [28–68] | 52.7 (11.8) [27–88] | 50.7 (11.2) [27–88] |
| Laterality | 30R/5L/1A | 59R/8L/0A | 89R/13L/1A |
| Number of CAG repeats | 45.3 (4.4) [37–60] | 43.5 (3.1) [39–55] | 44.0 (3.6) [37–60] |
| cUHDRS mean (SD) [range] | 9.1 (2.5) [5.2–15.0] | 11.1 (4.6) [2.5–18.8] | 10.4 (4.0) [2.5–18.8] |
| Total motor score mean (SD) [range] | 35.0 (13.6) [7–63] | 26.7 (20.3) [0–60] | 29.6 (18.6) [0–63] |
| TFC mean (SD) [range] | 10.4 (1.7) [6–13] | 11.0 (2.2) [5–13] | 10.8 (2.0) [5–13] |
| UHDRS independence scale mean (SD) [range] | 85.7 (8.5) [70–100] | 88.9 (12.9) [60–100] | 87.8 (11.8) [60–100] |
| Verbal fluency 1 min mean (SD) [range] | 28.2 (8.5) [9–45] | 27.6 (13.3) [9–62] | 27.8 (11.8) [9–62] |
| Symbol digit modality test mean (SD) [range] | 24.8 (7.6) [11–42] | 31.9 (15.2) [3–67] | 29.4 (13.4) [3–67] |
| Stroop word mean (SD) [range] | 61.9 (15.0) [39–99] | 70.7 (24.7) [23–117] | 67.6 (22.1) [23–117] |
| Stroop color mean (SD) [range] | 46.6 (11.9) [24–76] | 52.3 (18.5) [16–89] | 50.3 (16.7) [16–89] |
| Stroop interference mean (SD) [range] | 26.7 (8.8) [11–45] | 29.9 (12.8) [7–58] | 28.8 (11.6) [7–58] |
Values are mean (standard deviation) [range]
F Female, M Male, R Right, L Left, A Ambidextrous, TFC Total Functional Capacity
List of speech and language features extracted from the recitation of numbers
| Dimension | Speech/language feature |
|---|---|
| Articulatory and phonatory deficiencies | Total number of pronunciation errors |
| | Ratio of pronunciation errors |
| | Pronunciation errors per second |
| | Mean intelligibility based on non-intrusive normed speech-to-reverberation modulation energy ratio metric [ |
| | SD of the fundamental frequency |
| | Range of the fundamental frequency |
| | SD of normalized intensity of vocalizations |
| | Normalized range of intensity of vocalizations |
| Rhythm and temporal statistics | Task duration |
| | Temporal rate of the pronounced numbers |
| | Mean duration of pronounced numbers |
| | Pronounced numbers per second |
| | SD of the duration of pronounced numbers |
| | Phones per second |
| | Temporal rate of the silences |
| | Mean duration of silences |
| | SD of the duration of silences |
| | Total number of silences |
| Sequence errors and perseverations | Levenshtein distance between the pronounced numbers and the target sequence (1, 2, …, 19, 20) |
| | Gestalt similarity between the pronounced numbers and the target sequence (1, 2, …, 19, 20) |
| | Levenshtein distance between the pronounced phones and the target sequence (phones of 1, phones of 2, …, phones of 19, phones of 20) |
| | Gestalt similarity between the pronounced phones and the target sequence (phones of 1, phones of 2, …, phones of 19, phones of 20) |
| | Total number of pronounced numbers |
| | Total number of pronounced phones |
| Collateral track additions | Total number of involuntary/abnormal vocalizations |
| | Involuntary/abnormal vocalizations per second |
| | Temporal rate of the involuntary/abnormal vocalizations |
| | Total number of filled pauses |
| | Filled pauses per second |
| | Temporal rate of the filled pauses |
SD stands for standard deviation. Temporal rate is defined as the ratio of the total time spent in a specific class to the total time taken to perform the task
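As a rough illustration of three of the features listed above, the following minimal sketch computes a Levenshtein distance, a Gestalt similarity, and a temporal rate for a counting task. It is not the authors' implementation: the function names, the example utterance, and the silence segments are hypothetical, and the Gestalt similarity is assumed to correspond to the Ratcliff/Obershelp-style matching provided by Python's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def gestalt_similarity(a, b):
    """Similarity in [0, 1]; assumed here to be difflib's matching ratio."""
    return SequenceMatcher(None, a, b).ratio()


def temporal_rate(segments, task_duration):
    """Total time spent in one class divided by the total task duration.

    segments: list of (start, end) times in seconds for that class.
    """
    return sum(end - start for start, end in segments) / task_duration


# Target sequence of the forward counting task.
target = list(range(1, 21))

# Hypothetical utterance: the speaker skips 7 and repeats 12.
spoken = [n for n in target if n != 7]
spoken.insert(spoken.index(12) + 1, 12)

distance = levenshtein(spoken, target)        # one deletion + one insertion
similarity = gestalt_similarity(spoken, target)

# Hypothetical silences in a 10-second recording.
silence_rate = temporal_rate([(0.0, 1.5), (4.0, 5.0)], task_duration=10.0)
```

The same functions apply unchanged to phone sequences, since they only compare list elements for equality.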
Fig. 5 Coefficient importance of the different speech features for the predictions of the clinical scores. Each line represents a feature of Table 2, ranked in the order in which it appears in Table 2. These mean weights are obtained with a linear Elastic Net model for interpretability. The weights are z-scored per clinical score to be on the same scale. The weights for the clinical scores are reversed, so that a higher feature weight can be interpreted as a higher clinical impairment
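The rescaling described in this caption can be sketched as follows. This is an illustrative assumption rather than the authors' code: the weights are hypothetical, the population standard deviation is an arbitrary choice, and the `higher_is_better` flag is a made-up way to express which scores have their sign reversed.

```python
from statistics import mean, pstdev


def zscore(weights):
    """Center the weights on 0 and scale them to unit standard deviation."""
    mu = mean(weights)
    sigma = pstdev(weights)
    if sigma == 0:
        raise ValueError("z-score is undefined when all weights are equal")
    return [(w - mu) / sigma for w in weights]


def comparable_weights(weights, higher_is_better):
    """z-score the weights of one clinical score; flip the sign when a higher
    score means better performance, so that larger weights always point
    toward greater impairment (assumed convention)."""
    z = zscore(weights)
    return [-v for v in z] if higher_is_better else z


# Hypothetical mean weights of three speech features for one clinical score.
raw = [0.8, -0.2, 0.1]
scaled = comparable_weights(raw, higher_is_better=True)
```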
Fig. 2 Illustration of individual predictions of the cUHDRS (left) and the TMS (right) based on the speech features. Each blue dot represents the difference between the predicted and the observed score for one assessment of one individual in the test set. The red dashed line is the line ‘y = x’. The black line shows the contribution of a single point (its individual absolute error) to the Mean Absolute Error (MAE)
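For readers unfamiliar with the metric, a minimal sketch of how individual absolute errors combine into the MAE is given below. The predicted and observed scores are hypothetical values, not data from the study.

```python
def absolute_errors(predicted, observed):
    """Per-assessment absolute error |predicted - observed|."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return [abs(p - o) for p, o in zip(predicted, observed)]


def mean_absolute_error(predicted, observed):
    """Average of the individual absolute errors; 0 means perfect prediction."""
    errors = absolute_errors(predicted, observed)
    return sum(errors) / len(errors)


# Hypothetical scores for three assessments.
predicted = [10.0, 8.5, 12.0]
observed = [9.0, 9.5, 12.0]

errors = absolute_errors(predicted, observed)
mae = mean_absolute_error(predicted, observed)
```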
Fig. 3 Boxplots of the mean absolute error (MAE) on the test set for the repeated learning-testing experiment. An MAE of zero means that the predicted value equals the observed one. Horizontal lines are the medians, boxes span the lower and upper quartiles, and whiskers extend to 1.5 × IQR (interquartile range). The first row displays the predicted cUHDRS, functional, and motor scores, whereas the second row displays the predicted cognitive scores. Statistical significance was assessed with the Wilcoxon test and was Bonferroni-corrected
Fig. 4 Boxplots of the intraclass correlation coefficients (ICC) on the test set for the repeated learning-testing experiment. An ICC of 1 means that the predicted value equals the observed one. Horizontal lines are the medians, boxes span the lower and upper quartiles, and whiskers extend to 1.5 × IQR (interquartile range). The first row displays the predicted cUHDRS, functional, and motor scores, whereas the second row displays the predicted cognitive scores. Statistical significance was assessed with the Wilcoxon test and was Bonferroni-corrected. The dashed lines indicate the ICCs obtained between neurologists for the clinical scores, namely: (1) cUHDRS, ICC = 0.92 [49]; (2) TMS, ICC = 0.847 [3]; (3) TFC, ICC = 0.938; and (4) UHDRS IS, ICC = 0.842 [4]. The ICC cannot be computed for the Mean Cohort Performance, as its standard deviation is zero
Summary of the speech and clinical variables with significant correlation with the Normalized Volume of the Striatum
| Variable | P value | Strength of relationship | Pearson | Spearman |
|---|---|---|---|---|
| Speech | | | | |
| Mean duration of the silences during backward recitation | 0.0024 | 0.57 | −0.35 | −0.56 |
| Standard deviation of the duration of the silences during backward recitation | 0.026 | 0.49 | −0.41 | −0.60 |
| Clinical variables | | | | |
| cUHDRS | 0.0050 | 0.40 | 0.65 | 0.68 |
| UHDRS total motor score | 0.0090 | 0.38 | 0.52 | 0.57 |
| Stroop word | 0.021 | 0.38 | 0.61 | 0.64 |
| Symbol digit modality test | 0.030 | 0.36 | −0.63 | −0.63 |
| UHDRS independence scale | 0.040 | 0.33 | 0.58 | 0.57 |
The comparison of the P values [46], the Pearson coefficient (a measure of linear relationship), the Spearman rank correlation coefficient, and the measure of strength of the relationship shows that the mean duration of silences and the standard deviation of the duration of silences are as well correlated with the striatal volume as the regular clinical scores. Multiple-comparison correction is done with the maximum statistic [48]
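To make the distinction between the two correlation coefficients concrete, the sketch below computes both on hypothetical data. It is a simplified illustration only: it does not reproduce the P values or the maximum-statistic correction, and the variable names and numbers are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson coefficient: strength of the linear relationship between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson coefficient computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical monotonic but non-linear relationship.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
volume = [1.0, 4.0, 9.0, 16.0, 25.0]

r = pearson(feature, volume)      # below 1: the relationship is not linear
rho = spearman(feature, volume)   # equal to 1: the ordering is fully preserved
```

A gap between the two coefficients, as in some rows of the table, can therefore indicate a monotonic relationship that is not strictly linear.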