Yasunori Yamada, Kaoru Shinkawa, Keita Shimmei.
Abstract
BACKGROUND: Identifying signs of Alzheimer disease (AD) through longitudinal and passive monitoring techniques has become increasingly important. Previous studies have succeeded in quantifying language dysfunctions and identifying AD from speech data collected during neuropsychological tests. However, whether and how we can quantify language dysfunction in daily conversation remains unexplored.
Keywords: Alzheimer disease; behavioral marker; daily conversation; dementia; monitoring; screening; speech analysis
Year: 2020 PMID: 31934870 PMCID: PMC6996758 DOI: 10.2196/16790
Source DB: PubMed Journal: JMIR Ment Health ISSN: 2368-7959
Figure 1. Atypical repetition in conversational data from a regular monitoring service. A. Overview of the regular monitoring service. The manually transcribed text of the conversations was analyzed in this study. B. Schematic illustrating the paired samples of conversations separated by t days and n phone calls, used to extract repetition features across different conversations. C. Repetition features of seniors with and without AD. Violin plots visualize the distribution of the data and its probability density. Each side of a violin is a kernel density estimate showing the shape of the distribution: wider portions indicate higher density and narrower portions indicate lower density. The grey box with whiskers inside each violin is a boxplot. The box spans the 25th (Q1) to 75th (Q3) percentiles. The whiskers extend to the upper and lower adjacent values, that is, the most extreme observations within Q3+1.5(Q3-Q1) and Q1-1.5(Q3-Q1), respectively. The white dot in the box marks the median. Significant differences are denoted with asterisks (*P<.001). D. AUC-ROC score of the topic repetition feature with different values of T (days).
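The whisker rule described for panel C can be made concrete with a short sketch. It assumes linear-interpolation percentiles (plotting libraries differ on this detail), and the function names `percentile` and `box_summary` are illustrative, not taken from the study.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def box_summary(values):
    """Return the median, quartiles, and adjacent values for one group."""
    data = sorted(values)
    q1 = percentile(data, 25)
    median = percentile(data, 50)
    q3 = percentile(data, 75)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Adjacent values: the most extreme observations still inside the fences.
    lower_adjacent = min(v for v in data if v >= lower_fence)
    upper_adjacent = max(v for v in data if v <= upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_adjacent": lower_adjacent,
        "upper_adjacent": upper_adjacent,
    }
```

For the values 1 through 9 plus a single outlier of 100, the upper whisker stops at 9 because 100 lies beyond Q3+1.5(Q3-Q1).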
Demographics of conversational data participants.
| Status and gender | Age (years) | Duration (months) | Calls, n | Call time (min), mean (SD) | Number of characters, mean (SD) |
| | | | | | |
| F | 61-62 | 11 | 31 | 7.8 (2.0) | 422.7 (121.4) |
| F | 63-64 | 14 | 32 | 14.0 (5.1) | 1078.3 (369.8) |
| F | 75-77 | 25 | 75 | 11.3 (8.5) | 744.8 (215.5) |
| F | 75-75 | 1 | 4 | 12.8 (3.1) | 1365.0 (130.6) |
| F | 78-78 | 9 | 64 | 10.3 (2.0) | 815.2 (173.8) |
| F | 79-80 | 6 | 23 | 11.7 (3.2) | 1507.5 (457.4) |
| F | 80-83 | 33 | 109 | 16.6 (4.7) | 1499.7 (391.7) |
| F | 82-83 | 19 | 72 | 6.8 (3.0) | 811.2 (367.9) |
| F | 88-89 | 17 | 104 | 11.2 (4.4) | 944.5 (314.2) |
| F | 91-91 | 9 | 35 | 22.2 (3.7) | 2405.8 (433.2) |
| M | 63-65 | 16 | 72 | 11.9 (3.1) | 1146.8 (242.9) |
| M | 67-70 | 33 | 132 | 10.6 (2.3) | 999.4 (236.5) |
| M | 82-85 | 30 | 226 | 17.7 (6.3) | 1207.0 (497.7) |
| | | | | | |
| F | 83-84 | 14 | 40 | 9.5 (2.8) | 923.9 (409.2) |
| F | 85-85 | 5 | 13 | 7.8 (1.8) | 662.0 (199.1) |
| Mean (SD) | 76.8 (9.4) | 16.1 (10.2) | 68.8 (57.2) | 12.1 (4.1) | 1098.0 (500.3) |
| Total | — | — | 1032 | 13,230 (221 hours) | 1,132,935 |
Figure 2. The equation of the probability of biterm b_i.
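Assuming Figure 2 shows the standard biterm topic model form, the probability of a biterm $b_i = (w_{i,1}, w_{i,2})$ is a mixture over $K$ topics. The symbols $\theta_z$ (topic proportion) and $\phi_{w \mid z}$ (probability of word $w$ in topic $z$) follow the usual convention for that model and may differ from the notation in the original figure.

```latex
P(b_i) = \sum_{z=1}^{K} P(z)\, P(w_{i,1} \mid z)\, P(w_{i,2} \mid z)
       = \sum_{z=1}^{K} \theta_z \, \phi_{w_{i,1} \mid z} \, \phi_{w_{i,2} \mid z}
```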
Top 15 features of high discriminative power among linguistic features used in previous studies and repetition features. The table contains area under the receiver operating characteristic curve (AUC-ROC) score, effect size (Cohen d) with 95% CI, and P value of 2-sided t test with Bonferroni multiple testing correction.
| Feature type | AUC-ROC^a | Effect size (95% CI) | Adjusted P value |
| Topic repetition in two different conversations separated by | 0.91 | –1.76 (–2.15 to –1.36) | 4.17E–17 |
| Word repetition in two different conversations separated by | 0.90 | –1.67 (–2.06 to –1.29) | 4.44E–17 |
| Conjunction ratio | 0.89 | –2.03 (–2.33 to –1.73) | 2.04E–41 |
| Pronoun to noun ratio | 0.87 | –1.94 (–2.24 to –1.64) | 3.17E–38 |
| Topic repetition in two different conversations separated by | 0.86 | –1.35 (–1.51 to –1.20) | 2.75E–67 |
| Noun ratio | 0.86 | 1.38 (1.09 to 1.67) | 4.28E–20 |
| Word repetition in two different conversations separated by | 0.84 | –1.22 (–1.36 to –1.08) | 9.35E–66 |
| Pronoun ratio | 0.82 | –1.50 (–1.79 to –1.21) | 1.30E–23 |
| Noun to verb ratio | 0.81 | 0.63 (0.35 to 0.91) | 2.73E–04 |
| Honoré’s statistic | 0.80 | 1.03 (0.75 to 1.32) | 1.61E–11 |
| Word repetition in single conversations^b | 0.79 | –1.08 (–1.36 to –0.79) | 1.57E–12 |
| Topic repetition in single conversations^b | 0.78 | –0.80 (–1.09 to –0.52) | 7.62E–07 |
| Conjunction frequency | 0.77 | –1.27 (–1.55 to –0.98) | 4.22E–17 |
| Noun frequency | 0.75 | 0.71 (0.43 to 0.99) | 1.66E–05 |
| Adjective ratio | 0.75 | –1.06 (–1.34 to –0.77) | 5.12E–12 |
^a AUC-ROC: area under the receiver operating characteristic curve.
^b Topic and word repetition features proposed by the authors.
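The three statistics reported in this table can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the authors' analysis code: the function names are hypothetical, Cohen d uses the pooled standard deviation, and the sign convention (first group minus second group) is assumed, not confirmed by the source.

```python
import math
from statistics import mean, variance


def auc_roc(positives, negatives):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def cohen_d(group_a, group_b):
    """Cohen d with pooled SD; the sign follows mean(group_a) - mean(group_b)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With toy feature values `[1, 2, 3]` for one group and `[4, 5, 6]` for the other, `auc_roc([4, 5, 6], [1, 2, 3])` is 1.0 and `cohen_d([1, 2, 3], [4, 5, 6])` is -3.0. A negative effect size can therefore coexist with a high AUC-ROC when the second group has the larger values.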
Comparison of the results of statistical analysis for the linguistic features between our study and previous studies. Our study analyzed speech data from daily conversations, whereas the previous studies analyzed connected speech data collected during neuropsychological tests. Sig and nonsig refer to significant and nonsignificant, and each pair is ordered as our study–previous studies. For example, sig-nonsig in the inconsistent column indicates a feature that showed a significant difference in our study but not in the previous studies. Cells contain the names of the corresponding features. Features whose statistical test results were not reported in the previous studies are excluded from this summary table. Information including P values of the statistical analysis in our study and the list of the previous studies is provided in Multimedia Appendix 1.
| Feature category | Consistent (our study–previous studies) | | Inconsistent (our study–previous studies) | |
| | Sig-sig | Nonsig-nonsig | Sig-nonsig | Nonsig-sig |
| Part of speech (12) | Noun frequency; Auxiliary verb frequency; Noun ratio; Verb ratio; Pronoun ratio; Auxiliary verb ratio; Conjunction ratio; Pronoun to noun ratio | Adverb ratio | Adjective ratio; Noun to verb ratio | Verb frequency |
| Vocabulary richness (3) | Honoré’s statistic | Type-token ratio; Brunét’s index | — | — |
| Syntactic complexity (7) | Mean length of sentence (utterance) | Total no of words; No of sentences (utterances); No of characters; Total dependency distance in a document; Avg dependency distance per sentence; Total no of distance in a document | — | — |
| Perseveration (2) | — | — | — | Avg cosine distance; Cosine cutoff: 0.50 |
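To make the vocabulary richness and perseveration rows more tangible, the sketch below computes them from tokenized text using their commonly cited textbook definitions. Several points are assumptions and not details from the study: the function names, the bag-of-words representation of utterances, and the reading of "Cosine cutoff: 0.50" as the share of utterance pairs whose cosine distance is at or below 0.50.

```python
import math
from collections import Counter
from itertools import combinations


def vocabulary_richness(tokens):
    """Type-token ratio, Brunet's index, and Honoré's statistic for one text."""
    counts = Counter(tokens)
    n = len(tokens)                                  # total word tokens N
    v = len(counts)                                  # distinct word types V
    v1 = sum(1 for c in counts.values() if c == 1)   # types used exactly once
    return {
        "type_token_ratio": v / n,
        "brunet_index": n ** (v ** -0.165),
        # Honoré's statistic is undefined when every type occurs once (V1 == V).
        "honore_statistic": 100 * math.log(n) / (1 - v1 / v) if v1 < v else math.inf,
    }


def cosine_distance(tokens_a, tokens_b):
    """1 - cosine similarity of two non-empty bag-of-words utterances."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return 1.0 - dot / (norm_a * norm_b)


def perseveration_features(utterances, cutoff=0.50):
    """Average pairwise cosine distance and share of pairs at or below the cutoff."""
    distances = [cosine_distance(a, b) for a, b in combinations(utterances, 2)]
    return {
        "avg_cosine_distance": sum(distances) / len(distances),
        # Assumed reading of the cutoff feature: near-duplicate utterance pairs.
        "share_at_or_below_cutoff": sum(d <= cutoff for d in distances) / len(distances),
    }
```

Lower average cosine distance, or a larger share of pairs at or below the cutoff, indicates that utterances within a text resemble each other more closely. This is the sense in which these features are used as perseveration measures.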