
Word Repetition in Separate Conversations for Detecting Dementia: A Preliminary Evaluation on Data of Regular Monitoring Service.

Abstract

Monitoring technology for detecting early signs of dementia has been actively investigated because of low diagnostic coverage and the need for early intervention. Although language features have been used for detecting the language dysfunctions resulting from dementia in neuropsychological tests, features that can be extracted from regular conversations remain unexplored. Here, we propose a feature to characterize the atypical repetition of words on different days, which is observed in patients with dementia. We tested it on data obtained from a daily monitoring service for eight elderly people, including two who had been diagnosed with dementia. We found that our feature outperformed the existing linguistic features used in previous studies, such as vocabulary richness and repetitiveness, in terms of effect size and AUC score. The results suggest that the use of our proposed feature holds promise for improving detection performance in everyday situations such as regular monitoring.


Year: 2018 PMID: 29888073 PMCID: PMC5961820

Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc

Introduction

As the worldwide elderly population increases, the incidence of dementia and Alzheimer’s disease (AD) is becoming an increasingly serious health and social problem. Alzheimer’s Disease International estimated that 7% of the world’s population over 65 years old has AD or a related dementia[1]. It also reported that the worldwide cost of dementia may be as high as 605 billion USD a year, equivalent to 1% of the entire world’s gross domestic product[2]. Among the participating countries, Japan is one of those facing a severe aging problem, and its situation in this regard is very serious. The prevalence of dementia for persons 65 years or older is estimated at around 15%[3]. The cost needed to address this problem in Japan is very high, reaching around 120 billion USD (14.5 trillion JPY)[4]. This corresponds to 3% of the entire Japanese gross domestic product. These figures have led to an increasing focus on dementia in recent years. In particular, early diagnosis and intervention have been increasingly recognized as a possible way of improving dementia care, because of recent failures in both clinical trials and laboratory work in the stages of AD[5]. However, diagnostic coverage worldwide remains low[6]. Even in high-income countries, only 40-50% of people afflicted with dementia have received a diagnosis[6]. For example, Kotagal et al. suggested that 55% of people afflicted with dementia in the United States do not receive clinical cognitive evaluations[7]. Monitoring technology capable of detecting early signs of dementia and AD in everyday situations has great potential for supporting earlier diagnosis and intervention. Although there are already several projects and services that monitor the health of elderly persons through frequent data collection with mobile applications[8,9], whether and how the data collected on a daily basis can be exploited to detect dementia has been largely unexplored.
To detect the health state of elderly persons afflicted with dementia in everyday situations, one of the most prominent candidates is identifying the evolution of language change over the course of dementia and AD. While memory impairment due to shrinking of the medial temporal lobe is the most typical symptom of dementia[10,11], both retrospective analyses and prospective cohort studies have shown that language problems are prevalent dating from presymptomatic periods[12,13]. In addition, language dysfunctions at the time of diagnosis have been reported in patients whose AD was pathologically proven by postmortem examination. Ahmed et al. showed that they exhibited significant language changes such as syntactic simplification and impairments in lexico-semantic processing[14,15]. On the basis of these findings, previous computational work has paid much attention to exploring useful features that can be extracted from data gathered while participants performed neuropsychological tests administered by professionals such as medical doctors[16,17]. These studies have found that the different stages of dementia and AD exhibit specific patterns of language changes in different domains such as phonetics and phonology, morphology, lexicon and semantics, syntax, and pragmatics[18]. Recently, studies investigating whether the language dysfunctions observed in mental illness, including dementia, can be extracted under conditions close to those of everyday life have started and have garnered increasing attention[19,20]. Daily monitoring makes it possible not only to extract language features from a single occasion but also to compare language patterns on different days, yet the latter approach remains largely uninvestigated. In this study, we proposed a language feature of dementia that can be extracted by comparing language patterns on different days, and investigated whether it is useful for detecting dementia.
To this end, we used real conversational data obtained from a regular monitoring service for elderly people in Japan, and showed how our proposed feature could be useful for detecting dementia.

Related Work

In this section, we review related work that helped us determine the sort of features that can be extracted from conversations through regular monitoring. To this end, we describe the acoustic and linguistic features used in previous studies on speech data recorded while individuals performed neuropsychological tests. Acoustic features have been widely used for quantifying an individual’s state, including emotion, stress, and neurodegenerative diseases such as depression and dementia[21-25]. One reason for this is that acoustic features are relatively easy to obtain compared with linguistic features, which require speech-to-text[26]. In the context of detecting dementia, acoustic features have been used for analyzing verbal fluency tasks or speeded-up word-list generation[25]. Previous studies reported that patients with dementia tend to increase their periods of silence as well as the number of pauses they use[25-28]. The short-term memory loss associated with dementia makes ordinary conversation difficult because of language dysfunctions such as word-finding and word-retrieval difficulties[29]. These language dysfunctions have typically been characterized by using linguistic features[30-32]. Picture descriptions were often used for these features, and there are several large datasets of speech data from picture description tasks, such as DementiaBank[33-38]. Here we provide a brief description of the linguistic features widely used in these tasks, because the linguistic features of spontaneous speech used in them could be useful for characterizing everyday conversations. Syntactic complexity is closely associated with the incidence of dementia. For example, Kemper et al. showed that AD accelerates age-related deterioration in syntactic complexity compared with healthy controls[39]. Syntactic complexity has been measured in various ways, such as the mean length of sentences, “part-of-speech” frequency, and dependency distance[25,30].
Dependency distance refers to the number of intervening words between two syntactically related words in a sentence. Another category that must be considered is vocabulary richness. This category measures lexical diversity, which tends to be reduced in dementia cases. It is calculated by three typical measures[40,41]: type-token ratio (TTR), Brunet’s index (BI), and Honore’s statistic (HS). TTR compares the total distinct word types (U) to the total word count (N) as TTR=U/N. Using the same U and N, BI is defined as BI=N^(U^-0.165). Unlike other measures related to vocabulary richness, for this measure the vocabulary richness becomes greater as BI becomes smaller. HS gives particular importance to unique vocabulary items used only once, also known as hapax legomena (Nuni). HS is defined as HS=100 log N/(1 - Nuni/U). A third feature category is repetitiveness. Some metrics measure the frequency of repeated words and phrases, and others estimate sentence similarities by calculating the cosine distance between two sentences[30]. Another feature category, one widely used in picture description tasks, is semantic density. It is calculated on the basis of “information units”, which are predefined objects or text segments that might refer to important information[42,43]. For example, in the Boston Cookie Theft picture description task, information units consist of objects such as “Woman”, “Cookies”, and “Boy taking the cookie”[42,43]. With the information units, semantic density can be defined as the number of information units divided by the total number of words[14,30,33]. A number of previous studies have found that individuals with dementia tend to produce speech with lower information content, measured as semantic density, than healthy controls[14,30,33]. Semantic density would be difficult to use in daily conversations, because it requires predefined information units.
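The three vocabulary richness measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `vocabulary_richness` and the example sentence are hypothetical, the natural logarithm is assumed for HS, and the exponent -0.165 is the constant conventionally used for Brunet's index.

```python
import math
from collections import Counter

def vocabulary_richness(tokens):
    """Return TTR, Brunet's index (BI) and Honore's statistic (HS) for a token list."""
    n = len(tokens)                      # N: total word count
    counts = Counter(tokens)
    u = len(counts)                      # U: number of distinct word types
    n_uni = sum(1 for c in counts.values() if c == 1)  # Nuni: hapax legomena
    ttr = u / n
    bi = n ** (u ** -0.165)              # smaller BI means richer vocabulary
    # HS is undefined when every type occurs exactly once (Nuni == U);
    # returning infinity in that case is a choice made for this sketch.
    hs = 100 * math.log(n) / (1 - n_uni / u) if n_uni < u else float("inf")
    return {"TTR": ttr, "BI": bi, "HS": hs}

# Hypothetical example: 9 tokens, 7 distinct types, 6 of them used once.
tokens = "the boy takes the cookie and the girl laughs".split()
measures = vocabulary_richness(tokens)
print(measures)
```

Because HS divides by 1 - Nuni/U, very short texts in which no word is repeated need special handling, which matters when the measure is applied to brief conversation transcripts.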
With regard to language dysfunctions observed in conversations in everyday situations, a previous study based on family reports pointed out atypical word repetition[44]. While this repetition was typically reported to occur in the same conversation, it also appears in separate conversations that may be held on different days[45]. Repetition in conversation across a number of days can be assumed to reflect memory impairment that prevents speakers from remembering recent conversations and degrades their social engagement ability by making them less able to expand on conversation topics. Therefore, in this study, we hypothesize that atypical repetition in conversation could be useful for detecting dementia in regular conversations. Previous work has inferred that atypical repetition of words in a single conversation involves not only repetitiveness but also vocabulary richness, because the repetitions could result in a small number of distinct words being used in a conversation[46]. Thus, it might be possible to capture atypical repetition of words in separate conversations by using repetitiveness and/or vocabulary richness.

Feature Related to Word Repetition in Conversations on Different Days

In this study, we focused on repetition in conversation as one of the prominent behaviors of dementia. Previous studies characterized atypical repetition of words by the frequency of repeated words or the similarity between sentences in a single conversation. Although descriptive analysis of AD clinical interviews has reported that repetition may also appear in separate conversations that may be held on different days[45], features that can be extracted from separate conversations remain largely uninvestigated. In this study, we propose a feature that aims to measure word repetition, especially in conversations on different days, in addition to single conversations on the same day. We first obtained pairs of conversational data Di and Dj separated by t days (T - M ≤ t ≤ T + M). HSi and HSj were extracted from Di and Dj by calculating Honore’s statistic (HS). HS is defined as HS=100 log N/(1 - Nuni/U), where N is the total number of words, U is the total number of distinct word types, and Nuni is the number of distinct word types used only once. Next, we defined Dij as a combined document of Di and Dj and extracted HSij as a feature of repetitiveness in conversation on different days. Finally, our feature R was calculated as follows: R = wi HSi^-1 + wj HSj^-1 + wij HSij^-1. Weights wi, wj, and wij are hyper-parameters selected by parameter optimization. M was set to two days in this study.
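The construction of R can be illustrated with a small sketch. It assumes that R is the weighted sum of the inverse Honore's statistic of each conversation and of the combined document, which is how the Results section refers to the single and paired parts of the feature. The function names, the toy English token lists, and the use of the natural logarithm are assumptions for illustration; the default weights are the values reported in the Results.

```python
import math
from collections import Counter

def honore(tokens):
    """Honore's statistic HS = 100 log N / (1 - Nuni/U)."""
    counts = Counter(tokens)
    n, u = len(tokens), len(counts)
    n_uni = sum(1 for c in counts.values() if c == 1)
    if n_uni == u:
        raise ValueError("HS is undefined when every word type occurs once")
    return 100 * math.log(n) / (1 - n_uni / u)

def repetition_feature(tokens_i, tokens_j, w_i=0.125, w_j=0.125, w_ij=0.75):
    """Weighted sum of inverse HS for Di, Dj and the combined document Dij."""
    hs_i = honore(tokens_i)
    hs_j = honore(tokens_j)
    hs_ij = honore(tokens_i + tokens_j)   # Dij: concatenation of both days
    return w_i / hs_i + w_j / hs_j + w_ij / hs_ij

# Hypothetical transcripts from two calls a few days apart.
day_a = "i went to the garden and i watered the flowers".split()
day_b_repeat = list(day_a)                                   # same story retold
day_b_new = "my son called me and my son talked about work".split()

r_repeat = repetition_feature(day_a, day_b_repeat)
r_new = repetition_feature(day_a, day_b_new)
print(r_repeat, r_new)
```

In this toy case the two second-day transcripts have the same single-conversation HS, so the difference in R comes entirely from the combined-document term, which is the part intended to capture repetition across days.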

Conversational data from a regular monitoring service

To test the proposed feature, we used daily conversation data obtained from a monitoring service for elderly people provided by Cocolomi Co., Ltd (http://cocolomi.net/). The purpose of the service is to help children build a connection with a parent living alone by sharing information about the elderly person’s daily life, such as their physical condition. A communicator calls each elderly person once or twice a week to have a daily conversation for about ten minutes. Each conversation is transcribed in spoken word format by the communicator and sent to the family by email (Figure 1). The spoken words of the communicator are eliminated.

Figure 1.

Overview of the regular monitoring service. A communicator calls an elderly customer once or twice a week, transcribes the conversations, and e-mails the transcripts to family members. In this study we analyzed the conversation transcripts this service provided.

The conversational data for analysis were collected from eight Japanese people (five females and three males; age range 66-89 years, mean ± SD 82.37 ± 5.91 years). Two of them were reported by their families as suffering from dementia. Table 1 shows the duration of the service, the number of report calls, the average duration of each call, and the average word length of each report. In total, 458,738 words were used for the analysis. All reports were written in Japanese. For preprocessing, we performed word segmentation, part-of-speech tagging, and word lemmatization on the conversation data using the Japanese morphological analyzer MeCab[47]. Words tagged as numerals and symbols were excluded from the analyzed data.

Table 1:

Participant data list.

Results

We investigated our proposed feature R with conversational data obtained during the phone calls of the regular monitoring service. The feature aims to measure word repetition, especially in conversations that are separated by t days, in addition to single conversations conducted within a single day. First, we investigated whether our proposed feature could be used as a measure for discriminating dementia from controls. The discriminative power was measured by using both effect size (Cohen’s d) and area under the receiver operating characteristic curve (AUC-ROC). For Cohen’s d, an effect size of 0.8 can be assumed to be large, while 0.5 is medium and 0.2 is small[48]. The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system; the area under it ranges from 0 to 1. We set hyper-parameters wi, wj, wij, and T to 0.125, 0.125, 0.75, and 8, respectively, which were selected by exploratory experiments. As a result, we found that the proposed feature R for people with dementia was significantly higher than that of controls (p < 1.0 × 10^-24; Figure 2). We also obtained an effect size of 2.68 (95% confidence interval (CI): 2.11-3.25) and an AUC-ROC of 0.97 (Figure 3).
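For readers who want to reproduce this kind of evaluation, the two measures of discriminative power can be computed as in the following sketch. The paper does not state which variant of Cohen's d was used, so the pooled-standard-deviation form is an assumption here; the AUC is computed through its rank interpretation (the probability that a dementia sample scores higher than a control sample, counting ties as one half). The sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def auc_roc(positives, negatives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical feature values for the two groups.
dementia = [3.1, 2.8, 3.4, 2.9]
control = [2.0, 2.3, 1.8, 2.9, 2.1]

d = cohens_d(dementia, control)
auc = auc_roc(dementia, control)
print(round(d, 2), round(auc, 3))
```

A positive d means the first group has the higher mean, which matches the direction reported for R (higher in dementia); a feature that is lower in dementia, such as HS, yields a negative d.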

Figure 2.

Feature distributions for control and dementia in our proposed feature R and the existing five features used in previous studies. Boxes denote the 25th (Q1) and 75th (Q3) percentiles. The line within the box denotes the 50th percentile, while whiskers denote the upper and lower adjacent values that are the most extreme values within Q3+1.5(Q3-Q1) and Q1-1.5(Q3-Q1), respectively. Filled circles show outliers, and squares represent mean values.

Figure 3.

Comparison of our proposed feature R with the existing five features used in previous studies. Error bars are 95% confidence intervals.

We next compared R with other features extracted from a single conversation that were typically used in previous studies: vocabulary richness, sentence complexity, and repetitiveness. As for vocabulary richness, we investigated type-token ratio, Brunet’s index, and Honore’s statistic. For sentence complexity and repetitiveness, we employed mean sentence length and sentence similarity, respectively. We computed the sentence similarity using the cosine distance of sentences represented as TF-IDF (term frequency-inverse document frequency) vectors. Among the six features, we observed significant differences between control and dementia in three features: R, Honore’s statistic, and sentence similarity (p < 1.0 × 10^-24, p < 5.0 × 10^-20, and p < 5.0 × 10^-6, respectively; Figure 2). In contrast, mean sentence length, type-token ratio, and Brunet’s index showed no significant difference between the groups (p > 0.05). The proposed feature R showed the best results in terms of effect size and AUC-ROC (d=2.68, ROC=0.97), followed by Honore’s statistic (d=-1.36, ROC=0.86) and sentence similarity (d=0.69, ROC=0.72) (Figure 3). Our proposed feature R aims to characterize word repetition, especially in conversations on different days, in addition to a single conversation on a single day. We therefore investigated the usefulness of the part of the feature representing repetition in conversations on different days by comparing it with the part extracted from a single conversation. Specifically, the former was extracted from paired conversational data (i.e., HS^-1 extracted from Dij), while the latter was extracted from single conversational data (i.e., HS^-1 extracted from Di or Dj). The features extracted from paired conversations and from a single conversation both showed significant differences between control and dementia (p < 1.0 × 10^-6 for single conversation; p < 1.0 × 10^-24 for paired conversation).
We also found that the features extracted from paired conversations had larger discriminative power than those extracted from single conversations in terms of effect size and ROC score (d=1.58, ROC=0.86 for single conversation and d=2.67, ROC=0.96 for paired conversation; Figure 4). The results suggest that the part of the feature extracted from paired conversational data contributes to the detection performance of our proposed feature.
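The sentence similarity baseline used in this comparison can be sketched as follows. The paper does not specify the TF-IDF weighting or how pairwise similarities were aggregated, so this sketch assumes a smoothed inverse document frequency, treats each sentence as a document, and averages the cosine similarity over all sentence pairs; all names and example sentences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def tfidf_vectors(sentences):
    """Map each tokenized sentence to a sparse TF-IDF vector (a dict)."""
    n = len(sentences)
    df = Counter(term for s in sentences for term in set(s))
    # Smoothed IDF (an assumption): terms found in every sentence keep weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: c * idf[t] for t, c in Counter(s).items()} for s in sentences]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values())) *
            math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

def mean_sentence_similarity(sentences):
    """Average cosine similarity over all pairs; needs at least two sentences."""
    vectors = tfidf_vectors(sentences)
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

repetitive = [["i", "went", "shopping"], ["i", "went", "shopping"], ["it", "rained"]]
varied = [["i", "went", "shopping"], ["my", "son", "called"], ["it", "rained"]]

sim_repetitive = mean_sentence_similarity(repetitive)
sim_varied = mean_sentence_similarity(varied)
print(sim_repetitive, sim_varied)
```

Unlike the paired-conversation part of R, this measure only looks at sentences within one transcript, so a story retold on a later day does not raise it.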

Figure 4.

Comparison between our proposed feature R, extracted from single and paired conversations. (A) Histogram and boxplot for each feature. (B) ROC-AUC scores.

As an additional analysis, we investigated how the proposed feature behaves with different numbers of intervening days between the paired conversations. Specifically, we compared the discriminative power of the feature extracted from paired conversations while changing T from three to 14 days. To define R for paired conversations, we set weight parameter wij to 1 and the others to 0, because we wanted to investigate the relationship between the interval T of the paired conversations and word repetitions. For all T values calculated in this study, the proposed feature R for people with dementia was significantly higher than that of controls (p<0.05; Table 2). The effect size and the AUC values ranged from 1.5 to 2.67 and from 0.87 to 0.96, respectively. After increasing at first, they peaked at around T = 8 (effect size of 2.67, 95% CI: 2.10-3.24; ROC = 0.96) and then tended to decline (Table 2 and Figure 5).
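The pairing step behind this sweep over T can be sketched as below. The sketch assumes each call is identified by an integer day offset in ascending order and selects every pair of calls whose gap t satisfies T - M ≤ t ≤ T + M, with M = 2 as in the feature definition; the function name and the call schedule are hypothetical.

```python
def select_pairs(call_days, T, M=2):
    """Return index pairs (i, j), i < j, whose gap t satisfies T - M <= t <= T + M."""
    pairs = []
    for i in range(len(call_days)):
        for j in range(i + 1, len(call_days)):
            gap = call_days[j] - call_days[i]
            if T - M <= gap <= T + M:
                pairs.append((i, j))
    return pairs

# Hypothetical schedule: days on which one participant was called.
call_days = [0, 4, 7, 11, 14, 21]

# One list of conversation pairs for each T from 3 to 14 days.
pairs_by_T = {T: select_pairs(call_days, T) for T in range(3, 15)}
for T, pairs in pairs_by_T.items():
    print(T, pairs)
```

With calls made once or twice a week, the number of available pairs changes with T, which is one practical reason the discriminative power may vary across intervals.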

Table 2:

Discriminative power between control and dementia for our proposed feature R with different T.

Figure 5.

Effect size and AUC of our proposed feature R with different T. (A) Effect size of Cohen’s d. Error bars are 95% confidence intervals. (B) AUC-ROC score.

Conclusion

In light of the increasing demand for detecting dementia in everyday situations, we focused on word repetition in separate conversations on different days on the basis of a previous descriptive study and proposed a feature to characterize it. To test our proposed feature, we investigated conversational data obtained from a regular monitoring service. The data the service provided were collected from eight elderly people, including two dementia patients, over a period of up to 30 months. First, we found that our proposed feature has strong discriminating power, achieving up to 2.68 for the effect size (Cohen’s d) and 0.97 for the AUC-ROC score. We also compared our proposed feature with other linguistic features such as vocabulary richness and sentence similarity. Our feature R outperformed the other features, suggesting that using it in addition to existing feature sets has promise for improving detection performance. We also analyzed features extracted from single conversational data and paired conversational data. The results indicate that extracting features from paired conversations on different days may increase discriminative power more than extracting them from a single conversation. In addition, the results obtained by changing the number of intervening days between two separate conversations suggest that our feature R could be especially useful when extracted from conversations separated by a specific number of days. Our study has several limitations. One is the small sample of participants. Another is the specific type of conversational data. We only analyzed conversations from a regular monitoring service, where all conversations were over the telephone. We need to investigate our proposed feature with conversational data collected from everyday life situations such as face-to-face family conversations. In addition, our feature treated dementia as a binary state.
The main reason is that the sample size of participants was too small to classify dementia in terms of its severity. In future work we will need to collect data from a larger number of participants, which will allow us to test whether or not our feature could be extended to score dementia on a scalar or ordinal scale. However, because to the best of our knowledge this is the first study which aims to characterize word repetitions in conversations on different days, we believe that using our feature will help make it possible to provide service for detecting dementia in daily monitoring.

25 in total

1. Longitudinal change in language production: effects of aging and dementia on grammatical complexity and propositional content.

Authors: S Kemper; J Marquis; M Thompson
Journal: Psychol Aging Date: 2001-12

2. Detection of clinical depression in adolescents' speech during family interactions.

3. Cross-sectional analysis of Alzheimer disease effects on oral discourse in a picture description task.

4. Effect size, confidence interval and statistical significance: a practical guide for biologists.

5. Spoken Language Derived Measures for Detecting Mild Cognitive Impairment.

6. Verbal Fluency and Early Memory Decline: Results from the Wisconsin Registry for Alzheimer's Prevention.

7. Empty speech in Alzheimer's disease and fluent aphasia.

8. Amnesic H.M.'s performance on the language competence test: parallel deficits in memory and sentence production.

9. The effects of very early Alzheimer's disease on the characteristics of writing by a renowned author.

10. Predicting mild cognitive impairment from spontaneous spoken utterances.