
Automated text-level semantic markers of Alzheimer's disease.

Camila Sanz¹, Facundo Carrillo², Andrea Slachevsky^3,4,5,6,7, Gonzalo Forno^5,8,9, Maria Luisa Gorno Tempini¹⁰, Roque Villagra^4,7, Agustín Ibáñez^11,12,13,14, Enzo Tagliazucchi^1,11, Adolfo M García^12,13,14,15.

Abstract

INTRODUCTION: Automated speech analysis has emerged as a scalable, cost-effective tool to identify persons with Alzheimer's disease dementia (ADD). Yet, most research is undermined by low interpretability and specificity.
METHODS: Combining statistical and machine learning analyses of natural speech data, we aimed to discriminate ADD patients from healthy controls (HCs) based on automated measures of domains typically affected in ADD: semantic granularity (coarseness of concepts) and ongoing semantic variability (conceptual closeness of successive words). To test for specificity, we replicated the analyses on Parkinson's disease (PD) patients.
RESULTS: Relative to controls, ADD (but not PD) patients exhibited significant differences in both measures. Also, these features robustly discriminated between ADD patients and HCs, while yielding near-chance classification between PD patients and HCs.
DISCUSSION: Automated discourse-level semantic analyses can reveal objective, interpretable, and specific markers of ADD, bridging well-established neuropsychological targets with digital assessment tools.


Keywords: Alzheimer's disease dementia; Parkinson's disease; automated speech analysis; semantic granularity; semantic variability

Year: 2022 PMID: 35059492 PMCID: PMC8759093 DOI: 10.1002/dad2.12276

Source DB: PubMed Journal: Alzheimers Dement (Amst) ISSN: 2352-8729

INTRODUCTION

Over 43 million individuals are affected by Alzheimer's disease (AD), a disorder characterized by progressive temporo‐parieto‐hippocampal atrophy alongside semantic and episodic memory impairments. Given its high disability and mortality rate, its growing economic burden, and its expert‐dependent diagnosis, a call has been raised for objective, scalable, low‐cost approaches favoring disease identification and characterization. Prominent among these is automated speech analysis (ASA). Participants are simply required to speak, yielding diverse features that can be automatically extracted and analyzed to distinguish persons with and without AD dementia (ADD). Yet, most research is undermined by low interpretability and specificity, often targeting features unrelated to the disorder's core neuropsychological profile while lacking a disease control group. This may cast doubt on the clinical utility of ensuing findings. Here, leveraging ASA tools with ADD patients, healthy controls (HCs), and Parkinson's disease (PD) patients, we examined whether ADD‐specific markers can be captured through measures of semantic granularity and ongoing semantic variability, two domains that are systematically disrupted in standard assessments.

ASA has proven useful for discriminating between AD patients and HCs, predicting dementia onset, and differentiating among autopsy‐proven disease subtypes. Yet, most studies have examined heterogeneous ad hoc domains, revealing patterns that are not readily interpretable against core neuropsychological outcomes. For instance, inconsistent accuracy rates are obtained upon targeting mixed articulatory and syntactic dimensions that are typically spared in early testing. Also, few studies have included a neurodegenerative control group, prompting questions about the specificity of findings. Moreover, several reports have used unmatched and imbalanced groups, restricted tasks eliciting little data, and suboptimal machine learning approaches.
The present study tackles these issues. We investigated disruptions of semantic granularity and ongoing semantic variability, two well‐established manifestations of ADD (Figure 1A). AD patients are typified by coarse (ie, general) conceptual choices, evincing a propensity to use hypernyms (eg, “animal,” “fruit”) and few hyponyms (eg, “cat,” “berry”). Also, they exhibit sudden changes in speech flow, as their discourse becomes progressively digressive, with frequent interruptions and inquiries (eg, “What was I saying?”) causing conceptual discontinuity. Our approach captures these phenomena automatically. We employed the WordNet taxonomy to quantify word‐by‐word semantic granularity (Figure 1B) and FastText embeddings to measure ongoing semantic variability across successive word pairs (Figure 1C). Furthermore, to test whether such domains are distinctively affected in ADD, we included persons with PD, a neurodegenerative disease with early semantic alterations restricted to particular domains, mainly action‐related concepts. Finally, we circumvented key caveats in the literature. First, we formed strictly matched groups with similar sample sizes. Second, we combined several speech tasks in an integrative analysis, capturing various language behaviors and avoiding inflated results based on unduly small datasets (a key requisite for testing novel metrics). Third, we employed robust machine learning methods for patient identification.

FIGURE 1

Illustration of target measures. (A) Representative phrases of ADD patients, PD patients, and healthy controls, showing the predicted gradient of semantic granularity (red scale) and ongoing semantic variability (blue scale). (B) Segment of the WordNet network showing hierarchical relations from the least granular node ("entity") to progressively more granular nodes (down to "bulldog"). Granularity values are marked by color and number. Nodes serving as starting points of dotted lines show network bifurcations that do not lead to the "bulldog" node. Multiple relevant and intermediate nodes are omitted for brevity. (C) Schemes for the computation of ongoing semantic variability. The diagrams show FastText embeddings, adjacent‐word‐pair similarity series, and distributions for texts presenting high variability (top row), middle variability (middle row), and low variability (bottom row). Abbreviations: ADD, Alzheimer's disease dementia; PD, Parkinson's disease.

Briefly, we performed the first automated assessment of semantic granularity and variability on ADD patients, relative to HCs and PD patients. We integrated statistical (analysis of variance [ANOVA]) and machine learning (Gradient Boosting) analyses on a rich, diverse set of language tasks. We hypothesized that automated measures of semantic granularity and ongoing semantic variability would yield (1) significant differences, and (2) high classification accuracy between ADD patients and HCs, but (3) not between PD patients and HCs. With this approach, we seek to better test the sensitivity and clinical utility of ASA for dementia assessments.

RESEARCH IN CONTEXT

Systematic review: Through a thorough PubMed search, we reviewed the strengths and limitations of automated speech analysis (ASA) research on Alzheimer's disease dementia (ADD). Crucially, most studies targeted features unrelated to the disorder's core neuropsychological profile and lacked disease control groups.

Interpretation: Our findings show that ASA can capture interpretable condition‐specific markers of ADD. Compared with controls, ADD (but not Parkinson's disease [PD]) patients exhibited significant reductions of semantic granularity and increased semantic variability across speech tasks. Machine learning analyses yielded robust classification of ADD patients (receiver operating characteristic, area under the curve [AUC] = 0.8), but not PD patients (AUC = 0.65), relative to controls. Thus, ASA emerges as an affordable and scalable method to support ADD diagnosis.

Future directions: These proposed markers should be examined in larger cohorts (to test their systematicity), in longitudinal designs (to assess their sensitivity to disease progression), and in cross‐linguistic studies (to favor more global validations of ASA).

HIGHLIGHTS

We examined markers of Alzheimer's disease (AD) via automated speech analysis.
We targeted semantic granularity and variability, two clinically sensitive domains.
Relative to controls, AD patients were impaired in and classified by both measures.
These results were not replicated in PD patients.
Our approach can reveal scalable, interpretable, condition‐specific markers of AD.

METHODS

Participants

We recruited 55 native Spanish speakers, with normal or corrected‐to‐normal hearing, from the Memory and Neuropsychiatry Clinic, hosted by Universidad de Chile and Hospital del Salvador, Chile. The sample comprised 21 ADD patients, 18 PD patients, and 16 HCs, reaching adequate power (Appendix A). Patients were diagnosed by expert neurologists following the National Institute of Neurological and Communicative Disorders and Stroke‐Alzheimer's Disease and Related Disorders Association clinical criteria for AD, and the United Kingdom Parkinson's Disease Society Brain Bank standards for PD. As in previous works, diagnoses were supported by extensive neurological, neuropsychological, and neuroimaging examinations. No patient reported a history of other neurological disorders, psychiatric conditions, primary language deficits, or substance abuse. Mean scores on the Montreal Cognitive Assessment fell below the cutoffs for dementia in the ADD group and for mild cognitive impairment in the PD group. ADD patients presented executive dysfunction, as established through the INECO Frontal Screening battery. PD patients had no symptoms of Parkinson‐plus and were assessed in the “on” phase of medication. HCs were cognitively preserved, functionally autonomous, and had no background of neuropsychiatric disease or drug abuse. All groups were matched for sex, age, and education. For demographic and neuropsychological details, see Table 1.

TABLE 1

Participants’ demographic and neuropsychological information

	ADD (n = 21)	PD (n = 18)	Controls (n = 16)	Statistics (all groups)	Pairwise comparison	MSE	P‐value
Demographic data
Sex (F:M)	13:8	10:8	13:3	χ² = 4.86, P = .1 ^a	–	–	–
Age	77.24 (6.47)	76.50 (6.40)	75.94 (4.35)	F = 0.21, P = .81 ^b	–	–	–
Years of education	11.24 (3.78)	9.39 (5.11)	12.94 (4.28)	F = 2.62, P = .08 ^b	–	–	–
Neuropsychological data
MoCA	13.90 (4.34)	20.33 (4.68)	25.07 (3.43)	F = 29.01, P < .001 ^b	ADD vs HCs	12.75	< .001 ^c
					PD vs HCs	29.39	.006 ^c
					ADD vs PD	23.27	< .001 ^c
IFS battery	11.07 (4.48)	17.08 (4.86)	18.90 (4.26)	F = 14.30, P < .001 ^b	ADD vs HCs	13.85	< .001 ^c
					PD vs HCs	57.72	.51 ^c
					ADD vs PD	18.98	< .001 ^c

Abbreviations: ADD, Alzheimer's disease dementia; PD, Parkinson's disease; MoCA, Montreal Cognitive Assessment; IFS, INECO Frontal Screening battery.

Data presented as mean (SD), with the exception of sex.

^a P‐values calculated via chi‐squared test (χ²).

^b P‐values calculated via independent measures ANOVA.

^c P‐values calculated via Tukey's HSD post hoc tests.

All participants provided written informed consent pursuant to the Declaration of Helsinki. The study was approved by the institutional ethics committee of the Memory and Neuropsychiatric Clinic, Neurology Department, Hospital del Salvador (7500000), SSMO & Faculty of Medicine, University of Chile.

Speech elicitation protocol

Participants performed seven naturalistic language tasks covering varied communicative behaviors. Four were spontaneous speech tasks, requiring participants to describe (1) their daily routine and (2) main interests, and to narrate (3) a pleasant and (4) an unpleasant memory. In these, discourse is driven by personal experience, allowing for varied linguistic patterns. The remaining three were semi‐spontaneous speech tasks, involving descriptions of (5) the modified Picnic Scene of the Western Aphasia Battery and (6) a picture of a family working in an unsafe kitchen, as well as (7) immediate recall and narration of a one‐minute silent animated film. These tasks elicit diverse and partly predictable linguistic patterns. Recordings were obtained in a quiet room on laptop computers with noise‐cancelling microphones, and saved as .wav files (44100 Hz, 16 bits) via Cool Edit Pro 2.0. Normal pace and volume were encouraged. Recordings were transcribed via an automatic speech‐to‐text service and manually revised. The rare occurrences of unintelligible words were discarded.

Speech data pre‐processing

Transcriptions were pre‐processed with Python's TreeTagger library and the AnCora Spanish corpus (http://clic.ub.edu/corpus/es/ancora). We converted all characters to lowercase and removed all punctuation marks and symbols. Each text was split into individual words. These were assigned part‐of‐speech tags and lemmatized (ie, converted to their base form). To maximize statistical power and feature diversity while capturing multiple linguistic scenarios, analyses were performed collapsing all tasks. Mean lemmatized word counts did not differ significantly (F[2,52] = 0.64, P = .53, ηp² = 0.02) among ADD patients (1,051; SD = 112), HCs (1,239; SD = 124), and PD patients (1,193; SD = 140).
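The normalization steps described above (lowercasing, removal of punctuation and symbols, splitting into words) can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the function name `normalize` and the punctuation set are assumptions, and part‐of‐speech tagging and lemmatization (performed with TreeTagger in the study) are deliberately left out.

```python
import string

# Assumed punctuation set: ASCII marks plus common Spanish marks and typographic quotes.
PUNCTUATION = string.punctuation + "¿¡«»“”‘’…"


def normalize(text: str) -> list[str]:
    """Lowercase a transcript, strip punctuation and symbols, and split it into words."""
    lowered = text.lower()
    # Punctuation is mapped to spaces (not deleted) so that "perro,el" does not merge.
    table = str.maketrans({ch: " " for ch in PUNCTUATION})
    return lowered.translate(table).split()


tokens = normalize("¿Qué estaba diciendo? No sé... El perro, el PERRO corría.")
print(tokens)
```

The resulting word list would then be passed to a tagger and lemmatizer before computing the measures.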

Measures

Semantic granularity

Granularity scores were computed via Python's NLTK library (https://www.nltk.org/) as an interface to WordNet's English lexical database (https://wordnet.princeton.edu/). WordNet includes over 155,000 words organized in synonym sets called "synsets." Roughly 80,000 correspond to nouns. These are grouped into a taxonomy that can be visualized as a hierarchical (directed, acyclic, non‐weighted) graph spanning hypernyms from above (eg, "animal") and hyponyms from below (eg, "dog"). The highest hypernym is "entity," with progressively less coarse terms appearing downstream (Figure 1B). A noun's granularity can be defined as the number of nodes separating it from "entity." Accordingly, general terms like "food" or "animal" have lower granularity scores than more precise terms such as "carrot" or "bulldog." Nouns were automatically identified with TreeTagger (Section 2.3), manually checked to avoid erroneous tagging, and automatically translated into English using WordNet. Granularity scores were assigned to each noun by considering its shortest path to "entity" (ie, the "synset" with the fewest nodes to "entity"). Nouns not included in WordNet's corpus (∼5.68% across texts) were discarded (rejected nouns did not differ significantly among groups, P = .93). For subsequent analyses, scores were stored in lists and converted to histograms using bins of increasing granularity, from 2 to 12 (bins 2, 3, and 4 reflect the number of nouns with granularity scores 2, 3, and 4, respectively, and so on). Bin 1 was not considered, since the word "entity" was not present in any text. Bin 12 included all words with granularity score 12 and the very few words with higher granularity (∼0.18% across texts). To avoid verbosity‐related confounds, bins were normalized by the total number of nouns.
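The scoring and binning logic can be illustrated with a small sketch. It uses a hypothetical mini‐taxonomy instead of WordNet so that it runs without external data; the names `HYPERNYMS`, `granularity`, and `granularity_histogram` are illustrative, and normalizing by the number of scored nouns (rather than all tagged nouns) is an assumption. With NLTK, a comparable score could presumably be derived from the smallest `Synset.min_depth()` value across a noun's synsets, plus one so that "entity" scores 1.

```python
from collections import Counter, deque

# Hypothetical mini-taxonomy (noun -> direct hypernyms); NOT WordNet's real structure.
HYPERNYMS = {
    "entity": [],
    "object": ["entity"],
    "organism": ["object"],
    "food": ["object"],
    "animal": ["organism"],
    "pet": ["animal"],
    "dog": ["animal", "pet"],
    "bulldog": ["dog"],
    "vegetable": ["food"],
    "carrot": ["vegetable"],
}


def granularity(noun):
    """Nodes on the shortest hypernym path up to 'entity' ('entity' itself scores 1)."""
    if noun not in HYPERNYMS:
        return None  # noun absent from the taxonomy: discarded downstream
    queue = deque([(noun, 1)])
    seen = {noun}
    while queue:  # breadth-first search guarantees the shortest path is found first
        node, depth = queue.popleft()
        if node == "entity":
            return depth
        for parent in HYPERNYMS[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None


def granularity_histogram(nouns, lowest=2, highest=12):
    """Proportion of scored nouns per bin; scores above `highest` fall in the last bin."""
    scores = [g for g in map(granularity, nouns) if g is not None]
    counts = Counter(min(g, highest) for g in scores if g >= lowest)
    total = len(scores)  # assumption: normalize by scored nouns only
    return {b: (counts[b] / total if total else 0.0) for b in range(lowest, highest + 1)}


hist = granularity_histogram(["animal", "dog", "bulldog", "carrot", "unknownword"])
print({b: v for b, v in hist.items() if v > 0})
```

In this toy graph "dog" has two hypernym paths, and the shorter one (through "animal") determines its score, mirroring the shortest‐path rule described above.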

Ongoing semantic variability

Ongoing semantic variability was analyzed with a FastText model (https://fasttext.cc/) pre‐trained with over 2,000,000 unique Spanish words from Common Crawl and Wikipedia corpora. The FastText model assigns a vector to each unique word in the vocabulary and is trained to map similar concepts to vectors that are close within the embedding. The distance between words can be quantified with the cosine of the angle between their assigned vectors: $\cos(\theta) = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert}$, for two vectors $\mathbf{u}$ and $\mathbf{v}$. As in previous works, the vector embedding was used to compute each text's ongoing semantic variability (Figure 1C). First, each pre‐processed text was represented as a series of vectors, $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N$, preserving the words’ sequential order. Second, the distances between adjacent vectors, $d_i = d(\mathbf{w}_i, \mathbf{w}_{i+1})$, were stored into a time series. Third, ongoing semantic variability was computed as the variance of the joint time series across speech tasks: $\sigma^2 = \frac{1}{M} \sum_{i=1}^{M} (d_i - \bar{d})^2$, with $\bar{d}$ representing the mean of all $d_i$ and $M$ the number of adjacent‐word distances. Thus, when adjacent words referred to concepts far apart in the embedding space, a text was typified by high semantic variability, reflecting discontinuous discourse. To avoid biases driven by disfluencies, hesitations, or word‐finding strategies, consecutive repeated words were omitted before the second step (a text consisting of a single repeated word would feature null variability). Ultimately, each participant's semantic variability across tasks was used for ANOVA and as a feature for machine learning analyses.
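The three steps can be sketched with the standard library alone. The vectors below are hypothetical three‐dimensional stand‐ins for FastText embeddings, the function names are illustrative, and the use of the population variance (`statistics.pvariance`) is an assumption, since the normalization of the variance is not spelled out in the text. Because the variance of a cosine series equals the variance of the corresponding one‐minus‐cosine series, the result does not depend on whether similarities or distances are stored.

```python
import math
from statistics import pvariance

# Hypothetical toy vectors standing in for pre-trained FastText embeddings.
TOY_VECTORS = {
    "perro": (0.9, 0.1, 0.0),
    "gato": (0.8, 0.2, 0.1),
    "animal": (0.7, 0.3, 0.1),
    "recordar": (0.0, 0.2, 0.9),
    "nombre": (0.1, 0.9, 0.3),
}


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def drop_consecutive_repeats(words):
    """Remove immediate repetitions ('el el perro' -> 'el perro')."""
    kept = []
    for word in words:
        if not kept or kept[-1] != word:
            kept.append(word)
    return kept


def semantic_variability(words, vectors=TOY_VECTORS):
    """Variance of the cosine series over adjacent word pairs."""
    known = [w for w in words if w in vectors]  # toy lookup: skip unknown words
    sequence = drop_consecutive_repeats(known)
    series = [cosine(vectors[a], vectors[b]) for a, b in zip(sequence, sequence[1:])]
    return pvariance(series) if series else 0.0  # no pairs: null variability


coherent = semantic_variability(["perro", "gato", "animal", "perro"])
digressive = semantic_variability(["perro", "recordar", "gato", "animal", "nombre", "perro"])
print(coherent < digressive)
```

A sequence that stays within one semantic neighborhood yields a nearly flat series and low variance, whereas interleaving unrelated words produces large swings between adjacent pairs.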

Statistical analysis

Between‐group comparisons were performed via one‐way ANOVAs, with Tukey's HSD tests for post hoc contrasts. Alpha levels were set at P < .05. Effect sizes were computed via partial eta squared (ηp²) for ANOVAs and with Cohen's d for pairwise comparisons. Given their different distributions and variances, each of the 12 measures (the 11 granularity bins, and the global measure of semantic variability) was framed as a separate dependent variable. No participant was detected as an outlier in any measure. Analyses were performed with the Pingouin Python library (https://pingouin‐stats.org/).
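For a one‐way ANOVA, partial eta squared can be recovered from the F statistic and its degrees of freedom, which offers a quick consistency check on reported effect sizes. The helper below is an illustrative sketch (the function name is an assumption, and this is not the Pingouin implementation); it applies the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values with 2 and 52 degrees of freedom, as in the group comparisons of this study.
for f_value in (5.43, 4.71, 4.24):
    print(f_value, round(partial_eta_squared(f_value, 2, 52), 2))
```

With three groups and 55 participants (2 and 52 degrees of freedom), these F values give 0.17, 0.15, and 0.14, respectively.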

Machine learning analysis

We implemented machine learning classifiers between ADD patients and HCs (to reveal candidate ADD markers) and between PD patients and HCs (to test whether predicted markers proved specific to ADD). A single model was trained for each contrast using the corresponding histograms of granularity and variability scores as input features. Analyses were based on a Gradient Boosting classifier, which surpasses the robustness of other algorithms. Scikit‐learn (https://scikit‐learn.org/) was used to implement the classifiers with 5000 estimators, a learning rate of 0.01, and a maximum of two features per split. For each iteration, data were randomly divided into three folds preserving the proportion of labels (stratified cross‐validation). Two folds were used for training and the other for testing, so that all folds were used once to test the classifier. Univariate feature selection was applied to the training set within each fold (the top five features were selected according to their ANOVA F‐value between groups). This process was repeated 1000 times with and without shuffling the target labels, and a P‐value was constructed by counting the number of times the area under the receiver operating characteristic curve (ROC AUC) value of the classifier with shuffled labels exceeded that obtained without shuffling, normalized by the total number of iterations. A feature importance score was constructed by counting the number of times a feature was selected based on its F‐value (Appendix B), divided by the number of folds multiplied by the number of iterations. Importantly, the number of features per participant (n = 12) was more than four times smaller than the number of participants (n = 55), and the feature selection procedure further reduced the number of features to five. This feature‐to‐sample ratio, combined with the stratified cross‐validation procedure, helped alleviate potential overfitting issues.
Classifier performance is reported as the mean and SD (extent of the shaded region) of the ROC curve across all 1000 iterations, both for shuffled and unshuffled labels, and as confusion matrices showing the proportion of correct/incorrect classifications in each class.
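The procedure above can be sketched with scikit‐learn as follows. This is a hedged illustration rather than the study's code: the function names are hypothetical, the per‐iteration AUC is taken as the mean over the three folds (an assumption), and the P‐value compares shuffled and unshuffled AUCs iteration by iteration (one possible reading of the description). Feature selection sits inside a pipeline so that it is fitted on the training folds only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline


def cross_validated_auc(X, y, seed, n_estimators=5000):
    """Mean ROC AUC over three stratified folds for one random split."""
    folds = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    aucs = []
    for train_idx, test_idx in folds.split(X, y):
        model = make_pipeline(
            SelectKBest(f_classif, k=5),  # top five features by ANOVA F-value
            GradientBoostingClassifier(
                n_estimators=n_estimators,
                learning_rate=0.01,
                max_features=2,
                random_state=seed,
            ),
        )
        model.fit(X[train_idx], y[train_idx])
        scores = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], scores))
    return float(np.mean(aucs))


def permutation_p_value(X, y, n_iterations=1000, n_estimators=5000, seed=0):
    """Fraction of iterations in which the shuffled-label AUC exceeds the real one."""
    rng = np.random.default_rng(seed)
    exceed = 0
    for i in range(n_iterations):
        real = cross_validated_auc(X, y, seed + i, n_estimators)
        shuffled = cross_validated_auc(X, rng.permutation(y), seed + i, n_estimators)
        exceed += int(shuffled > real)
    return exceed / n_iterations
```

Here `X` would be a participants‐by‐features NumPy array (12 columns in this study) and `y` a binary label vector. The defaults mirror the reported settings and are slow to run; smaller values of `n_iterations` and `n_estimators` are advisable for a quick trial.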

RESULTS

Statistical results

ADD patients exhibited lower semantic granularity scores than HCs and PD patients in most of the largest bins (8‐12), indicating scarcer use of hyponyms (Figure 2A). Significant group differences were found for bins 5 (F[2,52] = 5.43, P = .007, ηp² = 0.17) and 11 (F[2,52] = 4.71, P = .013, ηp² = 0.15). Post hoc analyses, via Tukey's HSD tests, revealed that ADD patients scored significantly higher than HCs in bin 5, a low granularity bin (P = .072, d = 0.73); and significantly lower than HCs in bin 11, a high granularity bin (P = .008, d = 1). Bin 5 also yielded significantly higher scores for PD patients than HCs (P = .003, d = 1.11). The remaining pairwise comparisons yielded non‐significant differences (all P‐values > .05).

FIGURE 2

Statistical differences in semantic granularity and ongoing semantic variability across diverse speech tasks. (A) Normalized values of semantic granularity for each bin. Relative to controls, ADD patients exhibited higher values in a low granularity bin (5) and lower values in a high granularity bin (11), suggesting greater reliance on hypernyms and reduced reliance on hyponyms. (B) Boxplot representation of ongoing semantic variability. Successive semantic choices proved significantly more variable in ADD patients than in HCs. Significant pairwise differences (P < .05) are indicated with a single asterisk (*) for the contrast between ADD patients and HCs, and with a double asterisk (**) for the contrast between PD patients and HCs. Abbreviations: ADD, Alzheimer's disease dementia; HCs, healthy controls; PD, Parkinson's disease.

Ongoing semantic variability results (Figure 2B) yielded a significant group effect (F[2,52] = 4.24, P = .02, ηp² = 0.14), with post hoc comparisons revealing significantly higher scores for ADD patients than HCs (P = .011, d = 0.97), alongside non‐significant differences for the remaining pairwise comparisons (HCs vs PD patients: P = .21, d = 0.58; ADD vs PD patients: P = .45, d = 0.39).

Machine learning results

Collapsing both measures, classification between ADD patients and HCs (Figure 3A) reached an AUC of .80 ± .06 (accuracy: .71 ± .11; sensitivity: .80 ± .15; precision: .73 ± .12). This AUC value was significantly higher (P = .022) than that obtained upon shuffling participants’ labels, which yielded chance levels (.49 ± .13) and lower scores across measures (accuracy: .52 ± .16; sensitivity: .64 ± .21; precision: .57 ± .18).

FIGURE 3

Classifications between patients and controls combining semantic granularity and ongoing semantic variability features across diverse speech tasks. The Gradient Boosting classifier successfully distinguished (A) ADD patients from HCs, but not (B) PD patients from HCs. The panels show normalized AUC histograms (left inset), average ROC curves (middle inset), and confusion matrices normalized by row and averaged across iterations (right inset). Real results are shown in blue, while results obtained upon shuffling participants’ labels are shown in red. Abbreviations: ADD, Alzheimer's disease dementia; AUC, area under the curve; HCs, healthy controls; PD, Parkinson's disease; ROC, receiver operating characteristic.

Conversely, classification between PD patients and HCs (Figure 3B) yielded an AUC of .65 ± .08 (accuracy: .60 ± .12; sensitivity: .61 ± .20; precision: .64 ± .15). This AUC value did not differ significantly (P = .16) from that obtained upon shuffling participants’ labels, which yielded chance values (.50 ± .13) and chance‐level results in other measures (accuracy: .50 ± .15; sensitivity: .56 ± .23; precision: .53 ± .20).

DISCUSSION

We examined potential markers of ADD via automated measures of semantic granularity and variability. Both measures discriminated ADD patients from HCs (based on ANOVAs) and allowed identifying them robustly on a subject‐level basis (based on machine learning). No such differentiations were present for PD patients relative to HCs. Below we discuss these findings.

Relative to HCs, ADD patients used more coarse and fewer precise concepts. This indicates reduced semantic granularity, a phenomenon observed in controlled tasks (eg, picture naming, category fluency) through standard measures (eg, correct responses). Our study suggests that increased reliance on hypernyms in ADD also typifies the patients’ natural speech. In this sense, reduced granularity has been proposed as a marker of diseases with primary semantic memory impairments. Indeed, abnormally coarse‐grained abstractions are also typical in semantic dementia patients, some of whose core atrophy regions (eg, hippocampus, temporal lobes) are also affected in AD. Accordingly, although several granularity bins showed substantial overlap between ADD patients and HCs, our automated granularity measure might capture subtle but informative disruptions.

ADD patients also presented greater semantic variability than HCs, indicating more discontinuous speech (eg, see Appendix C). Previous studies have reported reduced cohesion and coherence in AD, for example, by counting digressive utterances (or words) or unrelated adjacent utterances. Similar patterns are observed in persons with mild cognitive impairment, who are at increased risk for AD. Our study shows that dissimilar semantic relations also emerge between successive words. Specifically, the patients’ discourse abounded in interruptions and gap fillers via ready‐made phrases (eg, "I don't know," "I forget the name," "I don't remember"), in line with evidence that this population may overuse formulaic language.
Here, the FastText word‐vectorial representations revealed that such phrases deviate from their adjacent semantic choices, revealing further neuropsychological aspects of ADD.

The robustness of both measures was corroborated by machine learning results. Joint analysis of semantic granularity and variability features yielded an AUC of .80, correctly identifying 80% of HCs and 74% of ADD patients. These results surpass those from previous ASA studies targeting domains that are not markedly affected in AD, such as articulation or syntax. Importantly, classification results were near chance upon shuffling participants’ labels, indicating that these features do capture distinguishing properties of ADD rather than fortuitous differences between random samples. Briefly, semantic granularity and variability measures may contribute to revealing clinically relevant differences between ADD patients and HCs.

Importantly, the above results were partly specific to ADD. Except for one granularity bin, the features affected in ADD were preserved in PD. Likewise, classification between PD patients and HCs was near chance and non‐significantly different from that obtained via random groupings. This is a non‐trivial finding, since other verbal domains more systematically assessed in AD, such as semantic and phonemic fluency, are also frequently compromised in PD, limiting their use for disease differentiation. Yet, while we targeted PD patients on levodopa, as in previous works, semantic alterations in this disease are sensitive to medication status. New studies should explore whether the deficits observed in ADD remain specific when considering PD patients with varying levels of dopamine bioavailability. Still, our findings suggest that theoretically informed semantic measures may prove useful not only to identify specific brain diseases, but also to discriminate among them.
Previous ASA studies have often assessed unmotivated, heterogeneous domains in combination with feature importance techniques that favor classification outcomes over interpretability. While often successful in terms of classifier performance, this approach fails to capture features that can be readily aligned with mainstream clinical knowledge. In fact, diverse constellations of phonological and syntactic features might contribute to patient identification while challenging straightforward neuropsychological interpretation. Moreover, this evidence is hard to reconcile with abundant neuropsychological literature attesting to the preservation of such domains in AD. In contrast, we first identified linguistic domains consistently affected by the disease and then developed a pipeline to track them in natural discourse. By bridging the gap between well‐established deficits and cutting‐edge automated tools, our approach paves the way for more clinically relevant uses of ASA.

Moreover, our design overcomes key limitations of previous ASA research on AD and related diseases. Frequently, these studies are undermined by unbalanced samples and by poor or null control of sociodemographic confounds, such as sex, age, and education. Our strict group‐matching protocol circumvented major alternative explanations of our results (eg, higher education levels could entail richer vocabulary, potentially increasing semantic granularity). Moreover, while most previous works used isolated tasks or narrow combinations thereof, we used a range of spontaneous (autobiographical) and semi‐spontaneous (stimulus‐based) tasks, covering a rich repertoire of daily linguistic behaviors. Critically, this approach increases data quantity and variability across groups, avoiding over‐optimistic results from brief discourse samples.
While we obtained similar results even upon considering a single task, the one of longest duration (Appendix, section D), the present approach avoids important caveats while maximizing the representativeness of ASA.

This study attests to the usefulness of ASA as a complement for mainstream AD assessments. Standard evaluations of neurodegenerative conditions may prove expensive, yield examiner‐driven scores, and overlook spontaneous behavior. Conversely, ASA entails minimal costs, generating objective naturalistic data. Furthermore, speech tasks can be administered remotely, maximizing accessibility and equity for persons with reduced mobility or capacity to afford transportation costs. These possibilities open exciting avenues to further test our measures.

Yet, our work is not without limitations. First, although groups were balanced and in keeping with the field's typical Ns, their sizes were small. While this is a common hurdle in studies pursuing standardized, good‐quality speech samples, it would be important to replicate our work with more participants. Second, while the use of several speech tasks allows capturing diverse linguistic behaviors, it also increases test duration. This can be attenuated by having participants record themselves remotely, which could be especially promising for longitudinal assessments. Third, our study focused exclusively on Spanish speakers. Given that different languages may become differently affected by the same disease, cross‐linguistic studies would be critical towards more global approaches to ASA.

In sum, this study shows that ASA can be leveraged to yield differential and interpretable markers of ADD across diverse linguistic behaviors. ADD patients seem typified by reduced semantic granularity and higher ongoing semantic variability, both patterns being absent in PD patients.
By further targeting well‐established linguistic aspects of ADD through customized methods, ASA may boost the development of digital markers of dementia.
