Reno Kriz, Sunghye Cho, Sunny X Tang, Suh Jung Park, Jenna Harowitz, Raquel E Gur, Mahendra T Bhati, Daniel H Wolf, João Sedoc, Mark Y Liberman.
Abstract
Computerized natural language processing (NLP) allows for objective and sensitive detection of speech disturbance, a hallmark of schizophrenia spectrum disorders (SSD). We explored several methods for characterizing speech changes in SSD (n = 20) compared to healthy control (HC) participants (n = 11) and approached linguistic phenotyping on three levels: individual words, parts-of-speech (POS), and sentence-level coherence. NLP features were compared with a clinical gold standard, the Scale for the Assessment of Thought, Language and Communication (TLC). We utilized Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art embedding algorithm incorporating bidirectional context. Through the POS approach, we found that participants with SSD used more pronouns but fewer adverbs, adjectives, and determiners (e.g., "the," "a"). Analysis of individual word usage was notable for more frequent use of first-person singular pronouns among individuals with SSD and first-person plural pronouns among HC. There was a striking increase in incomplete words among SSD. Sentence-level analysis using BERT reflected increased tangentiality among SSD with greater sentence embedding distances. The SSD sample had low speech disturbance on average and there was no difference in group means for TLC scores. However, NLP measures of language disturbance appear to be sensitive to these subclinical differences and showed greater ability to discriminate between HC and SSD than a model based on clinical ratings alone. These intriguing exploratory results from a small sample prompt further inquiry into NLP methods for characterizing language disturbance in SSD and suggest that NLP measures may yield clinically relevant and informative biomarkers.
Year: 2021 PMID: 33990615 PMCID: PMC8121795 DOI: 10.1038/s41537-021-00154-3
Source DB: PubMed Journal: NPJ Schizophr ISSN: 2334-265X
Sample Characteristics.
| | HC | SSD | p | Cohen’s d |
|---|---|---|---|---|
| n | 11 | 20 | | |
| Cohort | | | 0.10 | |
| Cohort 1 | 5 | 15 | | |
| Cohort 2 | 6 | 5 | | |
| Age (mean years ± SD) | 35.6 ± 5.8 | 36.5 ± 7.2 | 0.75 | 0.12 |
| Sex (n, %) | | | | |
| Female | 7 (64%) | 9 (45%) | 0.32 | |
| Male | 4 (36%) | 11 (55%) | | |
| Race (n, %) | | | 0.12 | |
| African American | 3 (30%) | 13 (65%) | | |
| Asian | 0 (0%) | 1 (5%) | | |
| Caucasian | 7 (70%) | 6 (30%) | | |
| Education level | 15.8 ± 2.2 | 13.4 ± 2.5 | 0.01 | −1.00 |
| Recording duration (min) | 11.6 ± 2.2 | 12.7 ± 4.5 | 0.48 | 0.29 |
| Mean sentence length (words) | 17.5 ± 3.1 | 14.4 ± 4.3 | 0.04 | 0.81 |
| Word count | 1748.8 ± 448.0 | 1782.3 ± 908.2 | 0.92 | 0.04 |
| TLC Global Score | 0.0 ± 0.0 | 0.5 ± 1.0 | 0.13 | 0.56 |
| TLC Total Score | 0.9 ± 1.7 | 4.4 ± 9.2 | 0.10 | 0.46 |
| Next-sentence predictability | 0.96 ± 0.03 | 0.94 ± 0.04 | 0.25 | −0.44 |
TLC global score is an overall impression of speech and language disturbance based on standard anchors. TLC total score is summed using the published formula, Total = 2*(Sum of items 1–11) + (Sum of items 12–18). Next-sentence predictability is derived from BERT, with 0 indicating low predictability and 1 indicating high predictability. Additional details about the SSD participants are provided in Supplemental Table 1.
Bolded: categories; HC healthy control participants, SSD participants with schizophrenia spectrum disorder, TLC Scale for the Assessment of Thought, Language and Communication (Andreasen[26]).
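The TLC total score formula quoted in the table note can be sketched in a few lines. This is only an illustration of the arithmetic: the function name `tlc_total` and the convention of passing the 18 item ratings as an ordered list are assumptions, not part of the published scale or of the authors' code.

```python
def tlc_total(item_scores):
    """Return 2 * (sum of items 1-11) + (sum of items 12-18).

    `item_scores` is assumed to hold the 18 TLC item ratings in order,
    so index 0 is item 1 and index 17 is item 18 (a hypothetical layout).
    """
    if len(item_scores) != 18:
        raise ValueError("expected exactly 18 TLC item ratings")
    weighted = sum(item_scores[:11])    # items 1-11 count double
    unweighted = sum(item_scores[11:])  # items 12-18 count once
    return 2 * weighted + unweighted


# Made-up ratings for illustration only: item 1 rated 2, item 12 rated 1.
example = [2] + [0] * 10 + [1] + [0] * 6
total = tlc_total(example)
```

With these made-up ratings, the item rated 2 among items 1–11 contributes 4 and the item rated 1 among items 12–18 contributes 1.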
Fig. 1 Group effects on clinical language ratings and BERT next-sentence probability.
Individual scores with group median and interquartile range are shown. There were no significant group differences for (A) TLC Global Score (Cohen’s d = 0.55, p = 0.13), (B) TLC Total Score or Sum (Cohen’s d = 0.48, p = 0.10), and (C) BERT Next-Sentence Probability (Cohen’s d = −0.44, p = 0.25). The three SSD participants with TLC global scores ≥2 were identified as outliers.
Parts-of-speech frequencies in SSD and HC.
| Part of speech | HC (n = 11) | SSD (n = 20) | p | Cohen’s d |
|---|---|---|---|---|
| Adverb | 10.65 (0.95) | 8.11 (1.76) | 0.001 | 1.66 |
| Determiner | 7.50 (0.96) | 6.53 (1.25) | 0.03 | 0.83 |
| Adjective | 7.10 (1.57) | 6.19 (0.78) | 0.03 | 0.82 |
| Pronoun | 11.77 (1.47) | 13.41 (2.67) | 0.03 | −0.71 |
| Preposition | 8.84 (1.37) | 7.97 (1.41) | 0.08 | 0.62 |
| Particle | 2.65 (0.52) | 2.35 (0.50) | 0.08 | 0.59 |
| Conjunction | 5.33 (1.35) | 4.61 (1.40) | 0.29 | 0.53 |
| Noun | 13.16 (0.93) | 13.67 (2.16) | 0.57 | −0.28 |
| Interjection | 6.07 (1.66) | 6.35 (2.45) | 0.75 | −0.12 |
| Verb | 19.34 (1.74) | 19.55 (2.60) | 0.82 | −0.09 |
HC healthy control participants; SSD participants with schizophrenia spectrum disorder; Adverb word that modifies an adjective or verb; Determiner determines the kind of reference for a noun, e.g., the, this, a; Adjective word that modifies a noun; Pronoun word that refers to the self or another noun mentioned elsewhere, e.g., I, she, them; Preposition word expressing relation to another clause, e.g., on, after, for; Particle function word providing meaning to associated words, e.g., to run, ate up, talk over; Conjunction connector word, e.g., and, but, because; Noun a person, place, thing, state, or quality; Interjection utterance expressing emotion, e.g., ouch, ugh, hey; Verb word expressing action, state, or relation.
P-values shown for ANCOVA tests, co-varying for education level, study cohort, and demographic variables (age, sex, and race).
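As a rough check on how the effect sizes in this table relate to the group summaries, the sketch below computes Cohen's d from means, standard deviations, and group sizes. It assumes the common pooled-standard-deviation definition; the record does not state which variant the authors used, and the function name `cohens_d` is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for group A minus group B, using the pooled SD.

    Assumption: pooled SD with (n - 1) weighting of each group's variance.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Adverb row of the table: HC 10.65 (0.95), n = 11; SSD 8.11 (1.76), n = 20.
d_adverb = cohens_d(10.65, 0.95, 11, 8.11, 1.76, 20)
print(round(d_adverb, 2))
```

Under this assumption the adverb row reproduces the tabulated value of 1.66; other rows may differ in the second decimal because the published means and SDs are rounded.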
Fig. 2 Sentence embedding distance by interviewer-participant exchanges.
The average BERT sentence embedding difference between the original interviewer prompt and the participant’s response sentence, varying by distance from the original prompt. Responses from individuals with SSD began significantly farther from interviewer prompts relative to HC and traveled increasingly farther away, whereas those of HC did not. Fitting linear regressions to the trajectories, we find: SSD intercept = 0.260, 95% CI [0.257, 0.263]; SSD slope = 6.6e−4, 95% CI [2.6e−4, 1.1e−3]; HC intercept = 0.247, 95% CI [0.242, 0.252]; HC slope = 1.5e−5, 95% CI [−6.2e−4, 6.5e−4]. The analysis was repeated excluding the 3 SSD outliers with high TLC scores and the results were consistent.
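The intercept and slope reported in this legend are the two parameters of a straight-line fit of embedding distance against sentence position. A minimal sketch of such a fit by ordinary least squares follows. The data are synthetic: a noiseless line built from the reported SSD coefficients, with positions 1 to 10 chosen arbitrarily. The authors' actual data, position coding, and fitting software are not given in this record, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic illustration only: sentence positions after the prompt (assumed
# to start at 1) and distances lying exactly on the reported SSD line.
positions = list(range(1, 11))
distances = [0.260 + 6.6e-4 * k for k in positions]

intercept, slope = fit_line(positions, distances)
```

A positive slope, as reported for SSD, means responses drift away from the prompt as the response continues; a slope whose confidence interval spans zero, as reported for HC, is consistent with no drift.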
Fig. 3 Discrimination between SSD and HC group status.
Naive Bayes models with leave-one-out cross-validation were constructed for (A) Clinical features alone, from the Scale for the Assessment of Thought, Language and Communication (TLC), (B) Natural language processing-derived linguistic features alone, and (C) A combination of TLC and NLP-derived features. The NLP-only model performed better than the clinical-only model (accuracy 87% compared to 68%) and was similar to the model incorporating both NLP and clinical linguistic features (accuracy 81%).
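The evaluation scheme named in this legend, a Naive Bayes classifier scored with leave-one-out cross-validation, can be sketched as below. This is not the authors' implementation. It assumes a Gaussian Naive Bayes variant (the record does not say which variant was used), and the feature values are made up for illustration. All function names are hypothetical.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means, and variances for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps variances positive if a feature is constant.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = fit_gaussian_nb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        if predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Made-up two-feature data (not from the study), chosen to be well separated.
X = [[10.5, 7.4], [10.9, 7.6], [10.2, 7.5], [10.7, 7.3],
     [8.0, 6.4], [8.3, 6.6], [7.9, 6.5], [8.2, 6.7]]
y = ["HC"] * 4 + ["SSD"] * 4
accuracy = loocv_accuracy(X, y)
```

Leave-one-out cross-validation is a natural choice for a sample of 31 participants because every model is trained on all but one participant, though the resulting accuracy estimates can still vary considerably at this sample size.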