Armen C Arevian1, Daniel Bone2, Nikolaos Malandrakis2, Victor R Martinez2, Kenneth B Wells1,3, David J Miklowitz1, Shrikanth Narayanan2.
Abstract
Individuals with serious mental illness experience changes in their clinical states over time that are difficult to assess and that result in increased disease burden and care utilization. It is not known if features derived from speech can serve as a transdiagnostic marker of these clinical states. This study evaluates the feasibility of collecting speech samples from people with serious mental illness and explores the potential utility for tracking changes in clinical state over time. Patients (n = 47) were recruited from a community-based mental health clinic with diagnoses of bipolar disorder, major depressive disorder, schizophrenia or schizoaffective disorder. Patients used an interactive voice response system for at least 4 months to provide speech samples. Clinic providers (n = 13) reviewed responses and provided global assessment ratings. We computed features of speech and used machine learning to create models of outcome measures trained using either population data or an individual's own data over time. The system was feasible to use, recording 1101 phone calls and 117 hours of speech. Most (92%) of the patients agreed that it was easy to use. The individually-trained models demonstrated the highest correlation with provider ratings (rho = 0.78, p<0.001). Population-level models demonstrated statistically significant correlations with provider global assessment ratings (rho = 0.44, p<0.001), future provider ratings (rho = 0.33, p<0.05), BASIS-24 summary score, depression sub score, and self-harm sub score (rho = 0.25, 0.25, and 0.28, respectively; p<0.05), and the SF-12 mental health sub score (rho = 0.25, p<0.05), but not with other BASIS-24 or SF-12 sub scores. This study brings together longitudinal collection of objective behavioral markers along with a transdiagnostic, personalized approach for tracking of mental health clinical state in a community-based clinical setting.
Year: 2020 PMID: 31940347 PMCID: PMC6961853 DOI: 10.1371/journal.pone.0225695
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Overview of longitudinal assessment and modeling methods.
(A) The MyCoachConnect (MCC) system was used to collect speech samples from patients calling into an interactive voice response application. Their providers then used a web application to review speech samples and submit global assessment ratings for each call. (B) Comparison of the two training methods used. The population-based machine learning model was trained using data from all participants in the study, excluding the test participant. The individualized machine learning model was trained on the participant’s own data, excluding the test speech sample.
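The two training schemes in panel B can be read as two ways of splitting the data into training and test sets. The sketch below is only an illustration of that reading, not the authors' pipeline: the function names, the `(participant_id, call_id)` sample layout, and the toy identifiers are all assumptions introduced here.

```python
from collections import defaultdict


def population_splits(samples):
    """Leave-one-participant-out splits.

    Yields (held_out_participant, train_indices, test_indices), where the
    training set contains every sample from all other participants.
    """
    participants = sorted({pid for pid, _ in samples})
    for held_out in participants:
        train = [i for i, (pid, _) in enumerate(samples) if pid != held_out]
        test = [i for i, (pid, _) in enumerate(samples) if pid == held_out]
        yield held_out, train, test


def individual_splits(samples):
    """Within-participant leave-one-sample-out splits.

    Yields (participant, train_indices, test_indices), where the training
    set contains only that participant's other speech samples.
    """
    by_participant = defaultdict(list)
    for i, (pid, _) in enumerate(samples):
        by_participant[pid].append(i)
    for pid in sorted(by_participant):
        indices = by_participant[pid]
        for test_index in indices:
            train = [j for j in indices if j != test_index]
            yield pid, train, [test_index]


# Toy data: hypothetical (participant_id, call_id) pairs.
samples = [("p1", 0), ("p1", 1), ("p1", 2), ("p2", 0), ("p2", 1)]
population = list(population_splits(samples))
individual = list(individual_splits(samples))
```

In the population scheme the test participant never contributes training data, while in the individualized scheme no other participant does. That contrast is what panel B compares.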
Fig 2. Patient-specific correlation patterns for speech features.
Correlation patterns between speech features and provider global assessment ratings for the top 25 features with the highest average correlation at the population level.
Sample characteristics.
| Participants | Total (n = 47) |
|---|---|
| Female (%) | 21 (45%) |
| Age, mean (SD) | 51.1 (12.5) |
| White, Non-Hispanic | 24 (51%) |
| African American | 18 (38%) |
| Hispanic | 5 (11%) |
| Bipolar disorder | 14 (30%) |
| Schizophrenia | 13 (28%) |
| Schizoaffective | 14 (30%) |
| Major depressive disorder | 15 (32%) |
| | 18 (38%) |
| BASIS-24 (n = 42) | 1.6 (0.5) |
| Depression and Functioning | 1.7 (0.8) |
| Interpersonal Problems | 2.6 (1.0) |
| Self-Harm | 0.2 (0.7) |
| Emotional Lability | 1.8 (1.1) |
| Psychosis | 1.2 (1.1) |
| Substance Use | 0.3 (0.3) |
| MCS-12 (n = 39) | 42 (11.5) |
| PCS-12 (n = 39) | 39 (6.4) |
Acoustic and linguistic feature correlations with provider global assessment.
| Feature | Set | Functional | Correlation |
|---|---|---|---|
| Negative emotion words | LIWC | % words | -0.36 |
| Positive emotion words | LIWC | % words | +0.34 |
| Valence | Lexical Norms | Mean | +0.32 |
| Negative | Lexical Norms | Mean | -0.32 |
| Positive | Lexical Norms | Max | +0.26 |
| Difficulty of words | Complexity | Mean | +0.21 |
| Religious words | LIWC | % words | +0.20 |
| Gender ladenness | Lexical Norms | Min | -0.20 |
| Arousal | Lexical Norms | Min | -0.19 |
| 2nd vocal formant | Acoustics | Mean | +0.18 |
| Sad words | LIWC | % words | -0.16 |
| Coherence (latent semantic analysis) | Complexity | Stdv. | +0.16 |
| SMOG Index | Complexity | Mean | +0.14 |
| Harmonicity | Acoustics | Median | -0.13 |
| Assent | LIWC | % words | +0.12 |
LIWC, Linguistic Inquiry and Word Count toolkit; SMOG, Simple Measure of Gobbledygook Index.
Fig 3. Covariance of speech features and clinical state over time.
(A) An example of clinical state (provider global assessment rating, black line) transitions within an individual patient over time compared to the individual’s highest performing linguistic feature (word count per speech sample, dotted grey line) for each call to the MCC system. (B) Increased correlations between clinical state and speech features over time highlighted through percent of maximal change of 8-period moving averages for provider rating (black line), word count (dotted grey line), and verbal pause percent (dotted light grey line) for the same patient and period.
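The smoothing described for panel B can be sketched as follows. The caption does not define "percent of maximal change", so the normalization here (change from the first smoothed point, scaled so the largest absolute change equals 100%) is an assumption, as are the function names. The moving average is taken as a trailing window.

```python
def moving_average(values, window=8):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


def percent_of_max_change(series):
    """Change from the first point, scaled so the largest |change| is 100.

    ASSUMPTION: this is one plausible reading of "percent of maximal
    change"; the source does not spell out the exact definition.
    """
    baseline = series[0]
    changes = [value - baseline for value in series]
    largest = max(abs(change) for change in changes)
    if largest == 0:
        return [0.0 for _ in changes]
    return [100.0 * change / largest for change in changes]


# Toy example with a 2-period window (the figure uses 8 periods).
smoothed = moving_average([1, 2, 3, 4], window=2)
scaled = percent_of_max_change(smoothed)
```

Putting provider ratings, word counts, and pause percentages on the same 0 to 100 scale is what lets series with different units share one axis.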
Clinical state tracking of provider global assessment ratings using speech features.
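| Training method | Assessment | Correlation (p value) |
|---|---|---|
| Population-based | Concurrent assessment | 0.44 (p<0.05)* |
| Population-based | Forecasting assessment | 0.33 (p<0.05)* |
| Individualized | Concurrent assessment | 0.78 (p<0.05)* |
| Individualized | Forecasting assessment | 0.62 (p>0.05) |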
a Correlation assessed using Spearman’s rank-order coefficient
b p values calculated using 2-tailed t-test compared to baseline models
Number of observations reported as n.
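For reference, Spearman's rank-order coefficient (footnote a) is the Pearson correlation of the ranked values, with tied values sharing their average rank. The sketch below is a minimal pure-Python illustration of that definition; the function names are introduced here, and it does not reproduce the baseline-model significance test from footnote b.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotonic relationship between model output and provider rating yields a rho of magnitude 1, whether or not it is linear.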