Tim Althoff, Kevin Clark, Jure Leskovec.
Abstract
Mental illness is one of the most pressing public health issues of our time. While counseling and psychotherapy can be effective treatments, our knowledge about how to conduct successful counseling conversations has been limited due to lack of large-scale data with labeled outcomes of the conversations. In this paper, we present a large-scale, quantitative study on the discourse of text-message-based counseling conversations. We develop a set of novel computational discourse analysis methods to measure how various linguistic aspects of conversations are correlated with conversation outcomes. Applying techniques such as sequence-based conversation models, language model comparisons, message clustering, and psycholinguistics-inspired word frequency analyses, we discover actionable conversation strategies that are associated with better conversation outcomes.
Year: 2016 PMID: 28344978 PMCID: PMC5361062
Source DB: PubMed Journal: Trans Assoc Comput Linguist ISSN: 2307-387X
Basic dataset statistics.
| Dataset statistics | |
|---|---|
| Conversations | 80,885 |
| Conversations with survey response | 15,555 (19.2%) |
| Messages | 3.2 million |
| Messages with survey response | 663,026 (20.6%) |
| Counselors | 408 |
| Messages per conversation* | 42.6 |
| Words per message | 19.2 |
Rows marked with * are computed over conversations with survey responses.
Frequencies and success rates for the nine most common conversation issues (NA: Not available). On average, more and less successful counselors face the same distribution of issues.
| | NA | Depressed | Relationship | Self harm | Family | Suicide | Stress | Anxiety | Other |
|---|---|---|---|---|---|---|---|---|---|
| Success rate | 0.556 | 0.612 | 0.659 | 0.672 | 0.711 | 0.573 | 0.696 | 0.671 | 0.537 |
| Frequency | 0.200 | 0.200 | 0.089 | 0.074 | 0.071 | 0.063 | 0.041 | 0.039 | 0.035 |
| Frequency with more | 0.203 | 0.199 | 0.089 | 0.067 | 0.072 | 0.061 | 0.048 | 0.042 | 0.030 |
| Frequency with less | 0.223 | 0.208 | 0.087 | 0.070 | 0.067 | 0.056 | 0.030 | 0.032 | 0.028 |
Figure 1. Differences in counselor message length (in #tokens) over the course of the conversation are larger between more and less successful counselors (blue circle/red square) than between positive and negative conversations (solid/dashed). Error bars in all plots correspond to bootstrapped 95% confidence intervals using the member bootstrapping technique from Ren et al. (2010).
Figure 2. More successful counselors are more varied in their language across positive/negative conversations, suggesting they adapt more. All differences between more successful and less successful counselors except for the 0–20 bucket were found to be statistically significant (p < 0.05; bootstrap resampling test).
Figure 3. More ambiguous situations (length of situation setter) are less likely to result in positive conversations.
Figure 4. All counselors react to short, ambiguous messages by writing more (relative to the texter message) but more successful counselors do it more than less successful counselors.
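The error bars described in the Figure 1 caption are bootstrapped 95% confidence intervals. As a rough illustration of the general idea only, the sketch below computes a plain percentile bootstrap interval for a mean. It is not the member bootstrapping technique of Ren et al. (2010) that the caption names; the function name `bootstrap_ci` and the token counts are hypothetical.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` over `values` (illustrative only)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical counselor message lengths (in tokens), not data from the study.
message_lengths = [12, 15, 9, 20, 14, 11, 18, 16, 13, 17]
low, high = bootstrap_ci(message_lengths)
print(f"mean={statistics.mean(message_lengths):.1f}, 95% CI=({low:.1f}, {high:.1f})")
```

A percentile interval is the simplest variant; resampling schemes that respect grouping (for example by counselor) would need a different resampling step.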
Differences between more and less successful counselors (C; More S. and Less S.) in responses to nearly identical situation setters (Sec. 6.1) by the texter (T).
| | More S. | Less S. | Test |
|---|---|---|---|
| % conversations successful | 70.7 | 51.7 | |
| #messages in conversation | 57.0 | 46.7 | |
| Situation setter length (#tokens) | 12.1 | 10.7 | |
| C response length (#tokens) | 15.8 | 11.8 | |
| T response length (#tokens) | 20.4 | 18.8 | |
| % Cosine sim. C resp. to context | 11.9 | 14.8 | |
| % Cosine sim. T resp. to context | 7.6 | 7.3 | |
| % C resp. w check question | 12.6 | 4.1 | |
| % C resp. w suicide check | 13.5 | 10.3 | |
| % C resp. w thanks | 6.3 | 2.4 | |
| % C resp. w hedges | 41.4 | 36.8 | |
| % C resp. w surprise | 3.3 | 2.8 | |
Last column contains significance levels of Wilcoxon Signed Rank Tests (*** p < 0.001, – p > 0.05).
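The comparison above relies on Wilcoxon signed-rank tests over paired observations (the same situation setter answered by a more and a less successful counselor). The following is a minimal sketch of that test using the normal approximation, without tie or continuity corrections; it is not the authors' code, and the function name and sample values are hypothetical.

```python
import math


def wilcoxon_signed_rank(xs, ys):
    """Return (W+, W-, two-sided p) for paired samples via the normal approximation."""
    # Zero differences carry no sign information and are dropped.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, w_minus, p_value


# Hypothetical paired response lengths (tokens), not data from the study.
more_successful = [16, 15, 18, 14, 17, 19, 15, 16, 18, 17]
less_successful = [12, 11, 13, 12, 12, 13, 11, 12, 14, 12]
print(wilcoxon_signed_rank(more_successful, less_successful))
```

For small samples an exact null distribution is preferable to the normal approximation used here.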
Figure 5. More successful counselors use less common/templated responses (after the texter first explains the situation). This suggests that they respond in a more creative way. There is no significant difference between positive and negative conversations.
Figure 6. Our conversation model generates a particular conversation C_k by first generating a sequence of hidden states S_0, S_1, … according to a Markov model. Each state S_i then generates a message as a bag of words W_{i,0}, W_{i,1}, … according to a unigram language model W_S.
Figure 7. Allowed state transitions for the conversation model. Counselor and texter messages are produced by distinct states and conversations must progress through the stages in increasing order.
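The generative process in the two captions above can be sketched as follows. This is an illustration under several assumptions that the captions do not state: a state is a (stage, speaker) pair, a conversation starts in stage 1, each transition either stays in the current stage or advances by exactly one, and transition probabilities are uniform (the actual model would learn them). All names and the toy vocabulary are hypothetical.

```python
import random

N_STAGES = 5
SPEAKERS = ("counselor", "texter")


def allowed_next(state):
    """States reachable from `state`: same stage or the next one, either speaker."""
    stage, _ = state
    next_stages = [stage] if stage == N_STAGES else [stage, stage + 1]
    return [(s, who) for s in next_stages for who in SPEAKERS]


def generate_conversation(unigram, n_messages, words_per_message, rng):
    """Sample hidden states from a Markov chain; each state emits a bag of words."""
    state = (1, rng.choice(SPEAKERS))  # assumption: conversations start in stage 1
    conversation = []
    for _ in range(n_messages):
        vocab, weights = zip(*unigram[state].items())
        # Unigram emission: words are drawn independently given the state.
        words = rng.choices(vocab, weights=weights, k=words_per_message)
        conversation.append((state, words))
        state = rng.choice(allowed_next(state))  # assumption: uniform transitions
    return conversation


# Toy per-state unigram distributions with placeholder words.
toy_unigram = {
    (stage, who): {f"{who}_word{stage}_{i}": 1.0 for i in range(3)}
    for stage in range(1, N_STAGES + 1)
    for who in SPEAKERS
}

for (stage, who), words in generate_conversation(toy_unigram, 8, 4, random.Random(0)):
    print(stage, who, words)
```

Keeping counselor and texter states separate gives each speaker its own word distribution within a stage, while the restricted transitions prevent the sampled stage sequence from moving backwards.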
The top 5 words for counselors and texters with greatest increase in likelihood of appearing in each stage. The model successfully identifies interpretable stages consistent with counseling guidelines (qualitative interpretation based on stage assignment and model parameters; only words occurring more than five hundred times are shown).
| Stage | Interpretation | Top words for texter | Top words for counselor |
|---|---|---|---|
| 1 | Introductions | hi, hello, name, listen, hey | hi, name, hello, hey, brings |
| 2 | Problem introduction | dating, moved, date, liked, ended | gosh, terrible, hurtful, painful, ago |
| 3 | Problem exploration | knows, worry, burden, teacher, group | react, cares, considered, supportive, wants |
| 4 | Problem solving | write, writing, music, reading, play | hobbies, writing, activities, distract, music |
| 5 | Wrap up | goodnight, bye, thank, thanks, appreciate | goodnight, 247, anytime, luck, 24 |
Figure 8. More successful counselors are quicker to get to know the texter and the issue (stage 2) and use more of their time in the “problem solving” phase (stage 4).
Figure 9. A: Throughout the conversation there is a shift from talking about the past to future, where in positive conversations this shift is greater; B: Texters that talk more about others more often feel better after the conversation; C: More positive sentiment by the texter throughout the conversation is associated with successful conversations.
Figure 10. Prediction accuracies vs. percent of the conversation seen by the model (without texter features).
Performance of nested models predicting conversation outcome given the first 80% of the conversation. In bold: full models with only counselor features and with additional texter features.
| Features | ROC AUC |
|---|---|
| Counselor unigrams only | 0.630 |
| Counselor unigrams and bigrams only | 0.638 |
| None | 0.5 |
| + hedges | 0.514 (+0.014) |
| + check questions | 0.546 (+0.032) |
| + similarity to last message | 0.553 (+0.007) |
| + duration of each stage | 0.561 (+0.008) |
| + sentiment | 0.590 (+0.029) |
| + message length | 0.596 (+0.006) |
| + stages feature conjunction | 0.606 (+0.010) |
| + counselor unigrams and bigrams | |
| + texter unigrams and bigrams | |
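The table reports ROC AUC, which can be read as the probability that a randomly chosen positive conversation receives a higher predicted score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name and example values are hypothetical, and this is not the evaluation code used in the study.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


# Hypothetical outcomes (1 = positive conversation) and model scores.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 3 of 4 pairs ranked correctly: 0.75
print(roc_auc([1, 0, 1, 0], [0.3, 0.3, 0.3, 0.3]))  # constant score: 0.5
```

A model with no features assigns every conversation the same score, which is why the "None" row sits at the chance level of 0.5.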