Lushi Chen, Tao Gong, Michal Kosinski, David Stillwell, Robert L Davidson.
Abstract
Subjective well-being includes 'affect' and 'satisfaction with life' (SWL). This study proposes a unified approach to construct a profile of subjective well-being based on social media language in Facebook status updates. We apply sentiment analysis to generate users' affect scores, and train a random forest model to predict SWL using affect scores and other language features of the status updates. Results show that: the computer-selected features resemble the key predictors of SWL as identified in early studies; the machine-predicted SWL is moderately correlated with the self-reported SWL (r = 0.36, p < 0.01), indicating that language-based assessment can constitute valid SWL measures; the machine-assessed affect scores resemble those reported in a previous experimental study; and the machine-predicted subjective well-being profile can also reflect other psychological traits like depression (r = 0.24, p < 0.01). This study provides important insights for psychological prediction using multiple, machine-assessed components and longitudinal or dense psychological assessment using social media language.
Year: 2017 PMID: 29135991 PMCID: PMC5685571 DOI: 10.1371/journal.pone.0187278
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Satisfaction with life prediction model.
We use Elastic Net regression to select informative features among the sentiment, LIWC, and LDA-generated topic features for the random forest model. The model is trained to fit the self-reported SWL score.
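For orientation, the standard Elastic Net objective is reproduced below. This is the usual textbook (glmnet-style) parameterization, not a formula taken from the paper, and the penalty strength $\lambda$ and mixing weight $\alpha$ used by the authors are not stated here. In this notation, $y_i$ would be the self-reported SWL of user $i$ and $x_i$ the vector of candidate language features.

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\; \frac{1}{2n}\sum_{i=1}^{n}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2 + \lambda\left(\alpha\,\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right)
$$

The $\ell_1$ term drives some coefficients exactly to zero, so a natural selection rule is to keep only the features with a nonzero coefficient and pass those to the random forest.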
Correlation matrix of affect features and self-reported SWL.
| | 1. | 2. | 3. | 4. | 5. |
|---|---|---|---|---|---|
| 1. self-reported SWL | | | | | |
| 2. sentiment | 0.21 | | | | |
| 3. positive frequency | 0.08 | 0.68 | | | |
| 4. negative frequency | -0.23 | -0.50 | 0.12 | | |
| 5. positive / negative | 0.16 | 0.68 | 0.45 | -0.48 | |
All the results have p values less than 0.001.
Sentiment: mean sentiment score of a user; positive frequency: proportion of positive statuses among all statuses of a user; negative frequency: proportion of negative statuses among all statuses of a user; positive/negative: ratio between positive and negative frequency.
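The four affect features defined above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `affect_features` is hypothetical, and it assumes a status counts as positive when its sentiment score is above zero and negative when below zero, a cut-off the text does not specify.

```python
from statistics import mean


def affect_features(scores):
    """Summarise one user's per-status sentiment scores.

    Assumption: score > 0 means a positive status, score < 0 a negative
    one. The source does not state the actual thresholds.
    """
    if not scores:
        raise ValueError("need at least one status update")
    n = len(scores)
    pos_freq = sum(s > 0 for s in scores) / n
    neg_freq = sum(s < 0 for s in scores) / n
    # The ratio is undefined for a user with no negative statuses.
    ratio = pos_freq / neg_freq if neg_freq > 0 else None
    return {
        "sentiment": mean(scores),
        "positive_frequency": pos_freq,
        "negative_frequency": neg_freq,
        "positive_negative": ratio,
    }


# Hypothetical sentiment scores for five status updates of one user.
features = affect_features([0.5, -0.25, 0.0, 0.75, -0.5])
print(features)
```

Returning `None` for the ratio when a user has no negative statuses is one possible convention; how such users were handled in the study is not described here.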
Correlations between the prediction performance of the random forest models using different features and self-reported SWL.
| Feature set | r | p | RMSE |
|---|---|---|---|
| Baseline (1) | 0.001 | 0.97 | 1.37 |
| LIWC (13) | 0.29 | 1.3e-15 | 1.32 |
| selected LDA (117) | 0.33 | < 2.2e-16 | 1.32 |
| selected LDA + sentiment (120) | 0.34 | < 2.2e-16 | 1.31 |
| selected LDA + selected LIWC + sentiment (133) | 0.36 | < 2.2e-16 | 1.30 |
The baseline model uses the median of the self-reported SWL with variation as a feature. Root mean square error (RMSE) is relative to the range of SWL scores in the full dataset, 1.2 to 6.8. Numbers within brackets in the ‘feature set’ column are the numbers of features in those sets.
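The two evaluation quantities in the table, the Pearson correlation between predicted and self-reported scores and the RMSE, can be computed as in the sketch below. The function names and the toy data are illustrative assumptions, not values from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant sequence has no variance, so r is undefined.
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def rmse(predicted, observed):
    """Root mean square error of predictions against observations."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


# Hypothetical self-reported and machine-predicted SWL for three users.
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
print(pearson_r(predicted, observed), rmse(predicted, observed))
```

Read against the stated SWL range of 1.2 to 6.8 (a span of 5.6 points), the best model's RMSE of 1.30 corresponds to roughly 23% of the range (1.30 / 5.6 ≈ 0.23).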
Correlations between the random forest predicted and self-reported CES-D.
| Feature set | r | p | RMSE |
|---|---|---|---|
| Baseline (1) | -0.02 | 0.689 | 9.15 |
| sentiment (3) | 0.08 | 0.381 | 8.45 |
| self-reported SWL + sentiment (4) | 0.28 | 0.001 | 7.90 |
| machine-predicted SWL + sentiment (4) | 0.25 | 0.005 | 7.96 |
The baseline model uses the median of the self-reported SWL with variation as a feature. The significance of the correlation (p value) and the root mean squared error (RMSE) are also provided. Numbers within brackets in the ‘feature set’ column are the numbers of features in those sets.
Fig 2. Activity sentiment scores.
We compare the Facebook activity sentiment scores (FB activities (z-scores)) with the activity sentiment scores in the experience sampling study [11] (experience sampling (z-scores)). Since some activities were not included in the experience sampling study, the corresponding columns are empty.
Facebook activity sentiment.
| activities | FB sentiment (z-scores) | experience sampling (z-scores) |
|---|---|---|
| religion | 0.57 | |
| holiday | 0.39 | |
| talk to friend | 0.39 | 0.35 |
| chores | 0.30 | -0.21 |
| family | 0.27 | |
| meal | 0.29 | 0.19 |
| school | -0.03 | -0.21 |
| maths | -0.15 | -0.25 |
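Both columns are reported as z-scores, which puts sentiment measured on different instruments onto a common scale. A minimal sketch of that standardization follows; the function name `z_scores`, the raw values, and the choice of the population standard deviation are assumptions, since the text does not say how the scores were standardized.

```python
from statistics import mean, pstdev


def z_scores(raw):
    """Standardise a mapping of activity -> mean sentiment.

    Assumption: the population standard deviation is used; the source
    does not say whether a sample or population estimate was taken.
    """
    values = list(raw.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return {activity: (v - mu) / sigma for activity, v in raw.items()}


# Hypothetical raw mean sentiment per activity (not data from the study).
raw_sentiment = {"holiday": 0.5, "meal": 0.25, "maths": 0.0}
print(z_scores(raw_sentiment))
```

After standardization, an activity at the overall mean scores zero, and positive or negative values indicate how many standard deviations above or below the mean its sentiment lies.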