Son Doan, Amanda Ritchart, Nicholas Perry, Juan D Chaparro, Mike Conway.
Abstract
BACKGROUND: Stress is a contributing factor to many major health problems in the United States, such as heart disease, depression, and autoimmune diseases. Relaxation is often recommended in mental health treatment as a frontline strategy to reduce stress, thereby improving health conditions. Twitter is a microblog platform that allows users to post their own personal messages (tweets), including expressions of feelings and actions related to stress and stress management (eg, relaxing). While Twitter is increasingly used as a source of data for understanding mental health from a population perspective, the specific issue of stress, as manifested on Twitter, has not yet been the focus of any systematic study.
Keywords: Twitter; machine learning; natural language processing; relaxation; social media; stress
Year: 2017; PMID: 28611016; PMCID: PMC5487742; DOI: 10.2196/publichealth.5939
Source DB: PubMed; Journal: JMIR Public Health Surveill; ISSN: 2369-2960
Figure 1. Schema used to classify stress tweets.
Figure 2. Schema used to classify relaxation tweets.
Figure 3. Description of dataset 1.
Figure 4. Datasets and tasks used for machine learning.
Figure 5. Distribution by theme of first-hand experience stress tweets in dataset 1.
Figure 6. Distribution by topic of first-hand experience stress tweets in dataset 1.
Figure 7. Distribution by topic of first-hand experience relaxation tweets in dataset 1.
Classification evaluation using 10-fold cross-validation on dataset 1.
| Classification | Naive Bayes: Acc (%) | Naive Bayes: Sen (%) | Naive Bayes: Spec (%) | Naive Bayes: PPV (%) | Support vector machine (SVM, linear kernel): Acc (%) | SVM: Sen (%) | SVM: Spec (%) | SVM: PPV (%) |
| Stress vs nonstress | 78.64 | 91.97 | 65.30 | 72.69 | 81.66 | 92.73 | 70.61 | 76.07 |
| Relaxation vs nonrelaxation | 78.08 | 96.15 | 60.00 | 70.68 | 83.72 | 90.26 | 77.18 | 79.86 |
| First-hand vs nonfirst-hand experience stress | 87.58 | 95.53 | 67.89 | 88.14 | 85.61 | 90.64 | 73.16 | 89.32 |
| First-hand vs nonfirst-hand experience relaxation | 85.64 | 99.09 | 11.67 | 86.07 | 83.85 | 95.76 | 18.33 | 86.56 |
Acc: accuracy; Sen: sensitivity; Spec: specificity; PPV: positive predictive value.
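The four measures reported in the table follow directly from a 2x2 confusion matrix. The sketch below shows the standard definitions; the counts passed to the function are hypothetical and are not taken from the study, which reports only the resulting percentages.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity, and PPV as fractions.

    tp/fp/tn/fn are true-positive, false-positive, true-negative, and
    false-negative counts from a binary classifier.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all tweets labeled correctly
        "sensitivity": tp / (tp + fn),     # share of true positives recovered
        "specificity": tn / (tn + fp),     # share of true negatives recovered
        "ppv": tp / (tp + fp),             # share of positive predictions that are right
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=30, tn=70, fn=10)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Because accuracy mixes sensitivity and specificity in proportion to class prevalence, a classifier can show high accuracy alongside very low specificity when positives dominate, as in the first-hand relaxation row.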
Top 30 keywords ranked by information gain in stress and relaxation classification in dataset 1.
| Stress vs nonstress | First-hand stress vs nonstress | First-hand relaxation vs nonrelaxation | Relaxation vs nonrelaxation |
| stressed | http | rt | rt |
| stress | rt | relaxing | relaxing |
| rt | stressed | relaxin | relaxin |
| mistress | stressing | sorelaxing | sorelaxing |
| stressful | stressful | relaxed | relaxed |
| stressing | mistress | work | time |
| http | stressingout | night | work |
| stressingout | sostressed | time | night |
| cashnewvideo | stressin | day | day |
| camerondallas | cashnewvideo | shower | cashnewvideo |
| burdenofstress | school | cashnewvideo | relax |
| tiger | ly | camerondallas | shower |
| stressin | stress | finally | camerondallas |
| sostressed | camerondallas | bath | relaxa |
| day | day | relax | video |
| nashgrier | love | listening | finally |
| distressed | sostressful | beach | bath |
| school | college | relaxa | home |
| anxiety | packing | video | vacation |
| life | life | home | listening |
| busy | vacation | | beach |
| learn | tiger | pool | nashgrier |
| woods | hours | sitting | relaxar |
| bitch | big | enjoying | pool |
| hours | nashgrier | watching | enjoying |
| packing | distressed | rain | rain |
| hate | give | | long |
| haha | long | nashgrier | sitting |
| college | weeks | long | watching |
| love | figure | bed | nice |
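Ranking keywords by information gain can be sketched as follows. This is a minimal illustration that treats each keyword as a binary feature (present or absent in a tweet); the toy tweets and labels are hypothetical, and the source does not specify the tokenization or toolkit actually used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Information gain of the feature 'term occurs in the tweet'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


# Hypothetical, already-tokenized toy tweets (not from the study's data).
docs = [
    {"so", "stressed", "school"},
    {"stressed", "work"},
    {"rt", "mistress", "video"},
    {"rt", "tiger", "woods"},
]
labels = ["stress", "stress", "nonstress", "nonstress"]

vocab = sorted(set().union(*docs))
ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)
print(ranked[:5])
```

A term scores highly whenever it separates the classes, regardless of direction. That is why tokens such as "rt" or "http" can rank near the top: they may mark tweets that are mostly in the negative class.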
Number of tweets remaining after automatic classification.
| City | Stress rank, 2011 (2014) | No. of tweets | No. of tweets containing “relax” | No. of tweets containing “stress” | No. of relaxation tweets | No. of stress tweets | No. of relaxation tweets (first-hand) | No. of stress tweets (first-hand) |
| Los Angeles | 1 (3) | 6,627,969 | 5061 | 7925 | 3216 | 5914 | 2788 | 2386 |
| New York | 2 (1) | 8,229,442 | 6992 | 11,789 | 4412 | 8245 | 3766 | 3278 |
| San Diego | 5 (38) | 2,908,774 | 2178 | 3769 | 1449 | 2830 | 1275 | 1193 |
| San Francisco | 7 (39) | 4,372,966 | 2554 | 4558 | 1682 | 3384 | 1471 | 1389 |
Stress ranks are based on the 2011 Forbes [40] and 2014 CNN [41] studies. Statistical tests showed differences between cities (P<.001), except between San Diego and New York (stress: P=.18; relaxation: P=.02). P values for relaxation and stress tweets between San Diego and Los Angeles are .41 and <.001, respectively. Ranked by stress tweets, the cities are New York=San Diego, Los Angeles, and San Francisco.
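The ordering by stress tweets can be illustrated by normalizing each city's count by its total tweet volume. This is a sketch only: it assumes the denominator is the "No. of tweets" column, which the source does not state explicitly, and it uses the counts from the table above.

```python
# (total tweets, stress tweets after automatic classification), from the table above.
counts = {
    "Los Angeles": (6_627_969, 5914),
    "New York": (8_229_442, 8245),
    "San Diego": (2_908_774, 2830),
    "San Francisco": (4_372_966, 3384),
}


def per_10k(stress, total):
    """Stress tweets per 10,000 collected tweets."""
    return 10_000 * stress / total


rates = {city: per_10k(stress, total) for city, (total, stress) in counts.items()}

# Sort cities from highest to lowest stress-tweet rate.
ranking = sorted(rates, key=rates.get, reverse=True)
for city in ranking:
    print(f"{city}: {rates[city]:.2f} stress tweets per 10,000 tweets")
```

Under this assumption, New York and San Diego come out on top with similar rates, followed by Los Angeles and San Francisco, which is consistent with the ranking given in the note.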
Classification evaluation using a random sample of 200 tweets (100 containing the keyword “stress” and 100 containing the keyword “relax”) from New York in dataset 2.
| Classification (SVM, linear kernel) | Acc (%) | Sen (%) | Spec (%) | PPV (%) |
| Stress vs nonstress | 75.0 | 76.7 | 70.4 | 87.5 |
| Relaxation vs nonrelaxation | 66.0 | 67.4 | 57.1 | 90.6 |
| First-hand vs nonfirst-hand experience stress | 68.0 | 44.0 | 92.0 | 84.6 |
| First-hand vs nonfirst-hand experience relaxation | 92.0 | 87.5 | 100.0 | 100.0 |
Acc: accuracy; Sen: sensitivity; Spec: specificity; PPV: positive predictive value.
Figure 8. Description of manual annotation of 100 random tweets containing the keywords “stress” and “relax” from dataset 2.
Figure 9. Proportion of relaxation and stress tweets by city in dataset 2.
Figure 10. Stress theme distribution for each of the 4 cities in dataset 2. There are no significant differences between cities (P>.05). Neg: negative; Pos: positive; S: symptoms; T: topics.
Figure 11. Relaxation theme distribution for each of the 4 cities in dataset 2. There are significant differences between New York and the other cities in the topics of nature and water.
P values of pairwise comparisons of the proportion of stress and relaxation tweets between the 4 studied cities.
| City | Measure | Los Angeles | New York | San Francisco |
| San Diego | Stress | <.001 | .18 | <.001 |
| San Diego | Relaxation | .41 | .02 | <.001 |
| San Francisco | Stress | <.001 | <.001 | N/A |
| San Francisco | Relaxation | <.001 | <.001 | N/A |
| New York | Stress | <.001 | N/A | <.001 |
| New York | Relaxation | <.001 | N/A | <.001 |
N/A: not applicable.
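A pairwise comparison of two cities' proportions can be sketched with a two-proportion z-test. The source does not say which statistical test was used, so the test choice here is an assumption, as is the use of total collected tweets as the denominator. The counts are the San Diego and New York stress-tweet figures from the tweet-count table.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions x1/n1 and x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# San Diego vs New York: stress tweets over all collected tweets (assumed denominator).
z, p = two_proportion_z_test(2830, 2_908_774, 8245, 8_229_442)
print(f"z = {z:.2f}, P = {p:.2f}")
```

Under these assumptions, the San Diego vs New York stress comparison yields a P value close to the .18 shown in the table, though that agreement does not confirm this was the authors' procedure.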