Yan Xu, Yue Wang, Jiahua Liu, Zhuowen Tu, Jian-Tao Sun, Junichi Tsujii, Eric Chang.
Abstract
OBJECTIVE: To create a sentiment classification system for the Fifth i2b2/VA Challenge Track 2 that can identify thirteen subjective categories and two objective categories.
Keywords: sentiment analysis; spanning n-gram; suicide note; supervised approach; web data
Year: 2012 PMID: 22879758 PMCID: PMC3409493 DOI: 10.4137/BII.S8956
Source DB: PubMed Journal: Biomed Inform Insights ISSN: 1178-2226
Figure 1. System architecture diagram.
Suicide note emotions and similar LiveJournal moods.
| Suicide note emotion | Similar LiveJournal moods |
| --- | --- |
| Anger | Annoyed, aggravated, angry, pissed off |
| Abuse | Embarrassed |
| Blame | Rejected, annoyed |
| Fear | Scared |
| Forgiveness | (No corresponding mood) |
| Guilt | Guilty |
| Happiness_peacefulness | Happy, cheerful, peaceful |
| Hopefulness | Hopeful, optimistic |
| Hopelessness | Depressed, crushed, frustrated |
| Love | Loved |
| Pride | Accomplished |
| Sorrow | Sad, gloomy |
| Thankfulness | Thankful, grateful |
Figure 2. Objective information generalization.
10-fold cross-validation micro-averaged results using bag of 1–4-gram features as baselines (the categories abuse, anger, blame, and pride are excluded to avoid false positives).
| Classifier | Precision | Recall | F-measure |
| --- | --- | --- | --- |
| SVM | 0.7767 | 0.2772 | 0.4085 |
| SVM (tune threshold of each category) | 0.4779 | 0.5325 | |
| Naïve Bayes | 0.5481 | 0.4088 | 0.4683 |
| Boosting | 0.6497 | 0.3493 | 0.4543 |
Notes:
A multiclass boosting algorithm [26] is used. Weak classifiers: decision trees (depth = 2); number of iterations: 500. Default parameters are used for the baseline experiments.
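The "tune threshold of each category" row above trades precision for recall by moving each category's decision threshold. This record does not say how the thresholds were chosen. The following is a minimal sketch, assuming each threshold is picked to maximize F-measure on held-out classifier scores. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
def f_from_counts(tp, fp, fn):
    """F-measure from raw counts; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Return (threshold, f_measure) maximizing F-measure for one category.

    scores: classifier decision values, one per sentence
    labels: True where the sentence carries this category
    A sentence is predicted positive when score >= threshold.
    """
    best = (float("inf"), 0.0)  # start from "predict nothing"
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        f = f_from_counts(tp, fp, fn)
        if f > best[1]:
            best = (t, f)
    return best


# Hypothetical decision values for a single category (illustration only).
scores = [-1.2, -0.4, -0.1, 0.3, 0.9]
labels = [False, True, False, True, True]
threshold, f_value = best_threshold(scores, labels)
```

On this toy data the selected threshold falls below zero, which admits one more true positive at the cost of one false positive. That is the same direction of change as in the table, where tuning lowers precision and raises recall.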
Micro-averaged results for fifteen categories on test data (evaluating the use of the framework).
| Configuration | Precision | Recall | F-measure |
| --- | --- | --- | --- |
| 1–4 grams | 0.4911 | 0.5204 | 0.5053 |
| Dividing categories into three groups | 0.5305 | 0.5818 | |
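The three values in each complete results row are consistent with precision, recall, and F-measure, where F-measure is the harmonic mean of the first two. The sketch below checks this for the "1–4 grams" row. The function name is hypothetical, and small differences from the reported figures can arise from rounding.

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Row "1–4 grams": precision 0.4911, recall 0.5204, reported F-measure 0.5053.
baseline_f = f_measure(0.4911, 0.5204)
matches_reported = abs(baseline_f - 0.5053) < 1e-3
```

The same function can be applied to a row whose third value is missing from this record, but the result is then only the value implied by the precision and recall shown, not a figure reported by the authors.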
Micro-averaged results for eight subjective categories on test data (Evaluating spanning n-gram features and feature selection).
| Features | Precision | Recall | F-measure |
| --- | --- | --- | --- |
| 1–4 grams | 0.4993 | 0.5426 | 0.5201 |
| unigram + spanning n-grams, not selected | 0.5199 | 0.5469 | 0.5331 |
| unigram + spanning n-grams, selected | 0.5180 | 0.5815 | 0.5479 |
Micro-averaged results for objective categories on test data (Evaluating item/location normalization and eBay knowledge).
| Category, setting | Precision | Recall | F-measure |
| --- | --- | --- | --- |
| Information | 0.2798 | 0.5865 | 0.3789 |
| Information, normalization w/o eBay | 0.3613 | 0.4135 | 0.3857 |
| Information, normalization w/ eBay | 0.3313 | 0.5288 | |
| Instructions | 0.6241 | 0.6649 | 0.6439 |
| Instructions, normalization w/o eBay | 0.6675 | 0.6832 | 0.6753 |
| Instructions, normalization w/ eBay | 0.6530 | 0.7094 | |
Micro-averaged results for sentiment analysis in suicide notes.
| Category | Precision | Recall | F-measure | Precision | Recall | F-measure | Precision | Recall | F-measure |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Abuse | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Anger | 0.17 | 0.04 | 0.06 | 0.17 | 0.04 | 0.06 | 0.20 | 0.08 | 0.11 |
| Blame | 0.44 | 0.18 | 0.25 | 0.44 | 0.18 | 0.25 | 0.44 | 0.18 | 0.25 |
| Fear | 0.30 | 0.23 | 0.26 | 0.30 | 0.23 | 0.26 | 0.33 | 0.31 | 0.32 |
| Forgiveness | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Guilt | 0.49 | 0.42 | 0.45 | 0.50 | 0.43 | 0.46 | 0.49 | 0.54 | 0.51 |
| Happiness_peacefulness | 1.00 | 0.50 | 0.67 | 1.00 | 0.50 | 0.67 | 1.00 | 0.69 | 0.81 |
| Hopefulness | 0.24 | 0.21 | 0.23 | 0.26 | 0.21 | 0.23 | 0.25 | 0.24 | 0.24 |
| Hopelessness | 0.52 | 0.66 | 0.58 | 0.54 | 0.66 | 0.59 | 0.64 | 0.66 | 0.65 |
| Information | 0.33 | 0.53 | 0.41 | 0.34 | 0.66 | 0.45 | 0.36 | 0.66 | 0.47 |
| Instructions | 0.65 | 0.71 | 0.68 | 0.65 | 0.73 | 0.69 | 0.69 | 0.72 | 0.71 |
| Love | 0.71 | 0.68 | 0.70 | 0.72 | 0.69 | 0.70 | 0.70 | 0.75 | 0.72 |
| Pride | 0.40 | 0.22 | 0.29 | 0.40 | 0.22 | 0.29 | 0.33 | 0.22 | 0.27 |
| Sorrow | 0.14 | 0.24 | 0.17 | 0.14 | 0.24 | 0.17 | 0.14 | 0.26 | 0.18 |
| Thankfulness | 0.45 | 0.84 | 0.59 | 0.45 | 0.84 | 0.59 | 0.46 | 0.89 | 0.61 |
| Micro-averaged | 0.53 | 0.58 | | 0.53 | 0.60 | | 0.56 | 0.62 | |
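Micro-averaging pools true positives, false positives, and false negatives over all categories before computing precision, recall, and F-measure, so frequent categories weigh more than rare ones. The sketch below illustrates the calculation. The function name and the per-category counts are hypothetical and are not data from the paper.

```python
def micro_average(counts):
    """Micro-averaged (precision, recall, F-measure).

    counts: dict mapping category -> (tp, fp, fn)
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts: two frequent categories and one rare category
# for which the classifier never predicts correctly.
counts = {
    "instructions": (60, 30, 40),
    "love": (30, 10, 20),
    "abuse": (0, 0, 5),
}
precision, recall, f = micro_average(counts)
```

In this toy example the rare category contributes only five false negatives to the pooled totals, so its score of zero lowers the micro-averaged recall slightly rather than pulling the average toward zero as an unweighted mean over categories would.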