Sanguk Lee, Siyuan Ma, Jingbo Meng, Jie Zhuang, Tai-Quan Peng.
Abstract
Despite the popularity and efficiency of dictionary-based sentiment analysis (DSA) in public health research, little empirical evidence exists about the validity of DSA or about potential threats to that validity. A random sample of a second-hand Ebola tweet dataset was used to evaluate the validity of DSA against manual coding and to examine how textual features influence that validity. The results revealed substantial inconsistency between DSA and manual coding. The presence of certain textual features, such as negation, partially accounts for this inconsistency. The findings imply that scholars should be careful and critical about findings in disease-related public health research that relies on DSA, and that certain textual features should be addressed more carefully in DSA.
Keywords: ANEW; LIWC; SentiWordNet; infectious diseases; sentiment analysis; validity
Year: 2022 PMID: 35682341 PMCID: PMC9180278 DOI: 10.3390/ijerph19116759
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
Explanations and examples of textual features.
| Textual Features | Definitions | Reasonings and Examples | References |
|---|---|---|---|
| Embedded Hashtag | A hashtag that grammatically structures a sentence. | An embedded hashtag can threaten the validity of DSA because a hashtag structurally embedded in a tweet can be meaningful and widely used, yet generally cannot be captured by DSA. | N.A. |
| Irrealis | A function indicating that a certain situation or action is not known to have happened. | It is challenging to estimate the sentiment of a text containing irrealis accurately because irrealis can subtly change the meaning of sentiment-bearing words. Irrealis markers include modal verbs (e.g., would, could, would have), conditional markers (e.g., if), negative polarity items (e.g., any, anything), certain verbs (e.g., expect, doubt, assume), and questions. | [ |
| Sarcasm | A sarcastic statement is defined as one where the opposite meaning is intended. | Sarcasm completely shifts the orientation of sentiment by using the opposite meaning of words given a context. | [ |
| Negation | Negations are terms that reverse the sentiment of a certain word. | Negations change the orientation of a sentence from positive to negative or negative to positive (e.g., no, not, rather, never, none, nobody, no one, nothing, neither, nor, nowhere, without). | [ |
| Intensifier | Intensifiers are terms that intensify the degree of the expressed sentiment. | Intensifiers change the sentiment of a sentence by intensifying the strength of sentiment (e.g., very, really, extraordinarily, huge, total). | [ |
| Diminisher | Diminishers are terms that decrease the degree of the expressed sentiment. | Diminishers change the sentiment of a sentence by decreasing the strength of sentiment (e.g., slightly, somewhat, minor), as in "I’m a little worried about Ebola." | [ |
| Unconfirmed typo | A misspelled word. | A misspelled word may hold sentiments but is not generally capturable by DSA. | [ |
| Lengthened word | A lengthened word. | A lengthened word is difficult for DSA to capture because of its unstructured format, although it may carry stronger sentiment than the same word in its ordinary form. | [ |
| Irregularly capitalized word | A word that is capitalized in an uncommon way. | An irregularly capitalized word may contain stronger sentiment than a word in its ordinary format but is not generally capturable by DSA. | [ |
| Abbreviation | A shortened form of a word. | An abbreviation may contain a sentiment but is generally ignored by DSA. | [ |
| Acronym | A shortened form of a phrase that consists of the initials of each word. | An acronym may contain a sentiment but is generally ignored by DSA. | [ |
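The effect of these features on a dictionary lookup can be illustrated with a minimal sketch. The lexicon, the modifier lists, and both function names below are hypothetical and chosen only for illustration; they are not taken from LIWC, ANEW, or SentiWordNet, and the one-word look-back rule is an assumption rather than the method of any of those tools.

```python
import re

# Toy lexicon and modifier lists: illustrative assumptions only,
# not drawn from LIWC, ANEW, or SentiWordNet.
LEXICON = {"worried": -1.0, "scared": -1.0, "safe": 1.0, "good": 1.0}
NEGATIONS = {"no", "not", "never", "without"}
INTENSIFIERS = {"very": 1.5, "really": 1.5}
DIMINISHERS = {"slightly": 0.5, "somewhat": 0.5}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def naive_score(text):
    """Sum the lexicon scores of matched words, ignoring all context."""
    return sum(LEXICON.get(tok, 0.0) for tok in tokenize(text))


def context_aware_score(text):
    """Sum lexicon scores, letting the immediately preceding word modify each hit."""
    tokens = tokenize(text)
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        prev = tokens[i - 1] if i > 0 else ""
        if prev in NEGATIONS:
            score = -score  # negation flips the orientation
        elif prev in INTENSIFIERS:
            score *= INTENSIFIERS[prev]  # intensifier strengthens
        elif prev in DIMINISHERS:
            score *= DIMINISHERS[prev]  # diminisher weakens
        total += score
    return total


negated = "I am not worried about Ebola"
lengthened = "I am scaaared of Ebola"
```

In this sketch `naive_score(negated)` is negative while `context_aware_score(negated)` is positive, which mirrors how negation can reverse a dictionary-based classification. Both functions return 0.0 for `lengthened`, because a lengthened word does not match any dictionary entry.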
Validity evaluation in comparison with manual coding results.
| Measure | LIWC | ANEW | SWN | orgSWN | adSWN | Mean |
|---|---|---|---|---|---|---|
| F1: Neg | 0.34 | 0.20 | 0.35 | 0.24 | 0.31 | 0.29 |
| F1: Neu | 0.70 | 0.47 | 0.01 | 0.51 | 0.17 | 0.37 |
| F1: Pos | 0.30 | 0.13 | 0.19 | 0.15 | 0.13 | 0.18 |
| F1: Macro Average | 0.45 | 0.27 | 0.18 | 0.30 | 0.21 | 0.28 |
| Accuracy (%) | 56.84 | 32.87 | 19.22 | 37.46 | 21.25 | 33.53 |
| Number of tweets | 7421 | 7797 | 7790 | 7319 | 7175 | 7500 |
Note: The number of tweets differs among applications due to the exclusion of tweets classified as mixed sentiment.
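The F1, macro-average, and accuracy measures reported above can be computed from paired manual and DSA labels roughly as follows. This is a minimal sketch using the standard definitions; the `gold` and `pred` lists are made-up labels for illustration, not data from the study, and the function names are hypothetical.

```python
def per_class_f1(gold, pred, labels):
    """Return a dict mapping each label to its F1 score."""
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_average(scores):
    """Unweighted mean of the per-class scores."""
    return sum(scores.values()) / len(scores)


def accuracy(gold, pred):
    """Share of items where the two label sequences agree."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Hypothetical labels: gold = manual coding, pred = DSA output.
gold = ["neg", "neu", "pos", "neu", "neg", "pos"]
pred = ["neg", "neu", "neu", "neu", "pos", "pos"]
f1 = per_class_f1(gold, pred, ["neg", "neu", "pos"])

# Macro average recomputed from the rounded LIWC F1 values in the table.
liwc_macro = round(macro_average({"neg": 0.34, "neu": 0.70, "pos": 0.30}), 2)
```

Because the macro average weights each class equally, a tool that performs well only on the most frequent class can still score poorly on it. Recomputing from the rounded table values may differ from the reported figure in the last digit, since the published values were presumably averaged before rounding.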
Sentiment classification comparison among DSA.
| DSA | Class | LIWC Neg | LIWC Neu | LIWC Pos | LIWC Mix | ANEW Neg | ANEW Neu | ANEW Pos | ANEW Mix | SWN Neg | SWN Neu | SWN Pos | SWN Mix | orgSWN Neg | orgSWN Neu | orgSWN Pos | orgSWN Mix | Averaged Matched Cases |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LIWC | Neg | | | | | | | | | | | | | | | | | 41.72 |
| | Neu | | | | | | | | | | | | | | | | | |
| | Pos | | | | | | | | | | | | | | | | | |
| | Mix | | | | | | | | | | | | | | | | | |
| ANEW | Neg | 11.22 | 8.69 | 1.54 | 1.38 | | | | | | | | | | | | | 34.40 |
| | Neu | 6.80 | 21.67 | 6.10 | 1.24 | | | | | | | | | | | | | |
| | Pos | 8.62 | 20.66 | 9.85 | 2.21 | | | | | | | | | | | | | |
| | Mix | 0.00 | 0.01 | 0.00 | 0.01 | | | | | | | | | | | | | |
| SWN | Neg | 18.58 | 36.02 | 11.45 | 2.99 | 16.16 | 24.89 | 27.97 | 0.03 | | | | | | | | | 33.72 |
| | Neu | 0.08 | 0.22 | 0.08 | 0.01 | 0.08 | 0.17 | 0.14 | 0.00 | | | | | | | | | |
| | Pos | 7.98 | 14.71 | 5.95 | 1.83 | 6.59 | 10.71 | 13.17 | 0.00 | | | | | | | | | |
| | Mix | 0.00 | 0.09 | 0.01 | 0.01 | 0.01 | 0.05 | 0.05 | 0.00 | | | | | | | | | |
| orgSWN | Neg | 10.68 | 15.28 | 3.05 | 1.40 | 10.62 | 8.48 | 11.31 | 0.01 | 21.37 | 0.06 | 8.96 | 0.01 | | | | | 31.36 |
| | Neu | 9.08 | 20.82 | 4.62 | 1.26 | 6.04 | 16.25 | 13.48 | 0.01 | 24.73 | 0.19 | 10.76 | 0.09 | | | | | |
| | Pos | 5.28 | 11.92 | 8.51 | 1.94 | 4.80 | 8.83 | 14.03 | 0.00 | 18.78 | 0.12 | 8.74 | 0.01 | | | | | |
| | Mix | 1.59 | 3.00 | 1.31 | 0.26 | 1.38 | 2.26 | 2.51 | 0.00 | 4.14 | 0.01 | 2.00 | 0.00 | | | | | |
| adSWN | Neg | 13.51 | 22.49 | 6.82 | 2.33 | 11.19 | 16.98 | 16.99 | 0.00 | 31.18 | 0.18 | 13.73 | 0.06 | 21.37 | 0.06 | 8.96 | 0.01 | 30.99 |
| | Neu | 1.56 | 6.14 | 0.91 | 0.21 | 1.77 | 4.50 | 2.55 | 0.00 | 6.23 | 0.08 | 2.50 | 0.01 | 24.73 | 0.19 | 10.76 | 0.09 | |
| | Pos | 9.51 | 18.02 | 8.57 | 1.92 | 8.00 | 11.67 | 18.32 | 0.03 | 26.18 | 0.13 | 11.68 | 0.03 | 18.78 | 0.12 | 8.74 | 0.01 | |
| | Mix | 2.04 | 4.39 | 1.19 | 0.38 | 1.87 | 2.67 | 3.46 | 0.00 | 5.44 | 0.00 | 2.55 | 0.01 | 4.14 | 0.01 | 2.00 | 0.00 | |
| Average of Total Matched Cases | | | | | | | | | | | | | | | | | | 34.44 |
Note: Values are percentages. The diagonal of the matrix for each pair of DSAs gives the percentage of tweets that both tools classified identically. For instance, both LIWC and ANEW classified 11.22% of tweets as negative, 21.67% as neutral, 9.85% as positive, and 0.01% as mixed sentiment. The sum of these diagonal values is the proportion of matched cases between LIWC and ANEW (42.75%). The mean of the matched-case proportions between LIWC and each of the other DSAs is reported as the averaged matched cases (41.72%).
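The diagonal sum described in the note can be reproduced for the LIWC and ANEW pair as a short check. This is a sketch only; the matrix is copied from the table (rows are ANEW classes, columns are LIWC classes, in the order Neg, Neu, Pos, Mix), and the function name `matched_share` is a hypothetical helper.

```python
# Percentages from the table: rows = ANEW class, columns = LIWC class,
# both ordered Neg, Neu, Pos, Mix.
liwc_vs_anew = [
    [11.22, 8.69, 1.54, 1.38],
    [6.80, 21.67, 6.10, 1.24],
    [8.62, 20.66, 9.85, 2.21],
    [0.00, 0.01, 0.00, 0.01],
]


def matched_share(matrix):
    """Sum the diagonal, i.e. the cases where both tools assign the same class."""
    return sum(matrix[i][i] for i in range(len(matrix)))


matched = round(matched_share(liwc_vs_anew), 2)

# Sanity check: all sixteen cells of one pairwise matrix should cover 100% of tweets.
total = sum(sum(row) for row in liwc_vs_anew)

# Mean of the five "Averaged Matched Cases" values listed in the table.
averaged = [41.72, 34.40, 33.72, 31.36, 30.99]
overall = round(sum(averaged) / len(averaged), 2)
```

Here `matched` equals the 42.75% stated in the note, and `overall` agrees with the 34.44 shown in the last row, which suggests, as an assumption rather than something the note states, that the final row is the mean of the five per-tool averages.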
The results of binary logistic regression: influences of textual features on inconsistency.
Dependent variable (DV): Inconsistency, reported per DSA.
| Textual Features (IVs) | LIWC | ANEW | SWN | orgSWN | adSWN |
|---|---|---|---|---|---|
| Intercept | −0.42 *** | 0.73 *** | 1.83 *** | 0.49 *** | 1.66 *** |
| Embedded hashtags | −0.08 | −0.16 | −0.08 | 0.13 | −0.29 * |
| Irrealis | 0.47 * | 0.21 | −1.06 *** | −0.11 | −0.37 |
| Sarcasm | 1.09 | 0.94 | −0.32 | −0.34 | −1.69 * |
| Negations | 0.77 *** | 0.09 | −1.17 *** | 0.42 ** | −0.60 *** |
| Intensifiers | 0.21 | −0.08 | −0.30 | 0.35 * | −0.28 |
| Diminishers | 0.33 | −0.25 | −0.23 | 0.85 | −0.33 |
| Unconfirmed typos | 0.62 | 0.30 | −1.55 *** | 0.24 | −0.95 ** |
| Lengthened words | 0.95 | 0.93 | −0.72 | 0.13 | 0.69 |
| Irregularly capitalized words | 0.45 | 0.83 * | −1.08 *** | 0.29 | −0.60 * |
| Abbreviations | 0.46 * | 0.54 * | −0.78 ** | 0.19 | −0.38 |
| Acronyms | 1.00 * | 0.52 | −1.14 ** | 1.26 * | −0.78 |
Note: * p < 0.05, ** p < 0.01, *** p < 0.001; in the DV, consistent condition = 0 and inconsistent condition = 1; the numbers of tweets that include each textual feature are as follows: embedded hashtags (n = 429), irrealis (n = 429), sarcasm (n = 7), negation (n = 248), intensifiers (n = 260), diminishers (n = 11), unconfirmed typos (n = 35), lengthened words (n = 7), irregularly capitalized words (n = 71), abbreviations (n = 88), acronyms (n = 30); the total sample size is 1969.
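One way to read the coefficients is to convert them to odds ratios and predicted probabilities. The sketch below assumes the table reports unstandardized log-odds coefficients, which the source does not state explicitly, and that each feature is coded 0 or 1; the function names are hypothetical.

```python
import math


def odds_ratio(coef):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(coef)


def predicted_probability(intercept, coefs, features):
    """Logistic prediction: p = 1 / (1 + exp(-(intercept + sum of coef * x)))."""
    logit = intercept + sum(coefs[name] * features.get(name, 0) for name in coefs)
    return 1.0 / (1.0 + math.exp(-logit))


# LIWC column of the table (only two of the eleven features, for brevity).
liwc_intercept = -0.42
liwc_coefs = {"negations": 0.77, "irrealis": 0.47}

# Odds of inconsistency with vs. without a negation, other features held fixed.
negation_or = odds_ratio(liwc_coefs["negations"])

# Predicted probability of inconsistency with no features present,
# and with only a negation present.
p_baseline = predicted_probability(liwc_intercept, liwc_coefs, {})
p_negation = predicted_probability(liwc_intercept, liwc_coefs, {"negations": 1})
```

Under these assumptions, a negation roughly doubles the odds that LIWC disagrees with manual coding (odds ratio of about 2.16), and the predicted probability of inconsistency rises from about 0.40 to about 0.59 when a negation is the only feature present. Negative coefficients, as in the SWN column, correspond to odds ratios below 1.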