Eric M Clark, Chris A Jones, Jake Ryland Williams, Allison N Kurti, Mitchell Craig Norotsky, Christopher M Danforth, Peter Sheridan Dodds.
Abstract
BACKGROUND: Twitter has become the "wild-west" of marketing and promotional strategies for advertisement agencies. Electronic cigarettes have been heavily marketed across Twitter feeds, offering discounts, "kid-friendly" flavors, algorithmically generated false testimonials, and free samples.
Year: 2016 PMID: 27410031 PMCID: PMC4943591 DOI: 10.1371/journal.pone.0157304
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Electronic Cigarette Tweet Category Counts and Twitter Account Classification.
| Year | Tweets: Total | Tweets: Automated | Tweets: Organic | Tweets: Discarded | Accounts: Automated | Accounts: Organic | Accounts: N/A* |
|---|---|---|---|---|---|---|---|
| 2012 | 107,918 | 85,546 | 13,492 | 8,880 | 12,715 | 12,052 | 19,512 |
| 2013 | 426,306 | 339,111 | 76,037 | 11,158 | 64,874 | 59,376 | 120,142 |
| 2014 | 316,424 | 234,972 | 68,698 | 12,754 | 54,033 | 63,289 | 48,528 |
*Accounts with fewer than 25 tweets were not classified.
Fig 1. Tweets from a random sample of 500 organic-classified and 500 automated-classified accounts were hand-coded to gauge the accuracy of the detection algorithm.
The feature set of each sampled individual is plotted in three dimensions. The traced box indicates the organic feature cutoff. True positives (red) are correctly identified automatons, true negatives (green) are correctly identified humans, false negatives (blue) are automatons classified as humans, and false positives (orange) are humans classified as automatons.
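The four outcome labels in this caption correspond to a standard confusion matrix with "automated" as the positive class. A minimal Python sketch of how accuracy could be tallied from hand-coded labels follows; the toy labels and the helper names `confusion_counts` and `accuracy` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def confusion_counts(predicted, actual, positive="automated"):
    """Tally TP/TN/FP/FN, treating `positive` as the positive class."""
    counts = Counter()
    for pred, truth in zip(predicted, actual):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts

def accuracy(counts):
    """Fraction of samples whose predicted label matches the hand-coded label."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0

# Hypothetical toy labels, for illustration only (not data from the study).
predicted = ["automated", "automated", "organic", "organic", "automated"]
actual = ["automated", "organic", "organic", "automated", "automated"]

counts = confusion_counts(predicted, actual)
print(dict(counts), accuracy(counts))
```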
Fig 2. Left: Binned User E-cigarette Keyword Tweet Distribution (2012-2014). Right: 2013 Automated Tweet Rank-Frequency Word Cloud. High-frequency stop words (‘of’, ‘the’, etc.) are removed from the rank-frequency word distribution.
Automated Tweet Subcategory Counts.
| Subcategory | Count | Percentage | Impressions | Relevance* | Year |
|---|---|---|---|---|---|
| | 53,471 | 62.51% | 59.74M | 88.4% | 2012 |
| | 283,677 | 83.65% | 195.25M | | 2013 |
| | 149,333 | 63.55% | 951.03M | | 2014 |
| | 6,392 | 7.47% | 8.59M | 90.8% | 2012 |
| | 6,599 | 1.95% | 25.64M | | 2013 |
| | 8,386 | 3.57% | 42.72M | | 2014 |
| | 26,596 | 31.09% | 27.02M | 89.8% | 2012 |
| | 112,720 | 33.24% | 38.21M | | 2013 |
| | 37,735 | 16.06% | 160.49M | | 2014 |
| | 1,685 | 1.97% | 2.24M | 81% | 2012 |
| | 2,715 | 0.80% | 4.79M | | 2013 |
| | 6,133 | 2.61% | 17.51M | | 2014 |
*Percentage of 500 randomly sampled tweets that were relevant.
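The Percentage column appears to be each subcategory count divided by that year's automated tweet total from the category-count table. This is an inference from the numbers rather than something stated here. A short Python check, using the first group of rows (the names `automated_totals` and `share_of_automated` are illustrative):

```python
# Automated tweet totals per year, from the category-count table.
automated_totals = {2012: 85_546, 2013: 339_111, 2014: 234_972}

# Counts from the first three rows of the subcategory table
# (the subcategory name itself is not shown in this record).
first_group_counts = {2012: 53_471, 2013: 283_677, 2014: 149_333}

def share_of_automated(count, year):
    """Return `count` as a percentage of that year's automated tweets."""
    return round(100 * count / automated_totals[year], 2)

for year, count in first_group_counts.items():
    print(year, share_of_automated(count, year))
```

Within a single year the listed percentages sum to more than 100%, which suggests that a tweet may be counted in more than one subcategory.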
Fig 3. Categorical Tweet Word-shift Graphs: On the left, organic tweets from 2013 serve as the reference distribution against which the sentiment of organic tweets from 2014 is compared; the calculated average word happiness shifts in the negative direction.
Due to tweets tagged #EUEcigBan, January 2014 and December 2013 are omitted. The computed average happiness (h_avg) decreases from 5.82 to 5.77 due to both an increase in the negative words ‘tobacco’, ‘drug’, ‘ban’, ‘poison’, and a decrease in the positive words ‘love’, ‘like’, ‘haha’, ‘cool’, among others. On the right, organic tweets from 2013 are the reference distribution against which automated tweets from 2013 are compared. The words ‘free’ and ‘trial’ are excluded from the graph, since their high frequency and happiness scores distort the image. With these keywords included, the automated tweet h_avg increases from 6.17 to 6.59.
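Average happiness scores of this kind are typically computed as a frequency-weighted mean of per-word happiness ratings, with unscored or excluded words left out. The Python sketch below assumes that form. The word scores and counts are made-up placeholders, not values from the paper or from any published lexicon, and `average_happiness` is a hypothetical helper.

```python
def average_happiness(word_counts, word_scores, exclude=()):
    """Frequency-weighted mean happiness over words that have a score."""
    total_weight = 0
    weighted_sum = 0.0
    for word, count in word_counts.items():
        if word in exclude or word not in word_scores:
            continue  # skip excluded words and words without a rating
        weighted_sum += word_scores[word] * count
        total_weight += count
    if total_weight == 0:
        raise ValueError("no scored words left after exclusions")
    return weighted_sum / total_weight

# Placeholder scores on a 1-9 scale and placeholder counts (illustrative only).
scores = {"free": 8.0, "trial": 6.0, "love": 8.5, "ban": 3.0}
counts = {"free": 50, "trial": 30, "love": 10, "ban": 10}

with_keywords = average_happiness(counts, scores)
without_keywords = average_happiness(counts, scores, exclude={"free", "trial"})
print(with_keywords, without_keywords)
```

In this toy example, dropping the frequent, highly rated words ‘free’ and ‘trial’ lowers the average, which matches the direction of the 6.59 versus 6.17 difference reported in the caption.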