Ramakanth Kavuluru (1), A K M Sabbir (2)
1. Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, 230E MDS Building, 725 Rose Street, Lexington, KY 40536, USA; Department of Computer Science, University of Kentucky, David Marksbury Building, 329 Rose Street, Lexington, KY 40506, USA. Electronic address: ramakanth.kavuluru@uky.edu.
2. Department of Computer Science, University of Kentucky, David Marksbury Building, 329 Rose Street, Lexington, KY 40506, USA. Electronic address: akm.sabbir@uky.edu.
Abstract
BACKGROUND: Electronic cigarettes (e-cigarettes or e-cigs) are a popular emerging tobacco product. Because e-cigs do not generate the toxic combustion products that result from smoking regular cigarettes, they are sometimes perceived and promoted as a less harmful alternative to smoking and as a means to quit smoking. However, the safety of e-cigs and their efficacy in supporting smoking cessation are yet to be determined. Importantly, the Food and Drug Administration (FDA) currently does not regulate e-cigs, and as such their manufacturing, marketing, and sale are not subject to the rules that apply to traditional cigarettes. A number of manufacturers, advocates, and e-cig users are actively promoting e-cigs on Twitter.
OBJECTIVE: We develop a high-accuracy supervised predictive model to automatically identify e-cig "proponents" on Twitter and analyze how their tweeting behavior varies quantitatively along popular themes when compared with other Twitter users (or tweeters).
METHODS: Using a dataset of 1000 Twitter profiles independently annotated by two annotators, we employed a variety of textual features from the latest tweet content and the tweeter's profile biography to build predictive models that automatically identify proponent tweeters. We used a set of manually curated key phrases to analyze e-cig proponent tweets from a corpus of over one million e-cig tweets along well-known e-cig themes and compared the results with those generated by regular tweeters.
RESULTS: Our model identifies e-cig proponents with 97% precision, 86% recall, 91% F-score, and 96% overall accuracy, with tight 95% confidence intervals. We find that regular tweeters form over 90% of the dataset; e-cig proponents are a much smaller subset but tweet two to five times more than regular tweeters. Proponents also disproportionately (one to two orders of magnitude more) highlight e-cig flavors, their smoke-free and potential harm reduction aspects, and their claimed use in smoking cessation.
CONCLUSIONS: Given that the FDA is currently in the process of proposing meaningful regulation, we believe our work demonstrates the strong potential of informatics approaches, specifically machine learning, for automated e-cig surveillance on Twitter.
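As a quick consistency check on the reported figures, the sketch below recomputes the F-score from the stated precision and recall. It assumes the abstract's "F-score" is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not say explicitly; the function name `f_score` and its `beta` parameter are illustrative and not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded percentages as reported in the abstract.
reported_precision = 0.97
reported_recall = 0.86

# Assumption: the reported F-score is the balanced F1.
f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # about 0.912, i.e. 91% after rounding
```

Under that assumption, 97% precision and 86% recall give an F1 of roughly 0.912, which agrees with the reported 91%.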