Michele Miller1, Tanvi Banerjee2,3, Roopteja Muppalla2,3, William Romine1, Amit Sheth2,3.
Abstract
BACKGROUND: To harness what people are tweeting about Zika, a computational framework is needed that uses machine learning techniques to recognize relevant Zika tweets and then sort them into disease-specific categories, addressing societal concerns about the prevention, transmission, symptoms, and treatment of Zika virus.
Keywords: epidemiology; machine learning; social media; viruses
Year: 2017 PMID: 28630032 PMCID: PMC5495967 DOI: 10.2196/publichealth.7157
Source DB: PubMed Journal: JMIR Public Health Surveill ISSN: 2369-2960
Figure 1. Block diagram of the pragmatic function-oriented content retrieval using a hierarchical supervised classification technique, followed by deeper analysis for characteristics of disease content.
Figure 2. Polarity and proportion of tweets by gender category.
Figure 3. Number of tweets in each disease category after classifying all tweets (1.2 million tweets) using the best classification model, multinomial Naive Bayes (discussed in the Classification and Performance Using 10-fold Cross-Validation section).
Figure 4. Number of tweets from the labeled dataset for each of the 4 categories of disease characteristics.
Performance of different classifiers for detecting relevant tweets: decision tree (J48), multinomial Naive Bayes (MNB), Bayesian network (Bayes Net), support vector machine (SVM) trained with sequential minimal optimization (SMO), and bagging (bootstrap aggregating; Bagging).
| Classifier | TPa | FPb | Precision | Recall | F1 score | AUCc |
| J48 | 0.821 | 0.390 | 0.812 | 0.821 | 0.815 | 0.784 |
| MNB | 0.880 | 0.368 | 0.881 | 0.880 | 0.868 | 0.943 |
| Bayes Net | 0.832 | 0.479 | 0.821 | 0.832 | 0.812 | 0.837 |
| SMO | 0.895 | 0.252 | 0.892 | 0.895 | 0.892 | 0.822 |
| Bagging | 0.857 | 0.411 | 0.852 | 0.857 | 0.843 | 0.877 |
aTP: true positive.
bFP: false positive.
cAUC: area under the curve.
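As a reading aid for the columns above, here is a minimal Python sketch of how these metrics follow from confusion-matrix counts for a binary relevant/irrelevant task. The counts are hypothetical and are not taken from the study; the `BinaryCounts` and `metrics` names are illustrative only. Note that the true positive rate and recall are the same quantity, which is why the TP and Recall columns agree in every row.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    tp: int  # relevant tweets predicted relevant
    fp: int  # irrelevant tweets predicted relevant
    fn: int  # relevant tweets predicted irrelevant
    tn: int  # irrelevant tweets predicted irrelevant


def metrics(c: BinaryCounts) -> dict:
    """Compute the rate-style metrics reported in the table from raw counts."""
    tp_rate = c.tp / (c.tp + c.fn)      # true positive rate, identical to recall
    fp_rate = c.fp / (c.fp + c.tn)      # false positive rate
    precision = c.tp / (c.tp + c.fp)
    f1 = 2 * precision * tp_rate / (precision + tp_rate)
    return {"tp_rate": tp_rate, "fp_rate": fp_rate,
            "precision": precision, "recall": tp_rate, "f1": f1}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = metrics(BinaryCounts(tp=80, fp=10, fn=20, tn=90))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

When a tool reports one row per classifier for a multi-class problem, the figures are usually averages of such per-class values; whether that applies here is not stated in this record.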
Performance of different classifiers for detecting the 4 disease categories within the relevant tweets: decision tree (J48), multinomial Naive Bayes (MNB), Bayesian network (Bayes Net), support vector machine (SVM) trained with sequential minimal optimization (SMO), and bagging (bootstrap aggregating; Bagging).
| Classifier | TPa | FPb | Precision | Recall | F1 score | AUCc |
| J48 | 0.694 | 0.122 | 0.702 | 0.694 | 0.695 | 0.838 |
| MNB | 0.784 | 0.084 | 0.787 | 0.784 | 0.785 | 0.940 |
| Bayes Net | 0.697 | 0.121 | 0.729 | 0.697 | 0.702 | 0.885 |
| SMO | 0.775 | 0.088 | 0.780 | 0.775 | 0.777 | 0.877 |
| Bagging | 0.727 | 0.112 | 0.741 | 0.727 | 0.730 | 0.901 |
aTP: true positive.
bFP: false positive.
cAUC: area under the curve.
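The two tables above describe a two-stage setup: a relevance classifier first, then a 4-way disease-category classifier applied only to relevant tweets. The sketch below illustrates that hierarchy with a small multinomial Naive Bayes written in plain Python. It is not the authors' implementation: the whitespace tokenizer, the unigram features, the Laplace smoothing value, the `TinyMultinomialNB` and `classify_tweet` names, and the toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Minimal multinomial Naive Bayes over whitespace-split unigrams."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(n / n_docs)  # log prior
            for t in tokens:
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not the study's labeled set).
relevance_model = TinyMultinomialNB().fit(
    ["zika virus spreads by mosquito bites", "zika vaccine test approved",
     "zika causes microcephaly in babies", "great pizza for lunch today",
     "my team won the game tonight"],
    ["relevant", "relevant", "relevant", "irrelevant", "irrelevant"],
)
category_model = TinyMultinomialNB().fit(
    ["protect yourself from mosquito bites", "mosquito spreads zika to other regions",
     "zika causes microcephaly in babies", "no vaccine or medicine for zika yet"],
    ["prevention", "transmission", "symptoms", "treatment"],
)


def classify_tweet(text):
    """Stage 1 filters for relevance; stage 2 assigns a disease category."""
    if relevance_model.predict(text) != "relevant":
        return None
    return category_model.predict(text)


print(classify_tweet("zika vaccine test results"))
print(classify_tweet("pizza for lunch"))
```

Keeping the two stages as separate models means the category classifier is trained and evaluated only on relevant tweets, which matches how the second table is described.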
Precision, recall, and F-measure for each of the 4 disease characteristics.
| Metric | Symptoms | Treatment | Transmission | Prevention | Average |
| Precision | 0.98 | 0.97 | 0.86 | 0.94 | 0.94 |
| Recall | 0.81 | 0.97 | 0.88 | 0.83 | 0.87 |
| F1 score | 0.89 | 0.97 | 0.87 | 0.88 | 0.90 |
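The F1 row can be cross-checked against the precision and recall rows, since F1 is their harmonic mean. The short Python check below uses only the numbers in the table above; the tolerance of 0.005 is an assumption that accounts for the table being rounded to two decimals.

```python
# (precision, recall, reported F1) per category, copied from the table above.
table = {
    "Symptoms": (0.98, 0.81, 0.89),
    "Treatment": (0.97, 0.97, 0.97),
    "Transmission": (0.86, 0.88, 0.87),
    "Prevention": (0.94, 0.83, 0.88),
}
reported_average = (0.94, 0.87, 0.90)


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Difference between the recomputed and the reported F1 for each category.
f1_gaps = {name: abs(f1(p, r) - reported) for name, (p, r, reported) in table.items()}

# Simple (unweighted) mean of each row across the 4 categories.
row_means = [sum(values[i] for values in table.values()) / len(table) for i in range(3)]
mean_gaps = [abs(m - rep) for m, rep in zip(row_means, reported_average)]

for name, gap in f1_gaps.items():
    print(f"{name}: F1 gap {gap:.4f}")
print("row means:", [round(m, 4) for m in row_means])
```

Within rounding, the reported F1 values match the harmonic means, and the Average column matches an unweighted mean over the 4 categories.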
Figure 5. Prevention, symptoms, transmission, and treatment perplexity measure plots.
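For readers unfamiliar with the measure, perplexity is the exponential of the negative average log-likelihood per token, and lower values indicate a better fit. The Python sketch below shows the calculation for a generic topic mixture; the two toy topics, the document-topic weights, and the function names are hypothetical and do not come from the study, and the sketch assumes every scored word has non-zero probability.

```python
import math


def token_probability(word, doc_topics, topic_words):
    """p(word | doc) = sum over topics k of p(k | doc) * p(word | k)."""
    return sum(weight * topic_words[k].get(word, 0.0)
               for k, weight in enumerate(doc_topics))


def perplexity(token_probs):
    """exp of the negative mean log-probability over all scored tokens."""
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / len(token_probs))


# Hypothetical model: two topics, each spreading its mass over two words.
topic_words = [
    {"mosquito": 0.5, "bite": 0.5},
    {"vaccine": 0.5, "test": 0.5},
]
doc_topics = [0.5, 0.5]  # assumed topic mixture for one held-out tweet
held_out = ["mosquito", "vaccine", "test", "bite"]

probs = [token_probability(w, doc_topics, topic_words) for w in held_out]
print(perplexity(probs))
```

In this toy case every word has probability 0.25, so the perplexity equals 4, the size of the effective vocabulary. Plots of perplexity against the number of topics are commonly used to choose how many topics to fit.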
Prevention, transmission, and treatment topic modeling results.
| Disease characteristic | Topic | Sample tweets for each topic |
| Prevention | (#1) Control | RTa @DrFriedenCDC: A2. The best way to prevent #Zika & other diseases spread by mosquitoes is to protect yourself from mosquito bites. #Reut |
| | (#2) Money need | #healthy Congress has not yet acted on Obama’s $2 billion in emergency funding for Zika, submitted in February |
| | (#3) Prevention | RT @bmj_latest: Couples at risk from exposure to Zika virus should consider delaying pregnancy, says @CDCgov |
| | (#4) Bill | https://t.co/Ke12LOdypf Senate Approves $1.1 Billion In Funding To Fight The Zika Virus #NYCnowApp |
| | (#5) Research | Florida is among those at greatest risk for Zika. @FLGovScott’s sweeping abortion bill blocks scientists’ access to conduct research |
| Transmission | (#1) Vectors (mosquitoes) | This map shows the Northeast is at risk for Zika mosquitoes this summer |
| | (#2) Sexual | @user1 First Sexually Transmitted Case Of Zika Virus In U.S. Confirmed |
| | (#3) Infants | CDCb reports 157 cases of U.S. pregnant women infected with Zika virus. |
| | (#4) Spread | Zika strain from Americas outbreak spreads in Africa for first time: WHOc (Update) |
| | (#5) Sports | MLBd moves games from Puerto Rico due to Zika concerns....uh..what about the Olympics?? Can’t be good. |
| Treatment | (#1) Lack of treatment | RT @DrFriedenCDC: Much is still unknown about #Zika and there is no current medicine for treatment or vaccine to prevent the virus. |
| | (#2) Zika test | Rapid Zika Test Is Introduced by Researchers The test, done with a piece of paper that changes color if the virus... |
| | (#3) Vaccine development | Researchers discover structure of Zika virus, a key discovery in development of antiviral treatments and vaccines |
| | (#4) Blood test | Experimental blood test for Zika screening approved |
| | (#5) Test development | New mouse model leads way for #Zika drug, vaccine tests |
aRT: retweet.
bCDC: Centers for Disease Control and Prevention.
cWHO: World Health Organization.
dMLB: Major League Baseball.
Symptoms topic modeling results.
| Topic | Words | Tweets |
| (#1) Zika effects | infect, babies, mosquito, cause, microcephaly, symptom, pregnancy | RTa @USATODAYhealth: Zika affects babies even in later stages of pregnancy. Microcephaly seen in babies from moms infected in 6th month |
| (#2) Brain defects | brain, link, studies, microcephaly, baby, disorder, cause, damage, infect, fetal | Zika Virus May Cause Microcephaly by Hijacking Human Immune Molecule: Fetal brain model provides first clues on how Z... |
| (#3) Confirmed defects | defect, cause, birth, confirm, health, severe, link, official | Enough conspiracy theories; nature is nasty enough: U.S. health officials confirm Zika cause of severe birth defects |
| (#4) Scarier than thought | scarier, than, thought, us, official, health, CDCb, warn, learn, first | #breakingnews Zika Virus “Scarier Than We First Thought,” Warn US Health Officials |
| (#5) Initial reports | first, report, death, case, puerto, confirm, rico, cause, colombia, defect | Colombia Reports First Cases of Microcephaly Linked to Zika Virus—Sun Jan 09 15:13:20 EST |
aRT: retweet.
bCDC: Centers for Disease Control and Prevention.
Figure 6. A 2-dimensional principal components plot of topics discussed pertaining to Zika symptoms.