An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics.

Manabu Torii, Lanlan Yin, Thang Nguyen, Chand T Mazumdar, Hongfang Liu, David M Hartley, Noele P Nelson.

Abstract

PURPOSE: Early detection of infectious disease outbreaks is crucial to protecting the public health of a society. Online news articles provide timely information on disease outbreaks worldwide. In this study, we investigated automated detection of articles relevant to disease outbreaks using machine learning classifiers. In a real-life setting, it is expensive to prepare a training data set for classifiers, which usually consists of manually labeled relevant and irrelevant articles. To mitigate this challenge, we examined the use of randomly sampled unlabeled articles as well as labeled relevant articles.
METHODS: Naïve Bayes and Support Vector Machine (SVM) classifiers were trained on 149 relevant and 149 or more randomly sampled unlabeled articles. Diverse classifiers were trained by varying the number of sampled unlabeled articles and also the number of word features. The trained classifiers were applied to 15 thousand articles published over 15 days. Top-ranked articles from each classifier were pooled and the resulting set of 1337 articles was reviewed by an expert analyst to evaluate the classifiers.
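The training setup described in METHODS can be illustrated with a minimal sketch. It assumes a plain multinomial naïve Bayes with Laplace smoothing and whitespace tokenization, and it treats the randomly sampled unlabeled articles as a stand-in for the irrelevant class. The function names (`tokenize`, `train_nb`, `score`) and the toy articles are hypothetical and are not taken from the study; the authors' actual features and implementation are not given in this abstract.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens serve as word features.
    return text.lower().split()


def train_nb(relevant_docs, unlabeled_docs, alpha=1.0):
    """Train a multinomial naive Bayes model.

    Class 1 holds the labeled relevant articles; class 0 holds the randomly
    sampled unlabeled articles, used here as approximate negatives.
    """
    docs_by_label = {1: relevant_docs, 0: unlabeled_docs}
    counts = {label: Counter() for label in docs_by_label}
    for label, docs in docs_by_label.items():
        for doc in docs:
            counts[label].update(tokenize(doc))

    vocab = set(counts[1]) | set(counts[0])
    n_docs = len(relevant_docs) + len(unlabeled_docs)
    model = {"vocab": vocab, "log_prior": {}, "log_like": {}}
    for label, docs in docs_by_label.items():
        model["log_prior"][label] = math.log(len(docs) / n_docs)
        total = sum(counts[label].values()) + alpha * len(vocab)
        model["log_like"][label] = {
            w: math.log((counts[label][w] + alpha) / total) for w in vocab
        }
    return model


def score(model, text):
    """Return the log-odds of class 1 over class 0; higher ranks first."""
    s = model["log_prior"][1] - model["log_prior"][0]
    for w in tokenize(text):
        if w in model["vocab"]:  # words unseen in training are ignored
            s += model["log_like"][1][w] - model["log_like"][0][w]
    return s


# Hypothetical toy data, for illustration only.
relevant = [
    "cholera outbreak reported in the northern province",
    "officials confirm new avian influenza cases",
]
unlabeled = [
    "stock markets rallied on friday",
    "local team wins the championship final",
]
model = train_nb(relevant, unlabeled)

new_articles = [
    "measles outbreak confirmed by health officials",
    "championship final draws record crowd",
]
ranked = sorted(new_articles, key=lambda t: score(model, t), reverse=True)
```

Sorting by `score` gives the kind of ranked list from which top-ranked articles could be pooled for expert review. Because some sampled unlabeled articles may in fact be relevant, the class 0 statistics are noisy, which is the trade-off this framework accepts in exchange for lower labeling cost.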
RESULTS: Daily averages of areas under ROC curves (AUCs) over the 15-day evaluation period were 0.841 and 0.836 for the naïve Bayes and SVM classifiers, respectively. We referenced a database of disease outbreak reports to confirm that the evaluation data set produced by the pooling method did cover the incidents recorded in the database during the evaluation period.
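As a rough illustration of the reported metric, the sketch below computes an AUC for each day and then averages across days. It assumes the rank-based (Mann-Whitney) formulation of AUC with ties counted as one half, and it assumes every scored article has an expert judgment. The names `auc` and `daily_average_auc` and the toy scores are hypothetical; the abstract does not say how the authors computed AUC or handled unjudged articles.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a relevant article
    outscores an irrelevant one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one relevant and one irrelevant article")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def daily_average_auc(days):
    """Average the per-day AUCs; `days` is a list of (scores, labels) pairs."""
    per_day = [auc(scores, labels) for scores, labels in days]
    return sum(per_day) / len(per_day)


# Hypothetical toy data: classifier scores and expert labels for two days.
days = [
    ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]),  # AUC = 3/4
    ([0.7, 0.6, 0.2], [1, 1, 0]),          # AUC = 2/2
]
average = daily_average_auc(days)  # (0.75 + 1.0) / 2 = 0.875
```

Averaging per-day values, rather than pooling all days into one ROC curve, weights each day equally regardless of how many articles were published that day.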
CONCLUSIONS: The proposed text classification framework utilizing randomly sampled unlabeled articles can facilitate a cost-effective approach to training machine learning classifiers in a real-life Internet-based biosurveillance project. We plan to examine this framework further using larger data sets and using articles in non-English languages.

Year: 2010
PMID: 21134784
PMCID: PMC3904285
DOI: 10.1016/j.ijmedinf.2010.10.015
Source DB: PubMed
Journal: Int J Med Inform
ISSN: 1386-5056
Impact factor: 4.046

13 in total

1. International society for disease surveillance conference 2011: building the future of public health surveillance.

Authors: Daniel B Neill; Karl A Soetebier
Journal: Emerg Health Threats J Date: 2011-12-06

Review 2. Uncovering text mining: a survey of current work on web-based epidemic intelligence.

Authors: Nigel Collier
Journal: Glob Public Health Date: 2012-07-11

Review 3. A review of evaluations of electronic event-based biosurveillance systems.

Authors: Kimberly N Gajewski; Amy E Peterson; Rohit A Chitale; Julie A Pavlin; Kevin L Russell; Jean-Paul Chretien
Journal: PLoS One Date: 2014-10-20 Impact factor: 3.240

4. Coughing, sneezing, and aching online: Twitter and the volume of influenza-like illness in a pediatric hospital.

Authors: David M Hartley; Courtney M Giannini; Stephanie Wilson; Ophir Frieder; Peter A Margolis; Uma R Kotagal; Denise L White; Beverly L Connelly; Derek S Wheeler; Dawit G Tadesse; Maurizio Macaluso
Journal: PLoS One Date: 2017-07-28 Impact factor: 3.240

5. Web monitoring of emerging animal infectious diseases integrated in the French Animal Health Epidemic Intelligence System.

Authors: Elena Arsevska; Sarah Valentin; Julien Rabatel; Jocelyn de Goër de Hervé; Sylvain Falala; Renaud Lancelot; Mathieu Roche
Journal: PLoS One Date: 2018-08-03 Impact factor: 3.240

Review 6. Global mapping of infectious disease.

Authors: Simon I Hay; Katherine E Battle; David M Pigott; David L Smith; Catherine L Moyes; Samir Bhatt; John S Brownstein; Nigel Collier; Monica F Myers; Dylan B George; Peter W Gething
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2013-02-04 Impact factor: 6.237

7. Evaluation of epidemic intelligence systems integrated in the early alerting and reporting project for the detection of A/H5N1 influenza events.

Authors: Philippe Barboza; Laetitia Vaillant; Abla Mawudeku; Noele P Nelson; David M Hartley; Lawrence C Madoff; Jens P Linge; Nigel Collier; John S Brownstein; Roman Yangarber; Pascal Astagneau
Journal: PLoS One Date: 2013-03-05 Impact factor: 3.240