Automated Classification of Online Sources for Infectious Disease Occurrences Using Machine-Learning-Based Natural Language Processing Approaches.

Mira Kim, Kyunghee Chae, Seungwoo Lee, Hong-Jun Jang, Sukil Kim.

Abstract

Collecting valid information from electronic sources to detect potential infectious disease outbreaks is time-consuming and labor-intensive. Automated identification of relevant information using machine learning is needed to respond to a potential outbreak. A total of 2864 documents were collected from various websites and then manually categorized and labeled by two reviewers. Labels for the training and test data were assigned by reviewer consensus. Two machine learning algorithms, ConvNet and bidirectional long short-term memory (BiLSTM), and two classification methods, DocClass and SenClass, were used to classify the documents. Precision, recall, F1, accuracy, and area under the curve were measured to evaluate each model. With DocClass, ConvNet yielded higher average, minimum, and maximum accuracies (87.6%, 85.2%, and 91.1%, respectively) than BiLSTM; with SenClass, BiLSTM outperformed ConvNet, with average, minimum, and maximum accuracies of 92.8%, 92.6%, and 93.3%, respectively. BiLSTM with SenClass yielded an overall accuracy of 92.9% in classifying infectious disease occurrences. Machine learning performed comparably to a human expert given a particular text extraction system. This study suggests that machine learning analysis of website information can achieve high accuracy when abundant articles/documents are available.
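As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch of how precision, recall, F1, and accuracy can be computed for a binary relevant/not-relevant classifier. It is not the authors' code: the function name `binary_metrics`, the `Metrics` container, and the toy labels are assumptions made for this example, and area under the curve is omitted because it requires predicted scores rather than hard labels.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Metrics:
    """Compute precision, recall, F1, and accuracy from hard binary labels.

    Hypothetical helper for illustration; `positive` marks the label that
    stands for "document reports an infectious disease occurrence".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, f1, accuracy)


if __name__ == "__main__":
    # Toy labels, not data from the study.
    reviewer_labels = [1, 1, 1, 0, 0, 0, 1, 0]
    model_labels = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(reviewer_labels, model_labels))
```

In this sketch the reviewer consensus labels play the role of `y_true` and the model output plays the role of `y_pred`; per-run accuracies could then be summarized as an average, minimum, and maximum, as the abstract reports.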

Keywords:  classification; infectious disease; machine learning; online document; public health surveillance

Year:  2020        PMID: 33348764      PMCID: PMC7766498          DOI: 10.3390/ijerph17249467

Source DB:  PubMed          Journal:  Int J Environ Res Public Health        ISSN: 1660-4601            Impact factor:   3.390


