An unsupervised machine learning model for discovering latent infectious diseases using social media data.

Sunghoon Lim, Conrad S Tucker, Soundar Kumara.

Abstract

INTRODUCTION: The authors of this work propose an unsupervised machine learning model that has the ability to identify real-world latent infectious diseases by mining social media data. In this study, a latent infectious disease is defined as a communicable disease that has not yet been formalized by national public health institutes and explicitly communicated to the general public. Most existing approaches to modeling infectious-disease-related knowledge discovery through social media networks are top-down approaches that are based on already known information, such as the names of diseases and their symptoms. In existing top-down approaches, necessary but unknown information, such as disease names and symptoms, is mostly unidentified in social media data until national public health institutes have formalized that disease. Most of the formalizing processes for latent infectious diseases are time consuming. Therefore, this study presents a bottom-up approach for latent infectious disease discovery in a given location without prior information, such as disease names and related symptoms.
METHODS: Social media messages with user and temporal information are extracted during the data preprocessing stage. An unsupervised sentiment analysis model is then presented. Users' expressions about symptoms, body parts, and pain locations are also identified from social media data. Then, symptom weighting vectors for each individual and time period are created, based on their sentiment and social media expressions. Finally, latent-infectious-disease-related information is retrieved from individuals' symptom weighting vectors.
DATASETS AND RESULTS: Twitter data from August 2012 to May 2013 are used to validate this study. Real electronic medical records for 104 individuals, who were diagnosed with influenza in the same period, are used to serve as ground truth validation. The results are promising, with the highest precision, recall, and F1 score values of 0.773, 0.680, and 0.724, respectively.
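As a quick sanity check on the reported metrics, the sketch below computes F1 as the harmonic mean of precision and recall. The abstract reports the "highest" value of each metric and does not state that all three come from the same configuration, so this only shows that the three numbers are mutually consistent under that assumption. The function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1 definition: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
reported_precision = 0.773
reported_recall = 0.680

# Assumption: both values describe the same model configuration.
f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))
```

Under that assumption the computed value rounds to the reported F1 of 0.724.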
CONCLUSION: This work uses individuals' social media messages to identify latent infectious diseases, without prior information, quicker than when the disease(s) is formalized by national public health institutes. In particular, the unsupervised machine learning model using user, textual, and temporal information in social media data, along with sentiment analysis, identifies latent infectious diseases in a given location.
Copyright © 2016 Elsevier Inc. All rights reserved.
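To make the METHODS step about "symptom weighting vectors for each individual and time period" more concrete, here is a minimal toy sketch. It is not the authors' model: the abstract does not give the weighting formula, and the paper's approach works without a predefined symptom list, whereas this sketch uses small fixed lexicons purely for illustration. The names `EXPRESSION_TERMS`, `NEGATIVE_TERMS`, `negativity`, and `symptom_vectors`, the weekly time bucket, and the `1 + negativity` weight are all assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical lexicons for illustration only. The paper identifies symptom
# expressions without prior information, which this toy does not attempt.
EXPRESSION_TERMS = {"cough", "fever", "headache", "throat"}
NEGATIVE_TERMS = {"awful", "terrible", "sick", "worse", "hurts"}


def negativity(text: str) -> float:
    """Toy sentiment score: fraction of tokens that are negative terms."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(t in NEGATIVE_TERMS for t in tokens) / len(tokens)


def symptom_vectors(messages):
    """Map (user, ISO year, ISO week) -> {term: accumulated weight}.

    Each message is a (user, day, text) tuple. A matched term adds
    1 + negativity(text), an assumed weighting, to that user's weekly vector.
    """
    vectors = defaultdict(lambda: defaultdict(float))
    for user, day, text in messages:
        lowered = text.lower()
        score = negativity(text)
        iso_year, iso_week, _ = day.isocalendar()
        for term in EXPRESSION_TERMS:
            if term in lowered:
                vectors[(user, iso_year, iso_week)][term] += 1.0 + score
    return vectors


# Invented example messages, not data from the study.
messages = [
    ("u1", date(2012, 11, 5), "My throat hurts and this cough is awful"),
    ("u1", date(2012, 11, 6), "Fever all night"),
    ("u2", date(2012, 11, 5), "Great game today"),
]
vectors = symptom_vectors(messages)
for key, vec in vectors.items():
    print(key, dict(vec))
```

In this toy, only users who mention a lexicon term get a vector, and messages with more negative wording contribute larger weights. Retrieving disease-related information from such vectors, the final step described in METHODS, is not sketched here.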

Keywords:  Information retrieval; Latent infectious diseases; Sentiment analysis; Social media; Unsupervised machine learning

Year:  2016        PMID: 28034788     DOI: 10.1016/j.jbi.2016.12.007

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  18 in total

1.  A systematic literature review of machine learning in online personal health data.

Authors:  Zhijun Yin; Lina M Sulieman; Bradley A Malin
Journal:  J Am Med Inform Assoc       Date:  2019-06-01       Impact factor: 4.497

Review 2.  A scoping review of the use of Twitter for public health research.

Authors:  Oduwa Edo-Osagie; Beatriz De La Iglesia; Iain Lake; Obaghe Edeghere
Journal:  Comput Biol Med       Date:  2020-05-16       Impact factor: 4.589

3.  Power of big data to improve patient care in gastroenterology.

Authors:  Jamie Catlow; Benjamin Bray; Eva Morris; Matt Rutter
Journal:  Frontline Gastroenterol       Date:  2021-05-28

4.  An unsupervised machine learning method for discovering patient clusters based on genetic signatures.

Authors:  Christian Lopez; Scott Tucker; Tarik Salameh; Conrad Tucker
Journal:  J Biomed Inform       Date:  2018-07-29       Impact factor: 6.317

6.  Views on social media and its linkage to longitudinal data from two generations of a UK cohort study.

Authors:  Oliver S P Davis; Claire M A Haworth; Nina H Di Cara; Andy Boyd; Alastair R Tanner; Tarek Al Baghal; Lisa Calderwood; Luke S Sloan
Journal:  Wellcome Open Res       Date:  2020-08-12

Review 7.  Identifying Methods for Monitoring Foodborne Illness: Review of Existing Public Health Surveillance Techniques.

Authors:  Rachel A Oldroyd; Michelle A Morris; Mark Birkin
Journal:  JMIR Public Health Surveill       Date:  2018-06-06

8.  Over a decade of social opinion mining: a systematic review.

Authors:  Keith Cortis; Brian Davis
Journal:  Artif Intell Rev       Date:  2021-06-25       Impact factor: 8.139

Review 9.  Utility of Artificial Intelligence Amidst the COVID 19 Pandemic: A Review.

Authors:  Agam Bansal; Rana Prathap Padappayil; Chandan Garg; Anjali Singal; Mohak Gupta; Allan Klein
Journal:  J Med Syst       Date:  2020-08-01       Impact factor: 4.460

Review 10.  Sentiment Analysis in Health and Well-Being: Systematic Review.

Authors:  Anastazia Zunic; Padraig Corcoran; Irena Spasic
Journal:  JMIR Med Inform       Date:  2020-01-28