
Data and Model Biases in Social Media Analyses: A Case Study of COVID-19 Tweets.

Yunpeng Zhao, Pengfei Yin, Yongqiu Li, Xing He, Jingcheng Du, Cui Tao, Yi Guo, Mattia Prosperi, Pierangelo Veltri, Xi Yang, Yonghui Wu, Jiang Bian.

Abstract

During the coronavirus disease 2019 (COVID-19) pandemic, social media platforms such as Twitter have become a venue for individuals, health professionals, and government agencies to share COVID-19 information. Twitter has been a popular source of data for researchers, especially for public health studies. However, the use of Twitter data for research also has drawbacks and barriers. Biases appear everywhere, from data collection methods to modeling approaches, and those biases have not been systematically assessed. In this study, we examined six different data collection methods and three different machine learning (ML) models (all commonly used in social media analysis) to assess data collection bias and measure ML models' sensitivity to data collection bias. We showed that (1) publicly available Twitter data collection endpoints with appropriate strategies can collect data that is reasonably representative of the Twitter universe; and (2) careful examinations of ML models' sensitivity to data collection bias are critical. ©2021 AMIA - All rights reserved.

Year:  2022        PMID: 35308985      PMCID: PMC8861742     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


  8 in total

1.  Mining Twitter to assess the determinants of health behavior toward human papillomavirus vaccination in the United States.

Authors:  Hansi Zhang; Christopher Wheldon; Adam G Dunn; Cui Tao; Jinhai Huo; Rui Zhang; Mattia Prosperi; Yi Guo; Jiang Bian
Journal:  J Am Med Inform Assoc       Date:  2020-02-01       Impact factor: 4.497

2.  Garbage in, Garbage Out: Data Collection, Quality Assessment and Reporting Standards for Social Media Data Use in Health Research, Infodemiology and Digital Disease Detection.

Authors:  Yoonsang Kim; Jidong Huang; Sherry Emery
Journal:  J Med Internet Res       Date:  2016-02-26       Impact factor: 5.428

3.  Using Twitter to Measure Public Discussion of Diseases: A Case Study.

Authors:  Christopher Weeg; H Andrew Schwartz; Shawndra Hill; Raina M Merchant; Catalina Arango; Lyle Ungar
Journal:  JMIR Public Health Surveill       Date:  2015-06-26

4.  Using Social Media Data to Understand the Impact of Promotional Information on Laypeople's Discussions: A Case Study of Lynch Syndrome.

Authors:  Jiang Bian; Yunpeng Zhao; Ramzi G Salloum; Yi Guo; Mo Wang; Mattia Prosperi; Hansi Zhang; Xinsong Du; Laura J Ramirez-Diaz; Zhe He; Yuan Sun
Journal:  J Med Internet Res       Date:  2017-12-13       Impact factor: 5.428

5.  User's guide to correlation coefficients. (Review)

Authors:  Haldun Akoglu
Journal:  Turk J Emerg Med       Date:  2018-08-07

6.  Machine Learning to Detect Self-Reporting of Symptoms, Testing Access, and Recovery Associated With COVID-19 on Twitter: Retrospective Big Data Infoveillance Study.

Authors:  Tim Mackey; Vidya Purushothaman; Jiawei Li; Neal Shah; Matthew Nali; Cortni Bardier; Bryan Liang; Mingxiang Cai; Raphael Cuomo
Journal:  JMIR Public Health Surveill       Date:  2020-06-08

7.  Twitter Discussions and Emotions About the COVID-19 Pandemic: Machine Learning Approach.

Authors:  Jia Xue; Junxiang Chen; Ran Hu; Chen Chen; Chengda Zheng; Yue Su; Tingshao Zhu
Journal:  J Med Internet Res       Date:  2020-11-25       Impact factor: 5.428

8.  Coronavirus Goes Viral: Quantifying the COVID-19 Misinformation Epidemic on Twitter.

Authors:  Ramez Kouzy; Joseph Abi Jaoude; Afif Kraitem; Molly B El Alam; Basil Karam; Elio Adib; Jabra Zarka; Cindy Traboulsi; Elie W Akl; Khalil Baddour
Journal:  Cureus       Date:  2020-03-13