Data and Model Biases in Social Media Analyses: A Case Study of COVID-19 Tweets.

Yunpeng Zhao¹, Pengfei Yin¹, Yongqiu Li¹, Xing He¹, Jingcheng Du², Cui Tao², Yi Guo¹, Mattia Prosperi¹, Pierangelo Veltri³, Xi Yang¹, Yonghui Wu¹, Jiang Bian¹.

Abstract

During the coronavirus disease 2019 (COVID-19) pandemic, social media platforms such as Twitter have become a venue for individuals, health professionals, and government agencies to share COVID-19 information. Twitter has been a popular source of data for researchers, especially for public health studies. However, the use of Twitter data for research also has drawbacks and barriers. Biases appear everywhere, from data collection methods to modeling approaches, and those biases have not been systematically assessed. In this study, we examined six different data collection methods and three different machine learning (ML) models, commonly used in social media analysis, to assess data collection bias and to measure the ML models' sensitivity to data collection bias. We showed that (1) publicly available Twitter data collection endpoints, used with appropriate strategies, can collect data that is reasonably representative of the Twitter universe; and (2) careful examination of ML models' sensitivity to data collection bias is critical. ©2021 AMIA - All rights reserved.

Year: 2022 PMID: 35308985 PMCID: PMC8861742

Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076

References

8 in total

1. Mining Twitter to assess the determinants of health behavior toward human papillomavirus vaccination in the United States.

Authors: Hansi Zhang; Christopher Wheldon; Adam G Dunn; Cui Tao; Jinhai Huo; Rui Zhang; Mattia Prosperi; Yi Guo; Jiang Bian
Journal: J Am Med Inform Assoc Date: 2020-02-01 Impact factor: 4.497

2. Garbage in, Garbage Out: Data Collection, Quality Assessment and Reporting Standards for Social Media Data Use in Health Research, Infodemiology and Digital Disease Detection.

Authors: Yoonsang Kim; Jidong Huang; Sherry Emery
Journal: J Med Internet Res Date: 2016-02-26 Impact factor: 5.428

3. Using Twitter to Measure Public Discussion of Diseases: A Case Study.

Authors: Christopher Weeg; H Andrew Schwartz; Shawndra Hill; Raina M Merchant; Catalina Arango; Lyle Ungar
Journal: JMIR Public Health Surveill Date: 2015-06-26

4. Using Social Media Data to Understand the Impact of Promotional Information on Laypeople's Discussions: A Case Study of Lynch Syndrome.

Authors: Jiang Bian; Yunpeng Zhao; Ramzi G Salloum; Yi Guo; Mo Wang; Mattia Prosperi; Hansi Zhang; Xinsong Du; Laura J Ramirez-Diaz; Zhe He; Yuan Sun
Journal: J Med Internet Res Date: 2017-12-13 Impact factor: 5.428

5. User's guide to correlation coefficients. (Review)

Authors: Haldun Akoglu
Journal: Turk J Emerg Med Date: 2018-08-07

6. Machine Learning to Detect Self-Reporting of Symptoms, Testing Access, and Recovery Associated With COVID-19 on Twitter: Retrospective Big Data Infoveillance Study.

Authors: Tim Mackey; Vidya Purushothaman; Jiawei Li; Neal Shah; Matthew Nali; Cortni Bardier; Bryan Liang; Mingxiang Cai; Raphael Cuomo
Journal: JMIR Public Health Surveill Date: 2020-06-08

7. Twitter Discussions and Emotions About the COVID-19 Pandemic: Machine Learning Approach.

Authors: Jia Xue; Junxiang Chen; Ran Hu; Chen Chen; Chengda Zheng; Yue Su; Tingshao Zhu
Journal: J Med Internet Res Date: 2020-11-25 Impact factor: 5.428

8. Coronavirus Goes Viral: Quantifying the COVID-19 Misinformation Epidemic on Twitter.

Authors: Ramez Kouzy; Joseph Abi Jaoude; Afif Kraitem; Molly B El Alam; Basil Karam; Elio Adib; Jabra Zarka; Cindy Traboulsi; Elie W Akl; Khalil Baddour
Journal: Cureus Date: 2020-03-13
