
The risk of racial bias while tracking influenza-related content on social media using machine learning.

Brandon Lwowski, Anthony Rios.

Abstract

OBJECTIVE: Machine learning is used to understand and track influenza-related content on social media. Because these systems are used at scale, they have the potential to adversely impact the people they are built to help. In this study, we explore the biases of different machine learning methods for the specific task of detecting influenza-related content. We compare the performance of each model on tweets written in Standard American English (SAE) vs African American English (AAE).
MATERIALS AND METHODS: Two influenza-related datasets are used to train 3 text classification models (support vector machine, convolutional neural network, bidirectional long short-term memory) with different feature sets. The datasets match real-world scenarios in which there is a large imbalance between SAE and AAE examples. The number of AAE examples for each class ranges from 2% to 5% in both datasets. We also evaluate each model's performance using a balanced dataset via undersampling.
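The undersampling step can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so the following is only one plausible reading: within each class label, every dialect group is randomly downsampled to the size of the smallest group in that class. The field names `dialect` and `label` and the function name `undersample_per_class` are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def undersample_per_class(examples, seed=0):
    """Within each class label, downsample every dialect group to the
    size of the smallest dialect group for that label.

    Assumption: each example is a dict with hypothetical keys
    "dialect" (e.g. "SAE" or "AAE") and "label" (e.g. 0 or 1).
    """
    rng = random.Random(seed)

    # Bucket examples by label, then by dialect.
    cells = defaultdict(lambda: defaultdict(list))
    for ex in examples:
        cells[ex["label"]][ex["dialect"]].append(ex)

    balanced = []
    for label in sorted(cells):
        by_dialect = cells[label]
        # Size of the smallest dialect group for this label.
        n = min(len(items) for items in by_dialect.values())
        for dialect in sorted(by_dialect):
            balanced.extend(rng.sample(by_dialect[dialect], n))

    rng.shuffle(balanced)
    return balanced
```

With a toy set in which AAE makes up a few percent of each class, the result contains equal numbers of SAE and AAE examples per class, at the cost of discarding most SAE examples.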
RESULTS: We find that all of the tested machine learning methods are biased on both datasets. The difference in false positive rates between SAE and AAE examples ranges from 0.01 to 0.35. The difference in the false negative rates ranges from 0.01 to 0.23. We also find that the neural network methods generally have more unfair results than the linear support vector machine on the chosen datasets.
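The group-wise error-rate differences reported above can be computed along the following lines. This is a sketch under stated assumptions, not the authors' evaluation code: it assumes binary labels (1 = influenza-related, 0 = not), one dialect tag per example, and that the "difference" is the absolute gap between the two groups' rates. The function names `error_rates` and `group_gaps` are hypothetical.

```python
def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for binary labels."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    neg = sum(1 for t in y_true if t == 0)
    pos = sum(1 for t in y_true if t == 1)
    # A rate is undefined when the group has no negatives or no positives.
    fpr = fp / neg if neg else float("nan")
    fnr = fn / pos if pos else float("nan")
    return fpr, fnr


def group_gaps(y_true, y_pred, groups, a="SAE", b="AAE"):
    """Absolute FPR and FNR gaps between two groups.

    Assumption: `groups` holds one dialect tag per example, and the
    gap is taken as an absolute difference.
    """
    rates = {}
    for g in (a, b):
        idx = [i for i, tag in enumerate(groups) if tag == g]
        rates[g] = error_rates(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    fpr_gap = abs(rates[a][0] - rates[b][0])
    fnr_gap = abs(rates[a][1] - rates[b][1])
    return fpr_gap, fnr_gap
```

A gap of 0 on both measures would mean the classifier makes each type of error at the same rate for both groups; larger gaps indicate that one group bears more of the false alarms or missed detections.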
CONCLUSIONS: The models that result in the most unfair predictions may vary from dataset to dataset. Practitioners should be aware of the potential harms related to applying machine learning to health-related social media data. At a minimum, we recommend evaluating fairness along with traditional evaluation metrics.
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Keywords:  classification; deep learning; fairness; machine learning; social network

Year:  2021        PMID: 33484133      PMCID: PMC7973478          DOI: 10.1093/jamia/ocaa326

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


