Shunit Agmon¹, Plia Gillis², Eric Horvitz³, Kira Radinsky¹. 1. Computer Science Faculty, Technion - Israel Institute of Technology, Haifa, Israel. 2. Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel. 3. Microsoft Research, Redmond, WA, USA.
Abstract
OBJECTIVE: To analyze gender bias in clinical trials, to design an algorithm that mitigates the effects of biases of gender representation on natural language processing (NLP) systems trained on text drawn from clinical trials, and to evaluate its performance. MATERIALS AND METHODS: We analyze gender bias in clinical trials described by 16 772 PubMed abstracts (2008-2018). We present a method to augment word embeddings, the core building block of NLP-centric representations, by weighting abstracts by the number of women participants in the trial. We evaluate the resulting gender-sensitive embeddings' performance on several clinical prediction tasks: comorbidity classification, hospital length of stay prediction, and intensive care unit (ICU) readmission prediction. RESULTS: For female patients, the gender-sensitive model achieves an area under the receiver-operating characteristic curve (AUROC) of 0.86 versus the baseline of 0.81 for comorbidity classification, a mean absolute error of 4.59 versus the baseline of 4.66 for length of stay prediction, and an AUROC of 0.69 versus 0.67 for ICU readmission. All results are statistically significant. DISCUSSION: Women have been underrepresented in clinical trials. Thus, using the broad clinical trials literature as training data for statistical language models could result in biased models, with deficits in knowledge about women. The method presented enables gender-sensitive use of publications as training data for word embeddings. In experiments, the gender-sensitive embeddings show better performance than baseline embeddings for the clinical tasks studied. The results highlight opportunities for recognizing and addressing gender and other representational biases in the clinical trials literature. CONCLUSION: Addressing representational biases in data for training NLP embeddings can lead to better results on downstream tasks for underrepresented populations.
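The abstract describes weighting abstracts by the number of women participants when training word embeddings, but does not detail the weighting mechanism. As a minimal sketch of one plausible realization, the corpus could be resampled so that abstracts contribute to embedding training in proportion to their female enrollment; the function name `build_weighted_corpus` and the `scale` parameter are illustrative assumptions, not the authors' implementation.

```python
import math

def build_weighted_corpus(abstracts, n_women, scale=100):
    """Oversample trial abstracts in proportion to women participants.

    Each abstract is repeated once per `scale` women enrolled in the
    trial (at least once), so trials with larger female enrollment
    contribute more training examples to a standard word-embedding
    trainer (e.g., skip-gram) run on the returned corpus.
    """
    corpus = []
    for text, n in zip(abstracts, n_women):
        reps = max(1, math.ceil(n / scale))
        corpus.extend([text] * reps)
    return corpus

# Example: a trial with 250 women contributes 3 copies, one with 50 women
# contributes 1 copy, under scale=100.
weighted = build_weighted_corpus(
    ["trial A abstract text", "trial B abstract text"],
    n_women=[50, 250],
    scale=100,
)
```

The resulting `weighted` list would then be tokenized and passed to an off-the-shelf embedding trainer; the oversampling biases co-occurrence statistics toward trials with stronger female representation.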