Gender-sensitive word embeddings for healthcare.

Shunit Agmon, Plia Gillis, Eric Horvitz, Kira Radinsky.

Abstract

OBJECTIVE: To analyze gender bias in clinical trials, to design an algorithm that mitigates the effects of gender-representation biases on natural language processing (NLP) systems trained on text drawn from clinical trials, and to evaluate its performance.
MATERIALS AND METHODS: We analyze gender bias in clinical trials described by 16 772 PubMed abstracts (2008-2018). We present a method to augment word embeddings, the core building block of NLP-centric representations, by weighting abstracts by the number of women participants in the trial. We evaluate the performance of the resulting gender-sensitive embeddings on several clinical prediction tasks: comorbidity classification, hospital length of stay prediction, and intensive care unit (ICU) readmission prediction.
RESULTS: For female patients, the gender-sensitive model achieves an area under the receiver operating characteristic curve (AUROC) of 0.86 versus the baseline of 0.81 for comorbidity classification, a mean absolute error of 4.59 versus the baseline of 4.66 for length of stay prediction, and an AUROC of 0.69 versus 0.67 for ICU readmission. All results are statistically significant.
DISCUSSION: Women have been underrepresented in clinical trials. Thus, using the broad clinical trials literature as training data for statistical language models could result in biased models, with deficits in knowledge about women. The method presented enables gender-sensitive use of publications as training data for word embeddings. In experiments, the gender-sensitive embeddings show better performance than baseline embeddings for the clinical tasks studied. The results highlight opportunities for recognizing and addressing gender and other representational biases in the clinical trials literature.
CONCLUSION: Addressing representational biases in data for training NLP embeddings can lead to better results on downstream tasks for underrepresented populations.
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
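
The abstract describes weighting abstracts by the number of women participants when training embeddings, but it does not give the exact weighting scheme. The following is a minimal sketch of one way such weighting could enter an embedding pipeline, not the authors' implementation. It assumes a hypothetical weight equal to the fraction of women among trial participants and applies it to co-occurrence counts, which count-based embedding methods take as input. The function names, the window size, and the toy corpus are all illustrative assumptions.

```python
from collections import defaultdict


def abstract_weight(n_women, n_total):
    """Hypothetical weight: fraction of women among trial participants.

    This is an assumption for illustration; the abstract does not state
    the exact weighting function used by the authors.
    """
    if n_total <= 0:
        return 0.0
    return n_women / n_total


def weighted_cooccurrence(abstracts, window=2):
    """Accumulate co-occurrence counts, scaling each abstract by its weight.

    `abstracts` is an iterable of (tokens, n_women, n_total) tuples.
    Returns a dict mapping (center_word, context_word) to a weighted count.
    """
    counts = defaultdict(float)
    for tokens, n_women, n_total in abstracts:
        w = abstract_weight(n_women, n_total)
        for i, center in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(center, tokens[j])] += w
    return dict(counts)


# Toy corpus (invented for illustration): one trial with 80% women,
# one trial with 10% women.
toy_corpus = [
    (["aspirin", "reduces", "stroke", "risk"], 80, 100),
    (["aspirin", "reduces", "infarction", "risk"], 10, 100),
]
cooc = weighted_cooccurrence(toy_corpus)
```

In this sketch, word pairs that appear mostly in trials with many women participants receive larger counts, so an embedding fitted to `cooc` would lean toward the contexts observed in those trials.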

Keywords:  algorithms; bias; gender; statistical models; word embeddings

Year:  2022        PMID: 34918101      PMCID: PMC8800511          DOI: 10.1093/jamia/ocab279

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


