
Deep learning prediction of sex on chest radiographs: a potential contributor to biased algorithms.

David Li, Cheng Ting Lin, Jeremias Sulam, Paul H Yi.

Abstract

BACKGROUND: Deep convolutional neural networks (DCNNs) for diagnosis of disease on chest radiographs (CXR) have been shown to be biased against males or females if the datasets used to train them have unbalanced sex representation. Prior work has suggested that DCNNs can predict sex on CXR, which could aid forensic evaluations, but also be a source of bias.
OBJECTIVE: To (1) evaluate the performance of DCNNs for predicting sex across different datasets and architectures and (2) evaluate visual biomarkers used by DCNNs to predict sex on CXRs.
MATERIALS AND METHODS: Chest radiographs were obtained from the Stanford CheXPert and NIH Chest XRay14 datasets, which comprised 224,316 and 112,120 CXRs, respectively. To control for dataset size and class imbalance, random undersampling was used to reduce each dataset to 97,560 images that were balanced for sex. Each dataset was randomly split into training (70%), validation (10%), and test (20%) sets. Four DCNN architectures pre-trained on ImageNet were used for transfer learning. Each DCNN was externally validated on the test set from the opposing dataset. Performance was evaluated using area under the receiver operating characteristic curve (AUC). Class activation mapping (CAM) was used to generate heatmaps visualizing the regions contributing to the DCNN's prediction.
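The undersampling and splitting steps described in the methods can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the seed, and the image-level (rather than patient-level) split are assumptions, since the abstract does not state how the split was implemented.

```python
import random


def balanced_undersample(labels, n_total, seed=0):
    """Return shuffled indices of a random subset with equal counts per class.

    `labels` is a list of class labels (e.g. "M"/"F"); `n_total` is the
    desired subset size. Both names are hypothetical, chosen for this sketch.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    per_class = n_total // len(classes)
    chosen = []
    for c in classes:
        idx = [i for i, y in enumerate(labels) if y == c]
        if len(idx) < per_class:
            raise ValueError(f"class {c!r} has only {len(idx)} samples")
        # Sample without replacement so no image is selected twice.
        chosen.extend(rng.sample(idx, per_class))
    rng.shuffle(chosen)
    return chosen


def split_indices(indices, fractions=(0.7, 0.1, 0.2)):
    """Split already-shuffled indices into train/validation/test parts."""
    n = len(indices)
    n_train = int(round(n * fractions[0]))
    n_val = int(round(n * fractions[1]))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


if __name__ == "__main__":
    # Toy labels standing in for the sex field of a CXR metadata table.
    labels = ["M"] * 600 + ["F"] * 400
    subset = balanced_undersample(labels, n_total=800, seed=1)
    train, val, test = split_indices(subset)
    print(len(train), len(val), len(test))
```

Fixing the seed makes the subset reproducible. If several images belong to the same patient, splitting by patient instead of by image would be needed to avoid leakage between the sets; the sketch above does not handle that case.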
RESULTS: On the internal test sets, DCNNs achieved AUCs ranging from 0.98 to 0.99. On external validation, the models reached peak cross-dataset AUCs of 0.94 for the VGG19-Stanford model and 0.95 for the InceptionV3-NIH model. Heatmaps highlighted similar regions of attention across model architectures and datasets, localizing to the mediastinal and upper rib regions, as well as to the lower chest/diaphragmatic regions.
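For readers unfamiliar with the metric reported above, the AUC of a binary classifier equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it from ranks; the function name and the 0/1 label encoding are assumptions for illustration, and the toy scores are not data from the study.

```python
def roc_auc(y_true, scores):
    """Rank-based AUC (Mann-Whitney U statistic divided by n_pos * n_neg).

    `y_true` holds 0/1 labels and `scores` the predicted probabilities
    for class 1. Tied scores receive the average of their ranks.
    """
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy example: two negatives and two positives, one pair mis-ordered.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.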
CONCLUSION: DCNNs trained on two large CXR datasets accurately predicted sex on internal and external test data, with similar heatmap localizations across DCNN architectures and datasets. These findings support the notion that DCNNs can leverage imaging biomarkers to predict sex, which could confound the accurate prediction of disease on CXRs and contribute to biased models. On the other hand, these DCNNs could benefit emergency radiologists in forensic evaluations and in identifying the sex of patients whose identities are unknown, such as in acute trauma.
© 2022. American Society of Emergency Radiology.

Keywords:  Anatomy; Bias; Chest; Deep learning; Fairness; Forensics; Radiograph; Sex prediction

Year:  2022        PMID: 35006495     DOI: 10.1007/s10140-022-02019-3

Source DB:  PubMed          Journal:  Emerg Radiol        ISSN: 1070-3004


