Omar Del Tejo Catala, Ismael Salvador Igual, Francisco Javier Perez-Benito, David Millan Escriva, Vicent Ortiz Castello, Rafael Llobet, Juan-Carlos Perez-Cortes.
Abstract
Chest X-ray images are useful for early COVID-19 diagnosis, with the advantage that X-ray devices are already available in health centers and images are obtained immediately. Several datasets containing X-ray images of cases (pneumonia or COVID-19) and controls have been made available to develop machine-learning-based methods to aid in diagnosing the disease. However, these datasets are mostly assembled from heterogeneous sources, combining pre-COVID-19 datasets with COVID-19 datasets. In particular, we have detected a significant bias in some of the released datasets used to train and test diagnostic systems, which suggests that the published results may be optimistic and may overestimate the actual predictive capacity of the proposed techniques. In this article, we analyze the existing bias in some commonly used datasets and propose a series of preliminary steps to carry out before the classic machine learning pipeline in order to detect possible biases, avoid them where possible, and report results that are more representative of the actual predictive power of the methods under analysis.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
Keywords: COVID-19; Deep learning; bias; chest X-ray; convolutional neural networks; saliency map; segmentation
Year: 2021; PMID: 34812384; PMCID: PMC8545228; DOI: 10.1109/ACCESS.2021.3065456
Source DB: PubMed; Journal: IEEE Access; ISSN: 2169-3536; Impact factor: 3.476