| Literature DB >> 33711739 |
Aoxiao Zhong1, Xiang Li2, Dufan Wu2, Hui Ren2, Kyungsang Kim2, Younggon Kim2, Varun Buch3, Nir Neumark3, Bernardo Bizzo3, Won Young Tak4, Soo Young Park4, Yu Rim Lee4, Min Kyu Kang5, Jung Gil Park5, Byung Seok Kim6, Woo Jin Chung7, Ning Guo2, Ittai Dayan8, Mannudeep K Kalra2, Quanzheng Li9.
Abstract
In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and have shown their value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. The chest radiograph (CXR) has played a crucial role in COVID-19 patient triage, diagnosis and monitoring, particularly in the United States. Considering the mixed and nonspecific signals in CXR, an image retrieval model for CXR that provides both similar images and the associated clinical information can be more clinically meaningful than a direct image diagnosis model. In this work we develop a novel CXR image retrieval model based on deep metric learning. Unlike traditional diagnostic models, which aim to learn a direct mapping from images to labels, the proposed model learns an optimized embedding space of images, in which images with the same labels and similar contents are pulled together. The proposed model uses a multi-similarity loss with a hard-mining sampling strategy and an attention mechanism to learn the optimized embedding space, and provides similar images, visualizations of disease-related attention maps and useful clinical information to assist clinical decisions. The model is trained and validated on an international multi-site COVID-19 dataset collected from 3 different sources. Experimental results on COVID-19 image retrieval and diagnosis tasks show that the proposed model can serve as a robust solution for CXR analysis and patient management for COVID-19. The model is also tested for its transferability on a different clinical decision support task for COVID-19, in which the pre-trained model is applied to extract image features from a new dataset without any further training.
The extracted features are then combined with COVID-19 patients' vitals, lab tests and medical histories to predict the possibility of airway intubation within 72 hours, which is strongly associated with patient prognosis and is crucial for patient care and hospital resource planning. These results demonstrate that our deep metric learning-based image retrieval model is highly efficient in CXR retrieval, diagnosis and prognosis, and thus has great clinical value for the treatment and management of COVID-19 patients.
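The multi-similarity loss with hard-mining sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hyperparameters (alpha, beta, the similarity threshold lam and the mining margin) follow common defaults from the multi-similarity loss literature and are assumptions, as is the plain-Python representation of the batch as a precomputed cosine-similarity matrix.

```python
import math

def multi_similarity_loss(sim, labels, alpha=2.0, beta=50.0, lam=0.5, margin=0.1):
    """Multi-similarity loss over a batch, with pair (hard) mining.

    sim    : n x n matrix (list of lists) of cosine similarities
             between the n embeddings in the batch.
    labels : length-n class labels of the batch samples.

    Mining keeps a positive pair only if it is harder (less similar)
    than the hardest negative plus a margin, and a negative pair only
    if it is harder (more similar) than the hardest positive minus
    the margin; easy, uninformative pairs contribute nothing.
    """
    n = len(labels)
    total, counted = 0.0, 0
    for i in range(n):
        pos = [sim[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        neg = [sim[i][j] for j in range(n) if labels[j] != labels[i]]
        if not pos or not neg:
            continue
        hard_pos = [s for s in pos if s < max(neg) + margin]
        hard_neg = [s for s in neg if s > min(pos) - margin]
        if not hard_pos or not hard_neg:
            continue
        # Soft aggregation over the mined positive and negative pairs.
        p = math.log(1 + sum(math.exp(-alpha * (s - lam)) for s in hard_pos)) / alpha
        q = math.log(1 + sum(math.exp(beta * (s - lam)) for s in hard_neg)) / beta
        total += p + q
        counted += 1
    return total / max(counted, 1)
```

On a batch whose classes are already well separated, mining discards every pair and the loss is zero; the loss only grows when same-label images are far apart or different-label images are close in the embedding space, which is exactly the pull-together/push-apart behavior the abstract describes.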
Keywords: COVID-19; Chest radiograph; Image content query; Image retrieval
Year: 2021 PMID: 33711739 PMCID: PMC8032481 DOI: 10.1016/j.media.2021.101993
Source DB: PubMed Journal: Med Image Anal ISSN: 1361-8415 Impact factor: 8.545
Fig. 1 Computational pipeline of the CXR image retrieval model in a COVID-19 diagnosis context.
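The retrieval step of the pipeline in Fig. 1 amounts to a nearest-neighbor search over the learned embedding space. The sketch below is an illustrative, non-optimized version assuming cosine similarity as the metric and a gallery of (embedding, metadata) pairs; the function names and data layout are hypothetical, not the paper's code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_emb, gallery, k=4):
    """Return the k gallery entries most similar to the query embedding.

    gallery: list of (embedding, metadata) pairs, where metadata could
    carry the diagnosis label and associated clinical information of
    the stored CXR, as in the proposed system.
    """
    ranked = sorted(gallery, key=lambda e: cosine(query_emb, e[0]), reverse=True)
    return ranked[:k]
```

In practice a large gallery would use an approximate nearest-neighbor index rather than a full sort, but the interface (query embedding in, similar images plus clinical records out) is the same.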
Number of hospitals and number of images at each data site, with a breakdown of patient types, average age and gender ratio.
| Data Site | Number of Hospitals | Total Images | Control | Non-COVID Pneumonia | COVID-19 | Age | Gender |
|---|---|---|---|---|---|---|---|
| COVIDx | N/A | 13,970 | 8,066 | 5,551 | 353 | N/A | N/A |
| Partners | 5 | 823 | 107 | 212 | 504 | 58.03 | 56.6% Male |
| Korean | 4 | 3,262 | N/A | N/A | 3,262 | 57.31 | 35.8% Male |
Sample sizes and training/validation splits for the three data sites (COVIDx, Partners and Korean) used in this work.
| Class | Train: Total | Train: COVIDx | Train: Partners | Train: Korean | Validation: Total | Validation: COVIDx | Validation: Partners | Validation: Korean |
|---|---|---|---|---|---|---|---|---|
| Control | 8,064 | 7,966 | 98 | N/A | 109 | 100 | 9 | N/A |
| Non-COVID Pneumonia | 5,641 | 5,451 | 190 | N/A | 122 | 100 | 22 | N/A |
| COVID-19 | 3,746 | 253 | 453 | 3,040 | 373 | 100 | 51 | 222 |
Fig. 2 (a) Sample visualizations of the CXR images returned by the proposed model for a query CXR image from a mild COVID-19 patient. Possible lesion regions are marked by red bounding boxes, with zoomed-in views of the detailed textures in the lesion regions. (b) Query CXR image from a severe COVID-19 patient. (c) Query CXR image from a non-COVID pneumonia patient; note that only the COVIDx dataset contains this type of image. (d) Query CXR image from a control; note that about 99% of the controls are from the COVIDx dataset.
Fig. 3 Visualization of CXR images and the corresponding attention maps from COVID-19 patients with different RALE scores, which indicate disease severity.
Model performance comparison between the proposed and baseline models, evaluated by the averaged recall rate across all validation samples for different values of the parameter k.
| Proposed System |  |  |  | Baseline (Resnet-50) |  |  |  |  |
|---|---|---|---|---|---|---|---|---|
| 66.1% | 81.7% | 84.4% | 93.6% | 74.3% | 89.0% | 95.4% | 97.2% | |
| 87.7% | 91.8% | 91.8% | 94.3% | 82.8% | 87.7% | 90.2% | 93.4% | |
| 83.6% | 87.9% | 90.1% | 92.5% | 80.4% | 86.3% | 89.8% | 92.5% | |
Fig. 4 Top panel: pipeline of the image retrieval process implemented by the baseline direct classification network (raw Resnet-50). Bottom panel: comparison of the top four retrieved images between the baseline and proposed models, using the same sample query image (COVID-19 from Partners). Images retrieved by the proposed model (the same as in Fig. 2) are also listed for reference.
Model performance evaluated by the averaged accuracy, sensitivity and PPV for each type in the validation dataset. Left panel: performance of the proposed model. Right panel: performance of the baseline Resnet-50 model. The better performance between the two models is highlighted in bold.
| Proposed System |  |  |  | Baseline (Resnet-50) |  |  |  |  |
|---|---|---|---|---|---|---|---|---|
| Overall | COVIDx | Partners | Korean | Overall | COVIDx | Partners | Korean |  |
| 81.5% | 75.3% | 61.0% | 97.3% | |||||
| 74.3% | 75.0% | 66.7% | N/A | 66.7% | N/A | |||
| 31.6% | N/A | 74.8% | 86.5% | 31.6% | N/A | |||
| 93.0% | N/A | 82.8% | 27.3% | N/A | ||||
| 62.8% | N/A | 61.6% | 54.5% | N/A | ||||
| 72.5% | 98.2% | 82.6% | 54.0% | 97.3% | ||||
| 100.0% | 93.6% | 88.5% | 73.1% | 100.0% | ||||
Fig. 5 (a) Model performance (measured as classification accuracy) for different values of k in the KNN classifier. (b) Model performance for different sizes of the image embedding, which serves as input to the metric learning module.
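The KNN classifier swept in Fig. 5(a) turns retrieval into diagnosis: the query inherits the majority label of its k nearest neighbors in the embedding space. A minimal sketch, assuming the neighbor labels arrive already sorted by similarity (function name and interface are illustrative):

```python
from collections import Counter

def knn_classify(neighbor_labels, k):
    """Diagnose a query CXR by majority vote over the labels of its
    k nearest neighbors in the learned embedding space.

    neighbor_labels: labels of retrieved images, ordered from most
    to least similar to the query.
    """
    votes = Counter(neighbor_labels[:k])
    return votes.most_common(1)[0][0]
```

The sensitivity to k shown in the figure follows directly from this vote: a small k trusts only the closest matches, while a large k smooths over them.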
Fig. 6 ROC curves of the 72-hour patient intervention prediction model using combined features (black), CXR-derived features (blue) and EHR-derived features (red) as input. The mean ROC across the 5 cross-validation folds is shown as the solid curve, with ±1 standard deviation shown as the shaded area around the mean curve.
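The curves in Fig. 6 are summarized by the area under the ROC curve. For a self-contained sketch of how that area can be computed from prediction scores, the rank (Mann-Whitney) formulation below is equivalent to integrating the ROC curve; this is a generic illustration, not the paper's evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney formulation:
    the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count 0.5).

    scores: predicted risk of intervention for each patient.
    labels: 1 if the patient was intubated within 72 hours, else 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Repeating this over the 5 cross-validation folds and averaging gives the kind of mean-curve summary the figure reports.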