Mikhail Goncharov, Maxim Pisov, Alexey Shevtsov, Boris Shirokikh, Anvar Kurmukov, Ivan Blokhin, Valeria Chernina, Alexander Solovev, Victor Gombolevskiy, Sergey Morozov, Mikhail Belyaev
Abstract
The current COVID-19 pandemic overloads healthcare systems, including radiology departments. Though several deep learning approaches have been developed to assist in CT analysis, none has considered study triage directly as a computer science problem. We describe two basic setups: (1) Identification of COVID-19, to prioritize studies of potentially infected patients so that they can be isolated as early as possible; (2) Severity quantification, to highlight patients with severe COVID-19 and thus direct them to a hospital or provide emergency medical care. We formalize these tasks as binary classification and estimation of the affected lung percentage, respectively. Though similar problems have been well studied separately, we show that existing methods provide reasonable quality for only one of these setups. We employ a multitask approach to consolidate both triage approaches and propose a convolutional neural network that leverages all available labels within a single model. In contrast with related multitask approaches, we show the benefit of applying the classification layers to the most spatially detailed feature map at the upper part of U-Net instead of the less detailed latent representation at the bottom. We train our model on approximately 1500 publicly available CT studies and test it on a holdout dataset of 123 chest CT studies of patients drawn from the same healthcare system: 32 COVID-19 cases, 30 bacterial pneumonia cases, 30 cases with cancerous nodules, and 31 healthy controls. The proposed multitask model outperforms the other approaches and achieves ROC AUC scores of 0.87±0.01 vs. bacterial pneumonia, 0.93±0.01 vs. cancerous nodules, and 0.97±0.01 vs. healthy controls in Identification of COVID-19, and a Spearman correlation of 0.97±0.01 in Severity quantification. We have released our code and shared the annotated lesion masks for 32 CT images of patients with COVID-19 from the test dataset.
Keywords: COVID-19; Chest computed tomography; Convolutional neural network; Triage
Year: 2021 PMID: 33932751 PMCID: PMC8015379 DOI: 10.1016/j.media.2021.102054
Source DB: PubMed Journal: Med Image Anal ISSN: 1361-8415 Impact factor: 8.545
Fig. 1. A schematic representation of the automatic triage process. Left: the chronological order of the studies. Center: re-prioritized order to highlight findings requiring the radiologist’s attention (P denotes the COVID-19 Identification probability). Right: accompanying algorithm-generated X-ray-like series to assist the radiologist in fast decision making (the color bar from green to red denotes the Severity of local COVID-19-related changes).
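The re-prioritization step in the center panel can be illustrated with a minimal sketch. It assumes each study carries an Identification probability produced by some model; the `Study` class, its field names, and the tie-breaking rule (chronological order) are hypothetical and are not taken from the paper's released code.

```python
from dataclasses import dataclass


@dataclass
class Study:
    study_id: str             # hypothetical identifier
    arrival_order: int        # position in the chronological queue
    covid_probability: float  # assumed model output P in [0, 1]


def prioritize(studies):
    """Sort by descending probability; ties keep chronological order."""
    return sorted(studies, key=lambda s: (-s.covid_probability, s.arrival_order))


queue = [Study("A", 0, 0.12), Study("B", 1, 0.91), Study("C", 2, 0.47)]
print([s.study_id for s in prioritize(queue)])  # ['B', 'C', 'A']
```

Sorting on a composite key keeps the worklist deterministic when two studies receive the same probability.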
Fig. 2. An example of joint COVID-19 identification and severity estimation by the proposed method for several studies.
Overview of continuous output indices proposed in previous works. The Type column denotes the score type: COVID-19 identification (Iden.), COVID-19 severity (Sev.), or both. For identification, the groups that COVID-19 is distinguished from are given in brackets: P, pneumonia; NP, non-pneumonia; HC, healthy controls; N, nodules; C, cancer. The Metric column contains reported ROC AUC values unless otherwise indicated. Remarks. 1. Accuracy, because ROC AUC was not reported. 2. The metric was provided for the identification problem only. 3. Pearson correlation. 4. The average volume error, measured in cm³. 5. The paper does not provide a score; the Dice score for the output masks is reported.
| Paper | Ranking score description | Type | Metric |
|---|---|---|---|
| | Probabilities of 2.5D EfficientNet | Iden. (P) | 0.95 |
| | Probabilities of a NN for radiomics | Iden. (P) | Acc. |
| | Probabilities of RF for radiomics | Iden. (P) | 0.94 |
| | Probabilities of 2.5D ResNet-50 | Iden. (P, NP) | 0.96 |
| | Probabilities of a 3D ResNet-based NN | Iden. (HC, P) | 0.97 |
| | Probabilities of a 3D CNN | Iden. (HC, P) | 0.99 |
| | Probabilities of ResNet-50 | Iden. (HC, P, N) | 0.99 |
| | Custom aggregation of 2D CNN predictions | Iden. (HC, P) | 0.97 |
| | Fractions of affected slices (by 2D ResNet) | Iden. (HC, C) | 0.99 |
| | Probabilities of 3D U-Net (encoder part) | Iden. (HC, P) | 0.97 |
| | Probabilities of a 3D CNN | Iden. (HC) | 0.96 |
| | 2D bounding boxes + post-processing | Iden. (other disease) | Acc. |
| | A score based on 2D ResNet attention | Both (fever) | 0.95 |
| | Affected lung percentage, a combined score | Sev. | Corr. |
| | Affected lung percentage by 2D U-Net | Sev. | N/A |
| | Affected lung percentage by non-trainable CV | Sev. | Corr. |
| | Volume of segm. masks by a 3D CNN | Sev. | Vol. |
| | Segmentation mask | Sev. | Dice |
| | Random Forest probabilities | Sev. | 0.91 |
Fig. 3. Schematic representation of the Multitask-Spatial-1 model. The Identification score is the probability of being a COVID-19 positive series; the Severity score is calculated using the predicted lesions’ mask and precomputed lungs’ masks.
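As a rough illustration of how a Severity score can be derived from the two masks, the sketch below computes the affected lung percentage as the share of lung voxels that are also marked as lesion. Representing the masks as flat 0/1 sequences and counting only lesion voxels inside the lung mask are assumptions made here for simplicity; the function name is hypothetical.

```python
def affected_lung_percentage(lesion_mask, lung_mask):
    """Percentage of lung voxels that are also lesion voxels.

    Both arguments are flat sequences of 0/1 voxel labels of equal length
    (an assumed representation; real masks are usually 3D arrays).
    """
    if len(lesion_mask) != len(lung_mask):
        raise ValueError("masks must contain the same number of voxels")
    lung_voxels = sum(1 for v in lung_mask if v)
    if lung_voxels == 0:
        raise ValueError("lung mask is empty")
    # Count lesion voxels only where the lung mask is set (assumption).
    affected = sum(1 for les, lung in zip(lesion_mask, lung_mask) if les and lung)
    return 100.0 * affected / lung_voxels


lung = [0, 1, 1, 1, 1, 0, 1, 1, 1, 1]    # 8 lung voxels
lesion = [0, 1, 1, 0, 0, 1, 0, 0, 0, 0]  # 2 inside the lungs, 1 outside
print(affected_lung_percentage(lesion, lung))  # 25.0
```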
Training, validation and test data splits for all triage models. For each method, the table cells for the training datasets give the optimized training objectives. Each Mosmed-Test column lists the metrics calculated on the corresponding test subset. Remarks. 1. "pos. with mask" and "pos." denote COVID-19 positive cases with and without lesion masks, respectively; "neg." denotes COVID-19 negative cases. 2. DSC means Dice Score. 3. AUCs means ROC AUC for COVID-19 vs. All, vs. Normal, vs. Bac. Pneum. and vs. Nodules. 4. Seg. BCE and Class. BCE mean segmentation and classification Binary Cross-Entropy, respectively. 5. ρ means Spearman’s ρ. 6. Multitask models: Multitask-Latent, Multitask-Spatial-4, and Multitask-Spatial-1.
| | Mosmed-1110 (train/val) | Mosmed-1110 (train/val) | Mosmed-1110 (train/val) | Medseg-29 (train/val) | NSCLC-Radiomics (train/val) | Mosmed-Test: COVID-19 pos. | Mosmed-Test: Bac. Pneum. | Mosmed-Test: Nodules | Mosmed-Test: Normal |
|---|---|---|---|---|---|---|---|---|---|
| Ground truth | pos. with mask | pos. | neg. | pos. with mask | neg. | pos. with mask | neg. | neg. | neg. |
| Num. of images | 50 | 806 | 254 | 29 | 402 | 32 | 30 | 30 | 31 |
| Thresholding | DSC | - | - | DSC | - | AUCs | AUCs | AUCs | AUCs |
| 2D U-Net, 3D U-Net | Seg. BCE | - | - | Seg. BCE | - | AUCs, ρ | AUCs | AUCs | AUCs |
| 2D U-Net+ | Seg. BCE | - | - | Seg. BCE | Seg. BCE | AUCs, ρ | AUCs | AUCs | AUCs |
| ResNet-50 | - | Class. BCE | Class. BCE | - | Class. BCE | AUCs | AUCs | AUCs | AUCs |
| Multitask models | Seg. BCE | Class. BCE | Class. BCE | Seg. BCE | Class. BCE | AUCs, ρ | AUCs | AUCs | AUCs |
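The table indicates that the multitask models optimize segmentation BCE on cases with lesion masks and classification BCE on cases with only a study-level label. A minimal sketch of such a per-sample objective follows; the function names, the flat-list representation of voxel probabilities, and the `cls_weight` parameter are assumptions for illustration and do not reproduce the authors' implementation.

```python
import math


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over paired probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def multitask_loss(seg_probs, seg_mask, cls_prob, cls_label, cls_weight=1.0):
    """Sum whichever loss terms have labels available for this sample.

    seg_mask is None when no lesion mask exists; cls_label is None when
    the sample is used for segmentation only. cls_weight is hypothetical.
    """
    loss = 0.0
    if seg_mask is not None:
        loss += bce(seg_probs, seg_mask)
    if cls_label is not None:
        loss += cls_weight * bce([cls_prob], [cls_label])
    return loss


# Case with a lesion mask: only the segmentation term contributes.
print(multitask_loss([0.9, 0.1], [1, 0], 0.8, None))
# Weakly labeled case: only the classification term contributes.
print(multitask_loss([0.9, 0.1], None, 0.8, 1))
```

Skipping the missing term, rather than substituting a dummy target, lets both kinds of training cases pass through the same model without biasing either head.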
Quantitative comparison of all the methods discussed in Section 2. A trade-off between the quality of COVID-19 identification and the quality of ranking by severity is observed for segmentation-based methods. The proposed Multitask-Spatial-1 model yields the best identification results. Results are given as mean ± std.
| Method | ROC AUC, COVID-19 vs. All others | ROC AUC, COVID-19 vs. Normal | ROC AUC, COVID-19 vs. Bac. Pneum. | ROC AUC, COVID-19 vs. Nodules | Spearman’s ρ | Dice Score |
|---|---|---|---|---|---|---|
| Thresholding | | | | | | |
| 3D U-Net | | | | | | |
| 2D U-Net | | | | | | |
| 2D U-Net+ | | | | | | |
| ResNet-50 | | | | | N/A | N/A |
| Multitask-Latent | | | | | | |
| Multitask-Spatial-4 | | | | | | |
| Multitask-Spatial-1 | | | | | | |
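For readers unfamiliar with the two ranking metrics in the table, the following dependency-free sketch shows how they can be computed. ROC AUC is written as the fraction of positive/negative pairs ranked correctly, and Spearman's ρ as the Pearson correlation of average ranks. The function names and the toy numbers are illustrative assumptions, not values from the paper; a "vs. X" score would be obtained by passing only the COVID-19 cases and the cases of group X.

```python
import math


def roc_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


print(roc_auc([0.9, 0.4, 0.3, 0.6], [1, 1, 0, 0]))   # 0.75
print(spearman([10, 20, 30, 40], [12, 18, 35, 33]))  # 0.8
```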
Fig. 4. Examples of axial CT slices from the test dataset along with ground truth annotations (first row) and predicted masks (second row) of COVID-19-specific lesions. Column A: COVID-19 positive case; Column B: normal case; Column C: case with bacterial pneumonia. Lesions’ masks are represented by the contours of their borders for clarity.
Fig. 5. COVID-19 triage: identification of COVID-19 positive patients (left) and ranking them in descending order of severity (right) via the proposed single Multitask-Spatial-1 model. In the right plot, bars correspond to the ranked studies. Absolute values of the predicted affected lung fractions are represented by the bars’ lengths. The bars’ colors denote ground truth labels.
Fig. 6. Comparison of visual subjective estimation and automatic segmentation for weakly annotated cases from the Mosmed-1110 dataset. Each distribution corresponds to a set of cases with the same Severity group according to the radiologist’s subjective judgment. The left y-axis shows the Severity automatically estimated by our method; the right one denotes the expected Severity ranges: [0; 25) for CT-1, [25; 50) for CT-2, [50; 75) for CT-3, and [75; 100] for CT-4. The colored arrows denote the correspondence between some visually underestimated cases and their representative axial slices. Note the inconsistency of manual estimation.
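The Severity ranges listed in the caption map directly onto a small lookup. The sketch below uses exactly those half-open intervals; the function and constant names are hypothetical.

```python
from bisect import bisect_right

# Upper bounds of the half-open ranges from the caption:
# [0; 25) CT-1, [25; 50) CT-2, [50; 75) CT-3, [75; 100] CT-4.
_THRESHOLDS = [25.0, 50.0, 75.0]
_GROUPS = ["CT-1", "CT-2", "CT-3", "CT-4"]


def severity_group(percentage):
    """Map an affected lung percentage in [0, 100] to its Severity group."""
    if not 0.0 <= percentage <= 100.0:
        raise ValueError("percentage must lie in [0, 100]")
    return _GROUPS[bisect_right(_THRESHOLDS, percentage)]


print(severity_group(24.9))  # CT-1
print(severity_group(25.0))  # CT-2
print(severity_group(100))   # CT-4
```

Using `bisect_right` places a value equal to a boundary in the upper group, which matches the closed lower bounds of the ranges.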