| Literature DB >> 33372243 |
Qiuchen Xie1, Yiping Lu1, Xiancheng Xie2, Nan Mei1, Yun Xiong3, Xuanxuan Li1, Yangyong Zhu3, Anling Xiao4, Bo Yin5.
Abstract
OBJECTIVES: Based on the current clinical routine, we aimed to develop a novel deep learning model to distinguish coronavirus disease 2019 (COVID-19) pneumonia from other types of pneumonia and validate it with a real-world dataset (RWD).
Keywords: COVID-19; Deep learning; Differential diagnosis
Year: 2020 PMID: 33372243 PMCID: PMC7769567 DOI: 10.1007/s00330-020-07553-7
Source DB: PubMed Journal: Eur Radiol ISSN: 0938-7994 Impact factor: 5.315
Fig. 1 The workflow of the whole study
Fig. 2 The network architectures of our proposed deep learning (DL) model, including U-net and COVIDNet. a U-net is composed of a two-stage segmentation module for acceleration. In the first stage, we down-sampled the input image to a 128 × 128 level and segmented the lung field from the image, as the patterns of lung fields are easily learned at a relatively low resolution. In the second stage, we first calculated the bounding box from the lung field segmentation results. The key region was cropped from the original input image and resized to a 256 × 256 level as the input for the second-stage segmentation model. b The 3D classification network (COVIDNet) used in our COVID-19 diagnosis system. It is a convolutional neural network using ResNet50 as the backbone. A series of CT images was fed into COVIDNet to generate feature maps, followed by a feature fusion layer consisting of 2 convolution layers. The final extracted features were fed into a dense layer with SoftMax activation to generate the prediction for COVID-19 pneumonia
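The stage-1 → stage-2 hand-off described in the Fig. 2 caption (bounding box from a low-resolution lung-field mask, crop of the full-resolution slice, resize to 256 × 256) can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' code; the function names and the nearest-neighbour resize are assumptions.

```python
import numpy as np

def bbox_from_mask(mask):
    """Bounding box (r0, r1, c0, c1) of the nonzero region of a binary mask."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return r0, r1 + 1, c0, c1 + 1  # half-open intervals

def nn_resize(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array (assumption: the paper does
    not state the interpolation method)."""
    h, w = img.shape
    ri = np.arange(out_h) * h // out_h
    ci = np.arange(out_w) * w // out_w
    return img[np.ix_(ri, ci)]

def crop_key_region(image, lowres_mask, out_size=256):
    """Map the 128 x 128 lung-field bounding box back onto the
    full-resolution slice, crop the key region, and resize it to
    out_size x out_size as input for the stage-2 segmentation model."""
    H, W = image.shape
    h, w = lowres_mask.shape
    r0, r1, c0, c1 = bbox_from_mask(lowres_mask)
    # Scale bounding-box coordinates from mask resolution to image resolution.
    R0, R1 = r0 * H // h, r1 * H // h
    C0, C1 = c0 * W // w, c1 * W // w
    return nn_resize(image[R0:R1, C0:C1], out_size, out_size)
```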
Clinical characteristics of patients in the study
| Characteristics | All patients (n = 696) | COVID-19 patients (n = 470) | Non-COVID patients (n = 226) | p value* |
|---|---|---|---|---|
| Age | 46.90 ± 15.65 | 44.03 ± 14.62 | 52.88 ± 17.73 | |
| Gender, male/female | 383/313 | 275/195 | 108/118 | |
| Number of CT scans | 881 | 634 | 247 | / |
| Epidemiological history | ||||
| Yes/No | 361/335 | 307/163 | 54/172 | |
| Symptom | ||||
| Yes/No | 650/42 | 433/37 | 212/10 | 0.138 |
| Underlying comorbidity | ||||
| Yes/No | 193/503 | 112/358 | 81/145 | |
| Laboratory test | ||||
| White blood cell count, mean ± sd (× 109/L) | 7.01 ± 3.52 | 5.52 ± 2.31 | 10.10 ± 4.90 | |
| Lymphocyte count, mean ± sd (× 109/L) | 1.25 ± 0.60 | 1.13 ± 0.48 | 1.49 ± 0.89 | |
| Lactate dehydrogenase, mean ± sd (U/L) | 411.22 ± 214.86 | 262.35 ± 96.62 | 673.52 ± 454.87 | |
| C-reactive protein, mean ± sd (mg/L) | 34.75 ± 36.91 | 28.04 ± 34.93 | 48.69 ± 44.28 | |
| Procalcitonin, mean ± sd (ng/mL) | 1.24 ± 2.99 | 0.09 ± 0.37 | 3.62 ± 8.16 | |
| Final diagnosis | ||||
| COVID-19 | 470 | 470 | / | / |
| Bacterial infection | 106 | / | 106 | |
| Viral infection | 53 | / | 53 | |
| Others | 67 | / | 67 | |
The italics indicate significant p values
*Wilcoxon rank-sum test and Fisher exact test were used if non-normal distribution or heterogenous variance of the data was detected
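The Fisher exact test referenced in the footnote compares categorical counts (e.g. Yes/No rows) between the two groups. As a pure-Python sketch of the two-sided version for a 2 × 2 table (illustrative only, not the authors' statistical code; in practice a library routine such as `scipy.stats.fisher_exact` would be used):

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for a 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def p_table(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_table(a)
    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))
```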
Clinical characteristics in the model-training group and real-world data (RWD) group
| Characteristics | All patients (n = 696) | Model-training group (n = 380) | RWD group (n = 316) | p value |
|---|---|---|---|---|
| Age | 46.90 ± 15.65 | 44.03 ± 12.96 | 50.35 ± 16.29 | |
| Gender, male/female | 383/313 | 205/175 | 177/139 | 0.219 |
| Number of CT scans | 881 | 563 | 318 | / |
| Final diagnosis | ||||
| COVID-19 | 470 | 227 | 243 | |
| Bacterial infection | 106 | 63 | 43 | |
| Viral infection | 53 | 36 | 17 | |
| Others | 67 | 54 | 13 | |
The italics indicate significant p values
Wilcoxon rank-sum test and Fisher exact test were used if non-normal distribution or heterogenous variance of the data was detected
Model performance in the internal validation group and RWD group
| Group | Number of cases | Number of COVID-19 | Accuracy (%) | AUC | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|---|
| Internal validation group | 61 | 40 | 82 | 0.905 | 84 | 80 |
| RWD group | 316 | 243 | 81 | 0.868 | 81 | 82 |
AUC area under the curve
Fig. 3 The performance of our DL model in the internal validation group and the real-world dataset (RWD) group. ROC curves and confusion matrices are shown in the upper and lower parts of the figure, respectively
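The AUC values reported for the ROC curves (0.905 internal, 0.868 RWD) can be understood as a rank statistic: the probability that a randomly chosen COVID-19 case receives a higher model score than a randomly chosen non-COVID case. A minimal sketch of that equivalence (illustrative, not the authors' evaluation code):

```python
def auc_from_scores(labels, scores):
    """AUC as the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs in which the positive case gets the
    higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```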
Performance results of the three radiologists and the AI expert system in the RWD group
| Radiologist/model | TP | TN | FP | FN | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 180 | 62 | 11 | 63 | 76 [70, 81] (242/316) | 74 [66, 82] (180/243) | 85 [73, 93] (62/73) | 94 [76, 99] (180/191) | 49 [40, 59] (62/126) |
| 2 | 231 | 16 | 57 | 12 | 78 [67, 89] (247/316) | 95 [90, 99] (231/243) | 22 [16, 28] (16/73) | 80 [67, 92] (231/288) | 57 [54, 60] (16/28) |
| 3 | 170 | 68 | 5 | 73 | 75 [59, 89] (238/316) | 70 [55, 81] (170/243) | 93 [89, 96] (68/73) | 97 [83, 99] (170/175) | 48 [36, 60] (68/141) |
| IDANNet | 197 | 60 | 13 | 46 | 81 [77, 84] (257/316) | 81 [71, 91] (197/243) | 82 [78, 85] (60/73) | 94 [88, 97] (197/210) | 57 [50, 64] (60/106) |
Numbers in brackets are 95% confidence intervals, and numbers in parentheses are numbers of cases
TP true positive, FP false positive, TN true negative, FN false negative, PPV positive predictive value, NPV negative predictive value
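The derived metrics in the table follow directly from the four confusion-matrix counts. A small sketch (function name illustrative), checked against the IDANNet row (TP = 197, TN = 60, FP = 13, FN = 46):

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, PPV and NPV from the four
    confusion-matrix counts, returned as percentages."""
    return {
        "accuracy":    100 * (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": 100 * tp / (tp + fn),  # recall on COVID-19 cases
        "specificity": 100 * tn / (tn + fp),  # recall on non-COVID cases
        "ppv":         100 * tp / (tp + fp),
        "npv":         100 * tn / (tn + fn),
    }
```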
Fig. 4 The comparison of the diagnostic performance on the RWD between three senior experienced radiologists and the AI system. The AI model operated at 81.1% sensitivity and 82.2% specificity (shown as the star) using a decision threshold set on the model-development dataset. The performances of the 3 experienced radiologists are shown as dots
Fig. 5 Three attention heatmaps from the last "pooling" layer of our DL model. The attention regions overlap with the ROIs delineated by human radiologists. All these cases were diagnosed as possible COVID-19 pneumonia by radiologists but were correctly classified by the DL model. It is therefore desirable to investigate which imaging features the DL model relies on and how the AI acquires its classification ability, so as to improve the CT-based identification capability of clinicians and radiologists. A typical CT image of a COVID-19 pneumonia patient is illustrated in 1a–1c, with subpleural GGO and a "crazy paving" sign inside the lesion. A non-typical COVID-19 image is shown in 2a–2c, with total consolidation of the right inferior lobe, and a non-COVID viral pneumonia case with typical COVID-19 CT manifestations is presented in 3a–3c