Yifan Li, Xuan Pei, Yandong Guo.
Abstract
Purpose: The coronavirus disease (COVID-19) has been spreading rapidly around the world. As of August 25, 2020, 23.719 million people had been infected across many countries, and the cumulative death toll exceeded 812,000. Early detection of COVID-19 is essential for providing patients with appropriate medical care and protecting uninfected people. Approach: Leveraging a large computed tomography (CT) database of 1112 patients provided by the China Consortium of Chest CT Image Investigation (CC-CCII), we investigated multiple solutions for detecting COVID-19 and distinguishing it from other common pneumonia (CP) and normal controls. We also compared the performance of different models on complete and segmented CT slices. In particular, we studied how the depth at which CT slices are superimposed into volumes affects the performance of our models.
Keywords: COVID-19; classification network; computer tomography; deep learning; radiography
Year: 2021 PMID: 34322573 PMCID: PMC8304701 DOI: 10.1117/1.JMI.8.S1.017502
Source DB: PubMed Journal: J Med Imaging (Bellingham) ISSN: 2329-4302
Fig. 1 Typical transverse-section CT images: (a) complete CT images and (b) segmented CT images.
Characteristics of the complete CT dataset used to distinguish COVID-19 from other CP and normal controls.
| Complete CT | COVID-19 patients | COVID-19 scans | COVID-19 slices | CP patients | CP scans | CP slices | Normal control patients | Normal control scans | Normal control slices |
|---|---|---|---|---|---|---|---|---|---|
| Train | 149 | 270 | 28,088 | 112 | 146 | 27,252 | 140 | 308 | 27,456 |
| Valid | 53 | 92 | 9364 | 45 | 52 | 9108 | 54 | 85 | 9152 |
| Test | 44 | 87 | 9236 | 44 | 44 | 8520 | 50 | 102 | 9080 |
| Total | 246 | 449 | 46,688 | 201 | 242 | 44,880 | 244 | 495 | 45,688 |
Characteristics of the segmented CT dataset used to distinguish COVID-19 from other CP and normal controls.
| Segmented CT | COVID-19 patients | COVID-19 scans | COVID-19 slices | CP patients | CP scans | CP slices | Normal control patients | Normal control scans | Normal control slices |
|---|---|---|---|---|---|---|---|---|---|
| Train | 79 | 98 | 11,648 | 2 | 3 | 222 | 175 | 175 | 14,144 |
| Valid | 26 | 26 | 3904 | 1 | 1 | 77 | 56 | 56 | 4736 |
| Test | 25 | 32 | 3456 | 1 | 1 | 66 | 56 | 56 | 4608 |
| Total | 130 | 156 | 19,008 | 4 | 5 | 365 | 287 | 287 | 23,488 |
Network architectures. Each convolutional layer is followed by batch normalization and a ReLU activation function. Downsampling is performed in the first convolutional layer of each block with a stride of 2. F is the number of feature channels, corresponding to Fig. 2, and N is the number of blocks in each layer.
| Layer name | 18-layer F | 18-layer N | 34-layer F | 34-layer N | 50-layer F | 50-layer N |
|---|---|---|---|---|---|---|
| Conv1 | | | | | | |
| Conv2 | 64 | 2 | 64 | 3 | 64 | 3 |
| Conv3 | 128 | 2 | 128 | 4 | 128 | 4 |
| Conv4 | 256 | 2 | 256 | 6 | 256 | 6 |
| Conv5 | 512 | 2 | 512 | 3 | 512 | 3 |
| Global average pool, fully connected, softmax layer | | | | | | |
| Block | Basic | | Basic | | Bottleneck | |
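The relation between the N columns, the block type, and the nominal depth of each network can be checked with a short calculation. This is a minimal sketch that assumes the standard ResNet convention (two convolutions per basic block, three per bottleneck block, plus Conv1 and the fully connected layer); the helper name `weighted_layer_count` is hypothetical and not taken from the paper.

```python
# Convolutions per residual block: standard ResNet convention (an assumption here).
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}

# Blocks per stage (Conv2 to Conv5), read from the N columns of the table.
ARCHITECTURES = {
    "18-layer": ("basic", [2, 2, 2, 2]),
    "34-layer": ("basic", [3, 4, 6, 3]),
    "50-layer": ("bottleneck", [3, 4, 6, 3]),
}


def weighted_layer_count(block_type, blocks_per_stage):
    """Count Conv1, every convolution inside the residual blocks, and the final FC layer."""
    return 1 + CONVS_PER_BLOCK[block_type] * sum(blocks_per_stage) + 1


for name, (block_type, blocks) in ARCHITECTURES.items():
    print(name, weighted_layer_count(block_type, blocks))
```

Under these assumptions the counts come out to 18, 34, and 50, matching the column names.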
Fig. 2 Blocks of architectures. The labels give the kernel size and the number of feature channels (F).
Fig. 3 3D versus (2+1)D convolution. (a) Full 3D convolution and (b) (2+1)D convolution.
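To make the difference between the two convolution types concrete, the sketch below compares their weight counts. It assumes the usual (2+1)D factorization, in which a t x d x d kernel is replaced by a 1 x d x d convolution followed by a t x 1 x 1 convolution, and it assumes the intermediate channel width is chosen to roughly match the 3D parameter count, as in the original (2+1)D formulation. Whether this paper used that rule is not stated in the text; all function names are hypothetical.

```python
def conv3d_params(c_in, c_out, t=3, d=3):
    """Weights in a full t x d x d 3D convolution (bias ignored)."""
    return t * d * d * c_in * c_out


def mid_channels(c_in, c_out, t=3, d=3):
    """Intermediate width that roughly matches the 3D weight count (assumed rule)."""
    return (t * d * d * c_in * c_out) // (d * d * c_in + t * c_out)


def conv2plus1d_params(c_in, c_out, t=3, d=3, c_mid=None):
    """Weights in a 1 x d x d convolution followed by a t x 1 x 1 convolution."""
    if c_mid is None:
        c_mid = mid_channels(c_in, c_out, t, d)
    in_plane = d * d * c_in * c_mid    # 1 x d x d convolution within each slice
    across = t * c_mid * c_out         # t x 1 x 1 convolution across slices
    return in_plane + across


print(conv3d_params(64, 64), mid_channels(64, 64), conv2plus1d_params(64, 64))
```

With 64 input and 64 output channels the two variants have the same number of weights, so the factorization changes how the kernel is structured (and adds a nonlinearity between the two steps) rather than the model size.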
Accuracy, F1-score, precision, and recall for complete and segmented CT images. Accuracy is for three-way classification. F1-score, precision, and recall are for binary classification of COVID-19 versus the two other classes. The results are reported as the mean value (lower and upper bounds of the 95% confidence interval) generated by bootstrap.
| Depth and batch size | Slices | Accuracy (95% CI) | F1-score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|---|
| 64 × 8 (complete) | 137,256 | 0.9297 (0.9269, 0.9324) | 0.9237 (0.9201, 0.9273) | 0.9409 (0.9366, 0.9453) | 0.9075 (0.9022, 0.9129) |
| 64 × 8 (complete, same number) | 42,861 | 0.9822 (0.9568, 1.0000) | 0.9819 (0.9558, 1.0000) | 0.9815 (0.9454, 1.0000) | 0.9826 (0.9461, 1.0000) |
| 64 × 8 (segmented) | 42,861 | 0.9255 (0.8750, 0.9761) | 0.9214 (0.8660, 0.9767) | 0.9594 (0.9036, 1.0000) | 0.8875 (0.8017, 0.9733) |
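The caption distinguishes three-way accuracy from binary (COVID-19 versus the rest) F1-score, precision, and recall. A minimal sketch of how both can be derived from a single three-class confusion matrix is given below. The counts are hypothetical and are not taken from the paper, and the function names are illustrative.

```python
def accuracy(cm):
    """Three-way accuracy: correct predictions over all predictions."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def one_vs_rest(cm, positive):
    """Precision, recall, and F1 for class `positive` against all other classes."""
    tp = cm[positive][positive]
    fn = sum(cm[positive]) - tp                  # positives predicted as another class
    fp = sum(row[positive] for row in cm) - tp   # other classes predicted as positive
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical counts. Rows are true classes, columns are predicted classes,
# in the order COVID-19, CP, normal controls.
cm = [
    [90, 8, 2],
    [5, 85, 10],
    [5, 7, 88],
]
print(accuracy(cm))
print(one_vs_rest(cm, positive=0))
```

Because the binary metrics only look at the COVID-19 row and column, confusions between CP and normal controls lower the accuracy but leave precision and recall unchanged.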
Effect of depths on different metrics. Accuracy is for three-way classification. F1-score, precision, and recall are for binary classification of COVID-19 versus the two other classes. The results are reported as the mean value (lower and upper bounds of the 95% confidence interval) generated by bootstrap. Bold font indicates the best result. Fisher's exact test was used to assess whether the difference in results between the first group and each of the others is significant. The p-values indicate statistical significance as assessed by two-sided Fisher's exact tests; "*", "**", and "***" denote successively stricter significance thresholds.
| Depth | Batch size | Accuracy (95% CI) | F1-score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|---|
| 2 | 8 | 0.8481 (0.8421, 0.8542) | 0.9587 (0.9548, 0.9627) | 0.9387 (0.9323, 0.9452) | |
| 4 | 8 | 0.9547*** (0.9499, 0.9597) | 0.9663** (0.9610, 0.9715) | 0.9405*** (0.9310, 0.9500) | |
| 8 | 8 | 0.9928*** (0.9877, 0.9978) | 0.9477*** (0.9351, 0.9604) | ||
| 16 | 8 | 0.9677*** (0.9556, 0.9798) | 0.9576 (0.9402, 0.9751) | 0.9647** (0.9431, 0.9863) | 0.9508*** (0.9245, 0.9771) |
| 32 | 8 | 0.9640*** (0.9550, 0.9730) | 0.9504 (0.9373, 0.9634) | 0.9665*** (0.9515, 0.9816) | 0.9348*** (0.9140, 0.9556) |
| 64 | 8 | 0.9298*** (0.9057, 0.9540) | 0.9241** (0.8916, 0.9567) | 0.9412 (0.9010, 0.9814) | 0.9082*** (0.8596, 0.9568) |
Fig. 4 The effect of depths on different metrics.
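The depth studied above is the number of CT slices superimposed into one input volume. The text does not say how scans are cut into volumes, so the following is only a sketch under one simple assumption: consecutive slices are grouped into non-overlapping volumes, and an incomplete tail is dropped. The function name `stack_into_volumes` is hypothetical.

```python
def stack_into_volumes(slices, depth):
    """Group consecutive slices into non-overlapping volumes of `depth` slices.

    Assumption: slices left over at the end of a scan are dropped.
    """
    n_full = len(slices) // depth
    return [slices[i * depth:(i + 1) * depth] for i in range(n_full)]


scan = list(range(10))              # stand-in for 10 ordered slices of one scan
volumes = stack_into_volumes(scan, depth=4)
print(len(volumes), volumes)
```

A smaller depth yields more training samples per scan but gives each sample less through-plane context, which is the trade-off the depth table explores.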
Effect of batch sizes on different metrics. Accuracy is for three-way classification. F1-score, precision, and recall are for binary classification of COVID-19 versus the two other classes. The results are reported as the mean value (lower and upper bounds of the 95% confidence interval) generated by bootstrap. Bold font indicates the best result. Fisher's exact test was used to assess whether the difference in results between the first group and each of the others is significant. The p-values indicate statistical significance as assessed by two-sided Fisher's exact tests; "*", "**", and "***" denote successively stricter significance thresholds.
| Depth | Batch size | Accuracy (95% CI) | F1-score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|---|
| 2 | 32 | 0.9914 (0.9897, 0.9930) | 0.9950 (0.9929, 0.9972) | 0.9991 (0.9983, 1.0000) | |
| 2 | 64 | 0.9924 (0.9910, 0.9939) | 0.9952** (0.9939, 0.9966) | 0.9914*** (0.9888, 0.9939) | 0.9991 (0.9983, 0.9999) |
| 4 | 32 | 0.9924 (0.9902, 0.9945) | 0.9965 (0.9948, 0.9982) | 0.9935 (0.9903, 0.9968) | 0.9996 (0.9987, 1.0000) |
| 4 | 64 | 0.9832*** (0.9802, 0.9862) | 0.9924*** (0.9899, 0.9949) | 0.9904*** (0.9865, 0.9944) | 0.9943*** (0.9913, 0.9974) |
| 8 | 32 | 0.9815*** (0.9751, 0.9880) | 0.9878*** (0.9815, 0.9941) | 0.9913 (0.9839, 0.9986) | 0.9843*** (0.9741, 0.9946) |
| 16 | 32 | 0.9790*** (0.9743, 0.9838) | 0.9782*** (0.9722, 0.9843) | 0.9608*** (0.9497, 0.9718) |
Fig. 5 The effect of batch sizes on different metrics.
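The tables report each metric as a bootstrap mean with a 95% confidence interval. The exact bootstrap procedure is not described in the text, so the sketch below uses a plain percentile bootstrap over per-sample correctness flags as an assumption. The function name, the number of resamples, and the toy data are all hypothetical.

```python
import random


def bootstrap_ci(correct, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap for the mean of 0/1 correctness flags.

    Returns (mean of resampled accuracies, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(correct)
    means = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and recompute the accuracy.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(means) / n_resamples, lower, upper


# Toy data: 90 correct and 10 incorrect predictions.
flags = [1] * 90 + [0] * 10
mean_acc, lo, hi = bootstrap_ci(flags)
print(mean_acc, lo, hi)
```

The same routine applies to precision, recall, or F1-score if the metric is recomputed on each resample instead of taking a simple mean.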
Comparison of classification results using different models. Accuracy is for three-way classification. F1-score, precision, and recall are for binary classification of COVID-19 versus the two other classes. The results are reported as the mean value (lower and upper bounds of the 95% confidence interval) generated by bootstrap. Bold font indicates the best model group. Fisher's exact test was used to assess whether the difference in results between the first group and each of the others is significant. The p-values indicate statistical significance as assessed by two-sided Fisher's exact tests; "*", "**", and "***" denote successively stricter significance thresholds.
| Model | Accuracy (95% CI) | F1-score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|
| ResNet-18 | 0.9924 (0.9902, 0.9945) | 0.9965 (0.9948, 0.9982) | 0.9935 (0.9903, 0.9968) | 0.9996 (0.9987, 1.0000) |
| ResNet-18(2+1)D | 0.9885* (0.9860, 0.9910) | 0.9957 (0.9938, 0.9976) | 0.9935 (0.9903, 0.9968) | 0.9978** (0.9959, 0.9997) |
| ResNet-34 | 0.9800*** (0.9767, 0.9834) | 0.9816*** (0.9776, 0.9856) | 0.9920 (0.9884, 0.9956) | 0.9714*** (0.9645, 0.9783) |
| ResNet-34(2+1)D | 0.9719*** (0.9679, 0.9758) | 0.9769*** (0.9725, 0.9814) | 0.9911 (0.9871, 0.9951) | 0.9632*** (0.9555, 0.9709) |
| ResNet-50 | 0.9801*** (0.9768, 0.9834) | 0.9831*** (0.9793, 0.9870) | 0.9942 (0.9912, 0.9973) | 0.9723*** (0.9655, 0.9791) |
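The asterisks in the table come from two-sided Fisher's exact tests against the first group. A minimal, dependency-free sketch of such a test on a 2 x 2 table is shown below. It assumes the table holds counts of correct and incorrect predictions for two groups, which is one plausible reading and is not stated in the text; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Toy example: group A has 3 correct and 1 incorrect, group B has 1 correct and 3 incorrect.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

For test sets with thousands of volumes, a library routine such as `scipy.stats.fisher_exact` is the more practical choice; the version above is only meant to show what the test computes.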
Fig. 6 Normalized confusion matrix of depth 4 and batch size 32. The model used is 3D ResNet-18.
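A normalized confusion matrix rescales raw counts so that classes of different sizes can be compared. The caption does not say which normalization was used, so the sketch below assumes the common choice of dividing each row by the number of samples in that true class. The function name and the counts are hypothetical.

```python
def normalize_rows(cm):
    """Divide each row by its total, giving the fraction of each true class per prediction."""
    normalized = []
    for row in cm:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Hypothetical counts. Rows are true classes, columns are predicted classes.
counts = [
    [90, 8, 2],
    [5, 85, 10],
    [1, 3, 196],
]
for row in normalize_rows(counts):
    print([round(value, 3) for value in row])
```

With row normalization the diagonal entries are the per-class recalls, which makes them directly comparable with the recall column of the tables.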
Fig. 7 Saliency maps of the Smooth Grad-CAM++ algorithm. The first column, (a), (d), and (g), contains slices with NCP (novel coronavirus pneumonia); the second column, (b), (e), and (h), contains slices with CP; and the third column, (c), (f), and (i), contains slices from the normal control group.
Fig. 8 Training and validation losses in the training stage.
Fig. 9 Accuracy curve of training and validation sets in the training stage.