Qianqian Ni1, Zhi Yuan Sun1, Li Qi1, Wen Chen2, Yi Yang3, Li Wang1, Xinyuan Zhang1, Liu Yang1, Yi Fang1, Zijian Xing4, Zhen Zhou5, Yizhou Yu6, Guang Ming Lu1, Long Jiang Zhang7,8.
Abstract
OBJECTIVES: To use a deep learning model to automatically detect abnormalities in chest CT images from COVID-19 patients and to compare its quantitative determination performance with that of radiology residents.
Keywords: COVID-19; Deep learning; Diagnosis; Multidetector computed tomography; Pneumonia
Year: 2020 PMID: 32617690 PMCID: PMC7331494 DOI: 10.1007/s00330-020-07044-9
Source DB: PubMed Journal: Eur Radiol ISSN: 0938-7994 Impact factor: 7.034
Fig. 1 Flow diagram showing an overview of the deep learning algorithm and participant selection
Overview of CT imaging features in 96 patients
| Variables | All patients, n (%) |
|---|---|
| Positive CT findings | 88 (91.7%) |
| Number of lesions | |
| Solitary | 14 (14.6%) |
| Multiple | 74 (77.1%) |
| Number of lobes affected | |
| ≥ 2 lobes affected | 66 (68.8%) |
| ≥ 3 lobes affected | 54 (56.3%) |
| ≥ 4 lobes affected | 43 (44.8%) |
| Locations | |
| Right lung | 75 (78.1%) |
| Right upper lobe | 55 (57.3%) |
| Right middle lobe | 50 (52.1%) |
| Right lower lobe | 57 (59.4%) |
| Left lung | 73 (76.0%) |
| Left upper lobe | 53 (55.2%) |
| Left lower lobe | 65 (67.7%) |
| Both lungs | 75 (78.1%) |
| CT severity score | |
| Median score (quartiles) | 0.25 (0.0875–0.60) |
| Mild ≤ 20% | 30 (31.3%) |
| Moderate 20–50% | 12 (12.5%) |
| Severe > 50% | 54 (56.3%) |
| CT imaging features | |
| GGO | 10 (10.4%) |
| Consolidation | 3 (3.1%) |
| GGO + consolidation | 75 (78.1%) |
| Rounded morphology | 69 (71.9%) |
| Other morphology | 67 (69.8%) |
| Crazy paving pattern | 32 (33.3%) |
| Interstitial changes | 50 (52.1%) |
| Subpleural distribution | 82 (85.4%) |
| Diffuse distribution | 5 (5.2%) |
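The severity bands in the table can be expressed as a small lookup. The sketch below is illustrative only: it assumes the CT severity score is a fraction of lung involvement between 0 and 1 (the table reports a median of 0.25 next to percentage thresholds) and that the boundary values 20% and 50% fall into the lower band. The function name `severity_grade` is hypothetical and not taken from the study.

```python
def severity_grade(score: float) -> str:
    """Map a CT severity score (assumed fraction in [0, 1]) to a band.

    Bands follow the table: mild <= 20%, moderate 20-50%, severe > 50%.
    Placing exactly 20% in "mild" and exactly 50% in "moderate" is an
    assumption; the table does not state how ties are handled.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a fraction between 0 and 1")
    if score <= 0.20:
        return "mild"
    if score <= 0.50:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    # Median and quartiles reported in the table.
    for s in (0.0875, 0.25, 0.60):
        print(s, severity_grade(s))
```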
Performance of deep learning model versus radiology residents
| Variables | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 score | p value (sensitivity) | p value (specificity) |
|---|---|---|---|---|---|---|---|---|
| Patient level | ||||||||
| Model | 0.94 (0.87, 0.98) | 1.00 (0.96, 1.00) | 0.25 (0.03, 0.65) | 0.94 (0.91, 0.96) | 1.00 (0.63, 1.00) | 0.97 | ||
| Resident 1 | 0.95 (0.88, 0.98) | 0.94 (0.87, 0.98) | 1.00 (0.63, 1.00) | 1.00 (0.95, 1.00) | 0.62 (0.41, 0.79) | 0.97 | 0.0233 | 0.0019 |
| Resident 2 | 0.92 (0.84, 0.96) | 0.93 (0.86, 0.97) | 0.75 (0.35, 0.97) | 0.98 (0.93, 0.99) | 0.50 (0.30, 0.70) | 0.95 | 0.0126 | 0.0455 |
| Resident 3 | 0.90 (0.82, 0.95) | 0.89 (0.80, 0.94) | 1.00 (0.63, 1.00) | 1.00 (0.94, 1.00) | 0.44 (0.31, 0.60) | 0.94 | 0.0011 | 0.0019 |
| Lung lobe level | ||||||||
| Model | 0.82 (0.79, 0.86) | 0.96 (0.94, 0.98) | 0.63 (0.55, 0.69) | 0.78 (0.75, 0.81) | 0.93 (0.87, 0.96) | 0.86 | ||
| Resident 1 | 0.88 (0.85, 0.91) | 0.87 (0.82, 0.91) | 0.90 (0.84, 0.93) | 0.92 (0.89, 0.95) | 0.83 (0.78, 0.87) | 0.89 | < 0.0001 | < 0.0001 |
| Resident 2 | 0.88 (0.84, 0.90) | 0.84 (0.79, 0.88) | 0.93 (0.88, 0.96) | 0.94 (0.91, 0.96) | 0.80 (0.76, 0.84) | 0.89 | < 0.0001 | < 0.0001 |
| Resident 3 | 0.88 (0.85, 0.91) | 0.82 (0.77, 0.86) | 0.96 (0.92, 0.98) | 0.97 (0.94, 0.98) | 0.79 (0.75, 0.83) | 0.89 | < 0.0001 | < 0.0001 |
Data in parentheses are 95% confidence intervals. The p values for sensitivity and specificity compare the algorithm with each resident. PPV, positive predictive value; NPV, negative predictive value
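As a rough check on how the columns relate to one another, the sketch below derives each metric from a 2 × 2 confusion matrix. The counts are not reported in this record; they are inferred here from the patient-level rows, assuming 88 CT-positive and 8 CT-negative patients out of 96, and should be read as an illustration rather than the study's raw data. The names `Confusion` and `diagnostic_metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def ratio(num: int, den: int) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def diagnostic_metrics(c: Confusion) -> dict:
    """Compute the metrics used as column headings in the table."""
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Inferred (assumed) patient-level counts, not taken from the source.
MODEL = Confusion(tp=88, fp=6, tn=2, fn=0)
RESIDENT_1 = Confusion(tp=83, fp=0, tn=8, fn=5)

if __name__ == "__main__":
    for name, c in (("Model", MODEL), ("Resident 1", RESIDENT_1)):
        print(name, {k: f"{v:.2f}" for k, v in diagnostic_metrics(c).items()})
```

Under these assumed counts the computed values agree with the Model and Resident 1 patient-level rows to two decimals, which illustrates the trade-off in the table: the model misses no positive patient but flags 6 of the 8 negative ones.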
Fig. 2 Representative cases. Panels a and b. Chest CT images of a 53-year-old female diagnosed with COVID-19 pneumonia. a Axial unenhanced chest CT image and (b) corresponding output of the deep learning algorithm show multifocal subpleurally distributed GGOs with consolidation in the upper lobe of the left lung (arrow). Panels c and d. Chest CT images of a 56-year-old female diagnosed with COVID-19 pneumonia. c Axial unenhanced chest CT image and (d) corresponding output of the deep learning algorithm show multifocal subpleurally distributed GGOs with crazy paving sign in both lungs (arrows)
Performance of deep learning model versus radiology residents based on anatomical structure
| Variables | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 score | p value (sensitivity) | p value (specificity) |
|---|---|---|---|---|---|---|---|---|
| Right upper lobe | ||||||||
| Model | 0.83 (0.74, 0.90) | 0.96 (0.87, 0.99) | 0.66 (0.49, 0.79) | 0.79 (0.71, 0.85) | 0.93 (0.77, 0.98) | 0.87 | ||
| Resident 1 | 0.91 (0.83, 0.96) | 0.91 (0.80, 0.97) | 0.90 (0.77, 0.97) | 0.93 (0.83, 0.97) | 0.88 (0.76, 0.95) | 0.92 | 0.2413 | 0.0076 |
| Resident 2 | 0.93 (0.86, 0.97) | 0.93 (0.82, 0.98) | 0.93 (0.80, 0.98) | 0.94 (0.85, 0.98) | 0.90 (0.79, 0.96) | 0.94 | 0.4011 | 0.0027 |
| Resident 3 | 0.93 (0.86, 0.97) | 0.91 (0.80, 0.97) | 0.95 (0.83, 0.99) | 0.96 (0.87, 0.99) | 0.89 (0.77, 0.95) | 0.93 | 0.2413 | 0.0008 |
| Right middle lobe | ||||||||
| Model | 0.81 (0.72, 0.88) | 0.94 (0.83, 0.99) | 0.67 (0.52, 0.80) | 0.76 (0.67, 0.83) | 0.91 (0.77, 0.97) | 0.84 | ||
| Resident 1 | 0.85 (0.77, 0.92) | 0.82 (0.69, 0.91) | 0.89 (0.76, 0.96) | 0.89 (0.78, 0.95) | 0.82 (0.71, 0.89) | 0.85 | 0.0648 | 0.0115 |
| Resident 2 | 0.86 (0.78, 0.93) | 0.78 (0.64, 0.88) | 0.96 (0.85, 0.99) | 0.95 (0.83, 0.99) | 0.80 (0.70, 0.87) | 0.86 | 0.0211 | 0.0005 |
| Resident 3 | 0.84 (0.76, 0.91) | 0.76 (0.62, 0.87) | 0.93 (0.82, 0.99) | 0.93 (0.81, 0.97) | 0.78 (0.69, 0.86) | 0.84 | 0.0117 | 0.0016 |
| Right lower lobe | ||||||||
| Model | 0.77 (0.67, 0.85) | 0.98 (0.91, 1.00) | 0.46 (0.30, 0.63) | 0.73 (0.67, 0.78) | 0.95 (0.71, 0.99) | 0.84 | ||
| Resident 1 | 0.86 (0.78, 0.93) | 0.86 (0.74, 0.94) | 0.87 (0.73, 0.96) | 0.91 (0.81, 0.96) | 0.81 (0.69, 0.89) | 0.88 | 0.0151 | 0.0001 |
| Resident 2 | 0.86 (0.78, 0.93) | 0.84 (0.72, 0.93) | 0.90 (0.76, 0.97) | 0.92 (0.82, 0.97) | 0.80 (0.68, 0.88) | 0.88 | 0.0081 | < 0.0001 |
| Resident 3 | 0.86 (0.78, 0.93) | 0.79 (0.66, 0.89) | 0.97 (0.87, 1.00) | 0.98 (0.87, 1.00) | 0.76 (0.66, 0.84) | 0.87 | 0.0012 | < 0.0001 |
| Left upper lobe | ||||||||
| Model | 0.84 (0.76, 0.91) | 0.96 (0.87, 1.00) | 0.70 (0.54, 0.83) | 0.80 (0.71, 0.86) | 0.94 (0.79, 0.98) | 0.87 | ||
| Resident 1 | 0.89 (0.80, 0.94) | 0.89 (0.77, 0.96) | 0.88 (0.75, 0.96) | 0.90 (0.80, 0.96) | 0.86 (0.75, 0.93) | 0.90 | 0.1413 | 0.0340 |
| Resident 2 | 0.83 (0.74, 0.90) | 0.77 (0.64, 0.88) | 0.91 (0.78, 0.97) | 0.91 (0.80, 0.96) | 0.76 (0.66, 0.84) | 0.84 | 0.0041 | 0.0148 |
| Resident 3 | 0.88 (0.79, 0.93) | 0.79 (0.69, 0.87) | 0.98 (0.86, 1.00) | 0.97 (0.86, 1.00) | 0.79 (0.69, 0.87) | 0.88 | 0.0077 | 0.0005 |
| Left lower lobe | ||||||||
| Model | 0.85 (0.77, 0.92) | 0.97 (0.89, 1.00) | 0.61 (0.42, 0.78) | 0.84 (0.77, 0.89) | 0.90 (0.70, 0.97) | 0.90 | ||
| Resident 1 | 0.89 (0.80, 0.94) | 0.86 (0.75, 0.93) | 0.94 (0.79, 0.99) | 0.97 (0.88, 0.99) | 0.76 (0.64, 0.86) | 0.91 | 0.0274 | 0.0023 |
| Resident 2 | 0.89 (0.80, 0.94) | 0.86 (0.75, 0.93) | 0.94 (0.79, 0.99) | 0.97 (0.88, 0.99) | 0.76 (0.64, 0.86) | 0.91 | 0.0274 | 0.0023 |
| Resident 3 | 0.89 (0.80, 0.94) | 0.85 (0.74, 0.92) | 0.97 (0.83, 1.00) | 0.98 (0.89, 1.00) | 0.75 (0.63, 0.84) | 0.91 | 0.0154 | 0.0006 |
Data in parentheses are 95% confidence intervals. The p values for sensitivity and specificity compare the algorithm with each resident. PPV, positive predictive value; NPV, negative predictive value
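The record does not say how the 95% confidence intervals were computed. One common choice for proportions is the exact (Clopper-Pearson) binomial interval, sketched below in pure Python with a bisection search. This is an assumption about the method, not a statement of what the authors did; the function names are hypothetical. For the patient-level table, and assuming 8 CT-negative and 88 CT-positive patients, the exact interval gives about (0.63, 1.00) for 8 of 8 and a lower bound of about 0.96 for 88 of 88, which is consistent with the values reported there. Whether the lobe-level intervals above follow the same method is not verified here.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(g, iters: int = 60) -> float:
    """Bisection root of a function that increases from <0 to >0 on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided (1 - alpha) interval for x successes in n trials."""
    if not 0 <= x <= n or n <= 0:
        raise ValueError("require n > 0 and 0 <= x <= n")
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


if __name__ == "__main__":
    for x, n in ((8, 8), (2, 8), (88, 88)):
        lo, hi = clopper_pearson(x, n)
        print(f"{x}/{n}: ({lo:.2f}, {hi:.2f})")
```

With small denominators such as 8 negatives, the interval is very wide, so differences in patient-level specificity should be interpreted cautiously.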
Fig. 3 Confusion matrix comparing CT severity grading performance between the deep learning model and radiology residents
Running time comparison (in seconds)
| Variables | Mean | STD | Max | Min | p value |
|---|---|---|---|---|---|
| Deep learning model | 20.3 | 5.8 | 38.9 | 10.5 | |
| Senior Radiologist | 82.7 | 17.5 | 108 | 52 | < 0.0001 |
| Resident 1 | 101.1 | 53.3 | 218 | 23 | < 0.0001 |
| Resident 2 | 68.3 | 18.5 | 100 | 32 | < 0.0001 |
| Resident 3 | 112.4 | 44.7 | 186 | 34 | < 0.0001 |
STD standard deviation
Fig. 4 Receiver operating characteristic (ROC) diagram for the AI system versus radiologists. The blue curve was created by taking different thresholds over the predicted probability and shows the macro-average AUC of the AI system. The asterisk shows the performance of the model in a balanced setting. The filled markers show the residents' performance. Dashed lines connect the performance of radiologists with and without the assistance of the AI system
Performance of residents with assistance of deep learning model
| Variables | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 score | p value (sensitivity) | p value (specificity) |
|---|---|---|---|---|---|---|---|---|
| Patient level | ||||||||
| Resident 1 | 0.95 (0.88, 0.98) | 0.94 (0.87, 0.98) | 1.00 (0.63, 1.00) | 1.00 (0.95, 1.00) | 0.62 (0.41, 0.79) | 0.97 | ||
| Resident 1 + AI | 0.98 (0.92, 1.00) | 0.98 (0.92, 1.00) | 1.00 (0.63, 1.00) | 1.00 (0.95, 1.00) | 0.80 (0.48, 0.95) | 0.99 | 0.440 | 1.00 |
| Resident 2 | 0.92 (0.84, 0.96) | 0.93 (0.86, 0.97) | 0.75 (0.35, 0.97) | 0.98 (0.93, 0.99) | 0.50 (0.30, 0.70) | 0.95 | ||
| Resident 2 + AI | 0.96 (0.89, 0.99) | 0.97 (0.90, 0.99) | 0.88 (0.51, 1.00) | 0.99 (0.93, 1.00) | 0.70 (0.39, 0.90) | 0.98 | 0.494 | 0.519 |
| Resident 3 | 0.90 (0.82, 0.95) | 0.89 (0.80, 0.94) | 1.00 (0.63, 1.00) | 1.00 (0.94, 1.00) | 0.44 (0.31, 0.60) | 0.94 | ||
| Resident 3 + AI | 0.97 (0.91, 0.99) | 0.97 (0.90, 0.99) | 1.00 (0.63, 1.00) | 1.00 (0.95, 1.00) | 0.73 (0.43, 0.91) | 0.98 | 0.044 | 1.00 |
| Lung lobe level | ||||||||
| Resident 1 | 0.88 (0.85, 0.91) | 0.87 (0.82, 0.91) | 0.90 (0.84, 0.93) | 0.92 (0.89, 0.95) | 0.83 (0.78, 0.87) | 0.89 | ||
| Resident 1 + AI | 0.91 (0.89, 0.94) | 0.91 (0.86, 0.94) | 0.93 (0.88, 0.96) | 0.94 (0.91, 0.97) | 0.88 (0.83, 0.92) | 0.92 | 0.203 | 0.337 |
| Resident 2 | 0.88 (0.84, 0.90) | 0.84 (0.79, 0.88) | 0.93 (0.88, 0.96) | 0.94 (0.91, 0.96) | 0.80 (0.76, 0.84) | 0.89 | ||
| Resident 2 + AI | 0.94 (0.91, 0.96) | 0.94 (0.90, 0.96) | 0.94 (0.89, 0.96) | 0.95 (0.92, 0.97) | 0.91 (0.87, 0.95) | 0.94 | < 0.001 | 0.791 |
| Resident 3 | 0.88 (0.85, 0.91) | 0.82 (0.77, 0.86) | 0.96 (0.92, 0.98) | 0.97 (0.94, 0.98) | 0.79 (0.75, 0.83) | 0.89 | ||
| Resident 3 + AI | 0.91 (0.88, 0.93) | 0.88 (0.84, 0.91) | 0.95 (0.91, 0.97) | 0.96 (0.93, 0.98) | 0.86 (0.80, 0.90) | 0.92 | 0.053 | 0.668 |