Shuo Wang1,2, Yunfei Zha3,2, Weimin Li4,2, Qingxia Wu5,2, Xiaohu Li6,2, Meng Niu7,2, Meiyun Wang8,2, Xiaoming Qiu9,2, Hongjun Li10,2, He Yu4, Wei Gong3, Yan Bai8, Li Li10, Yongbei Zhu1, Liusu Wang1, Jie Tian11,12,13.
Abstract
Coronavirus disease 2019 (COVID-19) has spread globally, and medical resources have become insufficient in many regions. Fast diagnosis of COVID-19 and identification of high-risk patients with worse prognosis are important for early prevention and for optimising medical resources. Here, we propose a fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis using routinely used computed tomography.

We retrospectively collected 5372 patients with computed tomography images from seven cities or provinces. First, 4106 patients with computed tomography images were used to pre-train the deep learning system so that it learned lung features. Following this, 1266 patients (924 with COVID-19, of whom 471 had follow-up for >5 days, and 342 with other pneumonia) from six cities or provinces were enrolled to train and externally validate the deep learning system.

In the four external validation sets, the deep learning system achieved good performance in distinguishing COVID-19 from other pneumonia (AUC 0.87 and 0.88 in external validation sets 1 and 2, respectively) and from viral pneumonia (AUC 0.86). Moreover, the deep learning system succeeded in stratifying patients into high- and low-risk groups whose hospital-stay times differed significantly (p=0.013 and p=0.014 in the two external validation sets used for prognostic evaluation). Without human assistance, the deep learning system automatically focused on abnormal areas whose characteristics were consistent with reported radiological findings.

Deep learning provides a convenient tool for fast screening of COVID-19 and for identifying potential high-risk patients, which may be helpful for medical resource optimisation and early prevention before patients show severe symptoms.
Year: 2020 PMID: 32444412 PMCID: PMC7243395 DOI: 10.1183/13993003.00775-2020
Source DB: PubMed Journal: Eur Respir J ISSN: 0903-1936 Impact factor: 16.671
FIGURE 1 Datasets used in this study. A total of 5372 patients with computed tomography (CT) images from seven cities or provinces were enrolled in this study. The auxiliary training set included 4106 patients with lung cancer and epidermal growth factor receptor (EGFR) gene mutation status information, and was used to pre-train the COVID-19Net to learn lung features from CT images. The training set included 709 patients from Wuhan city and Henan province. External validation set 1 (226 patients) from Anhui province and external validation set 2 (161 patients) from Heilongjiang province were used to assess the diagnostic performance of the deep learning (DL) system. External validation set 3 (53 patients with COVID-19) from Beijing and external validation set 4 (117 patients with COVID-19) from Huangshi city were used to evaluate the prognostic performance of the DL system.
FIGURE 2 Illustration of the proposed deep learning (DL) system. From the chest computed tomography (CT) scan of a patient, the DL system directly predicts the probability that the patient has COVID-19 and the patient's prognosis, without any human annotation. The DL system includes three parts: automatic lung segmentation (DenseNet121-FPN), non-lung area suppression, and COVID-19 diagnostic and prognostic analysis (COVID-19Net). To let the COVID-19Net learn lung features from a large dataset, we used an auxiliary training process for pre-training, which trained the DL network to predict epidermal growth factor receptor (EGFR) gene mutation status using CT images of 4106 patients. The dense connection in this figure means that each convolutional layer is connected to all of the preceding convolutional layers inside the same dense block.
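The second stage of the pipeline, non-lung area suppression, can be illustrated with a minimal sketch. It assumes the segmentation stage has already produced a binary lung mask with the same shape as the CT volume. The function name `suppress_non_lung`, the `fill_value` parameter, and the cropping to the mask's bounding box are illustrative assumptions, not details given in the caption.

```python
import numpy as np


def suppress_non_lung(ct_volume, lung_mask, fill_value=0.0):
    """Overwrite voxels outside the lung mask and crop to the mask's bounding box.

    Hypothetical helper: the fill value and the cropping step are assumptions
    made for illustration, not the published implementation.
    """
    ct_volume = np.asarray(ct_volume, dtype=np.float32)
    lung_mask = np.asarray(lung_mask, dtype=bool)
    if ct_volume.shape != lung_mask.shape:
        raise ValueError("CT volume and lung mask must have the same shape")
    if not lung_mask.any():
        raise ValueError("lung mask is empty")

    # Keep intensities inside the mask, replace everything else.
    suppressed = np.where(lung_mask, ct_volume, fill_value)

    # Crop to the smallest box that contains every masked voxel.
    coords = np.argwhere(lung_mask)
    lower = coords.min(axis=0)
    upper = coords.max(axis=0) + 1
    box = tuple(slice(a, b) for a, b in zip(lower, upper))
    return suppressed[box]


# Toy 4x4x4 "volume" whose voxel value encodes its position (16*i + 4*j + k).
volume = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
mask = np.zeros((4, 4, 4), dtype=bool)
mask[1:3, 1:3, 1:4] = True
mask[1, 1, 1] = False  # a non-lung voxel that still lies inside the bounding box

roi = suppress_non_lung(volume, mask)
print(roi.shape)
```

The cropped region would then be the input to the classification network; any intensity normalisation or resampling is left out of this sketch.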
Clinical characteristics of patients
| | Training set | Validation set 1 | Validation set 2 | Validation set 3 | Validation set 4 |
| Patients n | 709 | 226 | 161 | 53 | 117 |
| Region | Wuhan city and Henan | Anhui | Heilongjiang | Beijing | Huangshi city |
| Type of pneumonia | | | | | |
| COVID-19 | 560 | 102 | 92 | 53 | 117 |
| Bacterial pneumonia | 127 | 119 | 25 | 0 | 0 |
| Mycoplasma pneumonia | 11 | 5 | 15 | 0 | 0 |
| Viral pneumonia | 0 | 0 | 29 | 0 | 0 |
| Fungal pneumonia | 11 | 0 | 0 | 0 | 0 |
| Sex | | | | | |
| Male | 337 | 131 | 108 | 25 | 60 |
| Female | 372 | 95 | 53 | 28 | 57 |
| Age years | 50.52±18.91 | 49.15±18.44 | 58.44±16.19 | 50.26±19.29 | 47.67±14.20 |
| Comorbidity | | | | | |
| Any | 204 | NA | NA | 16 | 27 |
| Diabetes | 45 | NA | NA | 2 | 12 |
| Hypertension | 120 | NA | NA | 10 | 12 |
| Cerebrovascular disease | 18 | NA | NA | 1 | 0 |
| Cardiovascular disease | 21 | NA | NA | 5 | 9 |
| Malignancy | 19 | NA | NA | 0 | 1 |
| COPD | 10 | NA | NA | 1 | 2 |
| Pulmonary tuberculosis | 6 | NA | NA | 1 | 0 |
| Chronic kidney disease | 10 | NA | NA | 0 | 2 |
| Chronic liver disease | 16 | NA | NA | 3 | 2 |
| Follow-up >5 days | 301 | NA | NA | 53 | 117 |
Data are presented as n or mean±sd. COVID-19: coronavirus disease 2019; COPD: chronic obstructive pulmonary disease; NA: not available.
Diagnostic performance of the deep learning system
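| | Training set | Validation set 1 | Validation set 2 | Validation set 2-viral |
| Patients n | 709 | 226 | 161 | 121 |
| AUC (95% CI) | 0.90 (0.89–0.91) | 0.87 (0.86–0.89) | 0.88 (0.86–0.90) | 0.86 (0.83–0.89) |
| Accuracy % | 81.24 | 78.32 | 80.12 | 85.00 |
| Sensitivity % | 78.93 | 80.39 | 79.35 | 79.35 |
| Specificity % | 89.93 | 76.61 | 81.16 | 71.43 |
| F1-score % | 86.92 | 77.00 | 82.02 | 90.11 |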
AUC: area under the receiver operating characteristic curve. Validation set 2-viral (121 patients): a stratified analysis using the patients with COVID-19 and viral pneumonia in validation set 2.
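The four percentage rows in the 226-patient column (78.32, 80.39, 76.61 and 77.00) are consistent with accuracy, sensitivity, specificity and F1-score computed from a single confusion matrix. The sketch below shows the standard formulas. The confusion-matrix counts are not reported in the text above; they are back-calculated here from the published percentages and the class sizes of validation set 1 (102 COVID-19, 124 other pneumonia) and should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),    # true-positive rate
        "specificity": tn / (tn + fp),    # true-negative rate
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Assumed counts (back-calculated, not reported in the source):
# 82 of 102 COVID-19 patients and 95 of 124 other-pneumonia patients
# classified correctly.
counts = {"tp": 82, "fn": 20, "tn": 95, "fp": 29}

metrics = diagnostic_metrics(**counts)
percent = {name: round(100 * value, 2) for name, value in metrics.items()}
print(percent)
# {'accuracy': 78.32, 'sensitivity': 80.39, 'specificity': 76.61, 'f1': 77.0}
```

The same check reproduces the training-set and validation set 2 columns with suitably chosen counts, which supports reading the rows in that order.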
FIGURE 3 Diagnostic performance of the deep learning (DL) system. a) Receiver operating characteristic curves of the DL system in the training set and the two independent external validation sets. Validation 2-viral is a stratified analysis using the patients with COVID-19 and viral pneumonia in validation set 2. b) Calibration curves of the DL system in the two external validation sets. c) Area under the curve and distribution of the training set and the two external validation sets.
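As a reminder of what the area under the receiver operating characteristic curve measures, here is a small pure-Python sketch. It uses the rank interpretation of the AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are toy values invented for illustration, not study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5  # ties count as half a correct ranking
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = COVID-19, 0 = other pneumonia (made-up scores).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)  # 8 of the 9 positive/negative pairs are ranked correctly
```

This pairwise form is quadratic in the number of patients, which is fine for cohorts of a few hundred; library implementations typically sort the scores instead.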
FIGURE 4 Suspicious lung areas discovered by deep learning (DL). a–p) Computed tomography (CT) images of eight patients with COVID-19. a–d and i–l) CT images of the patients (these CT images were processed by the DL system). e–h and m–p) Heat maps of the suspicious lung areas discovered by the DL system. In the heat maps, bright red areas are more important than dark blue areas.
FIGURE 5 Deep learning (DL) feature visualisation. a–d) Four three-dimensional (3D) convolutional filters from different convolutional layers. e) Distribution of patients in the 64-dimensional DL feature space. For display convenience, the 64-dimensional DL feature space is reduced to two dimensions by principal component analysis.
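The dimensionality reduction described for panel e can be sketched as follows. This is a minimal principal component analysis via singular value decomposition, applied to randomly generated 64-dimensional vectors that stand in for the DL features. The function name `pca_project` and the synthetic data are assumptions for illustration; the actual features and PCA implementation used in the study are not given here.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project row vectors onto their leading principal components."""
    features = np.asarray(features, dtype=np.float64)
    centered = features - features.mean(axis=0)

    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

    projected = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return projected, explained[:n_components]


# Synthetic stand-in for 100 patients with 64-dimensional DL features.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 64))

points_2d, explained = pca_project(features)
print(points_2d.shape)  # (100, 2)
```

Each row of `points_2d` gives the two plotting coordinates of one patient, and `explained` reports the share of total variance retained by each of the two axes.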