Alejandro Murillo-González1, David González2, Laura Jaramillo2, Carlos Galeano2, Fabby Tavera3, Marcia Mejía3, Alejandro Hernández4, David Restrepo Rivera1, J G Paniagua5, Leandro Ariza-Jiménez1, José Julián Garcés Echeverri1, Christian Andrés Diaz León1, Diana Lucia Serna-Higuita1, Wayner Barrios6, Wiston Arrázola1, Miguel Ángel Mejía1, Sebastián Arango1, Daniela Marín Ramírez1, Emmanuel Salinas-Miranda7, O L Quintero1.
Abstract
Purpose: To develop an effective set of models leveraging Artificial Intelligence techniques into a system able to support clinical practitioners working with COVID-19 patients. The system involves a pipeline including classification, lung and lesion segmentation, and lesion quantification of axial lung CT studies. Approach: A deep neural network architecture based on DenseNet is introduced for the classification of weakly-labeled, variable-sized (and possibly sparse) axial lung CT scans. The models are trained and tested on aggregated, publicly available data sets with over 10 categories. To further assess the models, a data set was collected from multiple medical institutions in Colombia, including healthy patients, COVID-19 patients, and patients with other diseases. It comprises 1,322 CT studies, totaling over 550,000 slices, from a diverse set of CT machines and institutions. Each CT study was labeled based on a clinical test, and no per-slice annotation took place. This enabled a classification into Normal vs. Abnormal patients and, for those considered abnormal, an extra classification step into Abnormal (other diseases) vs. COVID-19. Additionally, the pipeline features a methodology to segment and quantify lesions of COVID-19 patients on the complete CT study, enabling easier localization and progress tracking. Moreover, multiple ablation studies were performed to appropriately assess the elements composing the classification pipeline.
Keywords: computational tomography; lesion segmentation / quantification; machine learning; volume classification; weak-labels
Year: 2022 PMID: 36248019 PMCID: PMC9554434 DOI: 10.3389/fmedt.2022.980735
Source DB: PubMed Journal: Front Med Technol ISSN: 2673-3129
Performance comparison among COVID-19 lung CT classification models, as reported by the authors, on their test sets. Where multiple classifiers were tested, the results without data augmentation are reported.
| Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | Labeled slices | Support |
|---|---|---|---|---|---|
| ( | 90.11 | 87.03 | 96.60 | Yes | 3,199 |
| ( | 96.00 | – | 96.00 | Yes | 1,100 |
| ( | – | 99.58 | – | Yes | 799 |
| ( | 79.30 | 67.00 | 83.00 | Yes | 290 |
| ( | 96.00 | 94.00 | 98.00 | Yes | 270 |
| ( | 98.00 | 94.96 | 98.70 | Yes | 245 |
| ( | 85.00 | 85.40 | 85.70 | Yes | 203 |
| ( | 90.23 | 91.18 | 89.23 | Yes | 133 |
| ( | 96.00 | 95.00 | 96.00 | Yes | 119 |
| ( | 86.70 | 86.60 | 86.80 | Yes | 90 |
| ( | 81.00 | 80.20 | 82.60 | – | 45 |
Figure 1 High-level overview of the Decision Support System.
Figure 2 Lung CT classification pipeline. The study is first fed to a U-Net, which selects the slices containing lung portions and crops around the lung's bounding box; stacks of successive slices from the resulting CT volume are then input to the proposed ChexNet3D.
Figure 3 Data distribution of the CT studies collected from Colombian medical institutions.
Figure 4 Classification results on the public test set. Confusion matrices for the (A) healthy vs. unhealthy and (B) other diseases vs. COVID-19 classifiers. (C) Per-class data distribution.
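The preprocessing the pipeline describes, selecting slices with lung content, cropping to the lung's bounding box, and grouping successive slices into stacks, can be sketched as follows. This is a minimal NumPy illustration under the assumption that per-slice binary lung masks are already available from the U-Net; the function and parameter names are hypothetical, not from the paper:

```python
import numpy as np

def prepare_stacks(volume, lung_masks, stack_size=3):
    """Select slices with lung content, crop to the lung bounding box,
    and group successive slices into fixed-size stacks.

    volume     -- (S, H, W) axial CT volume
    lung_masks -- (S, H, W) binary lung segmentations (e.g. from a U-Net)
    """
    # Keep only the slices where the mask found lung tissue.
    has_lung = lung_masks.reshape(lung_masks.shape[0], -1).any(axis=1)
    volume, lung_masks = volume[has_lung], lung_masks[has_lung]

    # Crop to the bounding box around the union of all lung masks.
    ys, xs = np.where(lung_masks.any(axis=0))
    volume = volume[:, ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # Split into stacks of `stack_size` successive slices (drop remainder).
    n_stacks = volume.shape[0] // stack_size
    return volume[:n_stacks * stack_size].reshape(
        n_stacks, stack_size, *volume.shape[1:])
```

Each returned stack is then a candidate input for the volume classifier, which sees only the lung region rather than the full field of view.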
Performance comparison, on test data, of the weakly-labeled lung CT classification tasks using the proposed architecture (ChexNet3D).
| Model | Evaluation | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) | Precision (%) | Support |
|---|---|---|---|---|---|---|---|
| Healthy vs. unhealthy | Hospitals data set, accuracy-per-stack | 71.00 | 67.00 | 76.00 | 71.00 | 75.00 | 1,372 |
| Other diseases vs. COVID-19 | Hospitals data set, accuracy-per-stack | 75.00 | 65.00 | 82.00 | 68.00 | 71.00 | 1,445 |
| Healthy vs. unhealthy | Hospitals data set, stacks-average | 83.00 | 79.00 | 87.00 | 82.00 | 85.00 | 117 |
| Other diseases vs. COVID-19 | Hospitals data set, stacks-average | 86.00 | 81.00 | 91.00 | 84.00 | 88.00 | 122 |
| Healthy vs. unhealthy | Pre-trained with public data, at-least-one | 84.00 | 92.00 | 74.00 | 85.00 | 79.00 | 364 |
| Other diseases vs. COVID-19 | Pre-trained with public data, at-least-one | 90.00 | 99.00 | 81.00 | 91.00 | 84.00 | 759 |
The Supplementary Material contains tables and figures with additional relevant performance metrics.
Figure 5 Classification results on the Colombian medical institutions test set. Confusion matrices for the (A) healthy vs. unhealthy classifier with the accuracy-per-stack metric, (B) other diseases vs. COVID-19 classifier with the accuracy-per-stack metric, (C) healthy vs. unhealthy classifier with the stacks-average metric, and (D) other diseases vs. COVID-19 classifier with the stacks-average metric.
Figure 6 Axial CT study of four randomly sampled patients with COVID-19. Columns (A), (C), (E), and (G) show the original images (columns C, E, and G segmented). Columns (B), (D), (F), and (H) show the bounding-box and lesion segmentation of the image to their left. Rows (1), (2), and (3) are different slices of the same patient at the start, middle, and end of the study, respectively. The lesion quantification model determines the approximate lesion proportion, from top to bottom, as 59.11%, 56.49%, and 11.30% for the slices in column (B); 7.36%, 13.29%, and 7.55% for those in column (D); 15.61%, 40.72%, and 38.11% for those in column (F); and 11.73%, 27.91%, and 79.97% for those in column (H). Note the patient in column (C), whose top and bottom slices show little or no sign of lesions; this is why it is necessary to process the complete CT volume. The Supplementary Material contains the fully processed studies for the four patients presented above.
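The per-slice lesion proportions quoted in the Figure 6 caption are, in essence, the fraction of the lung area covered by the segmented lesion. A minimal sketch of that quantification step, assuming binary lesion and lung masks per slice (names hypothetical, not the authors' implementation):

```python
import numpy as np

def lesion_proportion(lesion_mask, lung_mask):
    """Percentage of the lung area covered by lesions on one axial slice.

    Both inputs are binary (H, W) masks; lesions are counted only
    where they fall inside the lung region.
    """
    lung_pixels = lung_mask.sum()
    if lung_pixels == 0:  # slice without lung tissue
        return 0.0
    lesion_pixels = (lesion_mask & lung_mask).sum()
    return 100.0 * lesion_pixels / lung_pixels
```

Applied slice by slice over the full study, this yields the kind of progression profile (e.g. 59.11% at the start vs. 11.30% at the end) that the figure reports, which is what makes progress tracking possible.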
Results of the ablation studies.
| Model | Ablation | Accuracy (%) | Sensitivity (%) | Specificity (%) | Support |
|---|---|---|---|---|---|
| Healthy vs. unhealthy | Segmentation | 71.00 | 74.00 | 68.00 | 872 |
| Healthy vs. unhealthy | Segmentation | 74.00 | 81.00 | 67.00 | 89 |
| Other diseases vs. COVID-19 | Segmentation | 69.00 | 70.00 | 68.00 | 340 |
| Other diseases vs. COVID-19 | Segmentation | 74.00 | 68.00 | 81.00 | 43 |
| Healthy vs. unhealthy | 3D Convolution | 68.00 | 72.00 | 63.00 | 1,372 |
| Healthy vs. unhealthy | 3D Convolution | 78.00 | 81.00 | 75.00 | 117 |
| Other diseases vs. COVID-19 | 3D Convolution | 68.00 | 66.00 | 68.00 | 1,445 |
| Other diseases vs. COVID-19 | 3D Convolution | 76.00 | 84.00 | 69.00 | 122 |
| Healthy vs. unhealthy | Reduced Volume | 72.00 | 66.00 | 79.00 | 1,184 |
| Healthy vs. unhealthy | Reduced Volume | 83.00 | 75.00 | 90.00 | 117 |
| Other diseases vs. COVID-19 | Reduced Volume | 75.00 | 65.00 | 82.00 | 1,265 |
| Other diseases vs. COVID-19 | Reduced Volume | 86.00 | 81.00 | 91.00 | 122 |
| Healthy vs. unhealthy | 3D Conv. + Red. Volume | 69.00 | 68.00 | 70.00 | 1,184 |
| Healthy vs. unhealthy | 3D Conv. + Red. Volume | 75.00 | 77.00 | 73.00 | 117 |
| Other diseases vs. COVID-19 | 3D Conv. + Red. Volume | 70.00 | 69.00 | 71.00 | 1,265 |
| Other diseases vs. COVID-19 | 3D Conv. + Red. Volume | 77.00 | 86.00 | 69.00 | 122 |
Performance comparison of the classifier when different elements of the classification pipeline are replaced, reported on the hospitals data set with the accuracy-per-stack and stacks-average metrics.