Wei Li, Yangyong Cao, Kun Yu, Yibo Cai, Feng Huang, Minglei Yang, Weidong Xie.
Abstract
BACKGROUND: COVID-19 is putting unprecedented pressure on the global healthcare system. As an auxiliary confirmatory diagnostic method, CT (computed tomography) examination can help clinicians quickly locate COVID-19 lesions after screening by PCR test. Furthermore, classifying lesion subtypes plays a critical role in subsequent treatment decisions. Accurately identifying lesion subtypes can help doctors detect changes in lesions in time and better assess the severity of COVID-19.
Keywords: 3D texture feature; COVID-19; Hybrid adaptive feature selection; Lesion subtypes; Radiomics; Random forest
Year: 2021 PMID: 34865622 PMCID: PMC8645296 DOI: 10.1186/s12938-021-00961-w
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Performance of COVID-19 classification achieved by SVM, KNN, LR, GaussianNB, QDA, DT, RF and HAFS-RF
| Method | Label | Precision (%) | Recall (%) | Accuracy (%) | F1-score (%) |
|---|---|---|---|---|---|
| SVM | 1 | 76.34 | **99.42** | 82.3 | 86.37 |
| | 2 | 99.48 | 62.07 | 92.31 | 76.45 |
| | 3 | **100.0** | 57.75 | 98.04 | 73.21 |
| | 4 | **96.84** | 58.2 | 91.75 | 72.71 |
| KNN | 1 | 88.04 | 86.32 | 85.66 | 87.17 |
| | 2 | 83.09 | 83.23 | 93.19 | 83.16 |
| | 3 | 78.05 | 86.49 | 98.17 | 82.05 |
| | 4 | 65.98 | 67.96 | 87.58 | 66.96 |
| LR | 1 | 83.46 | 88.4 | 83.8 | 85.86 |
| | 2 | 76.93 | 75.3 | 89.83 | 76.11 |
| | 3 | 66.67 | 52.11 | 96.58 | 58.5 |
| | 4 | 55.86 | 50.27 | 83.7 | 52.92 |
| GaussianNB | 1 | 88.57 | 59.6 | 72.88 | 71.25 |
| | 2 | 42.62 | 62.72 | 75.52 | 50.75 |
| | 3 | 14.64 | 86.62 | 76.01 | 25.05 |
| | 4 | 37.82 | 10.19 | 79.89 | 16.05 |
| QDA | 1 | **95.14** | 36.67 | 63.72 | 52.94 |
| | 2 | 39.1 | **97.27** | 66.82 | 55.78 |
| | 3 | **100.0** | **99.3** | **99.97** | **99.65** |
| | 4 | 43.38 | 48.66 | 79.07 | 45.87 |
| DT | 1 | 91.85 | 92.57 | 91.49 | 92.21 |
| | 2 | 91.53 | 87.21 | 95.53 | 89.32 |
| | 3 | 86.75 | 90.34 | 98.89 | 88.51 |
| | 4 | 78.36 | 79.93 | 91.79 | 79.14 |
| RF | 1 | 89.70 | 93.31 | 90.42 | 91.47 |
| | 2 | 84.92 | 83.59 | 93.48 | 84.25 |
| | 3 | 86.11 | 87.94 | 98.79 | 87.02 |
| | 4 | 80.78 | 72.53 | 91.30 | 76.43 |
| HAFS-RF (ours) | 1 | 92.21 | 95.52 | **93.06** | **93.84** |
| | 2 | 93.17 | 91.58 | **96.84** | **92.37** |
| | 3 | 95.14 | 95.8 | 99.58 | 95.47 |
| | 4 | 88.43 | **80.75** | **94.3** | **84.42** |
Bold values indicate the maximum value of each type of lesion classification index
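The final column of the table above is consistent with the per-class F1 score, that is, the harmonic mean of precision and recall. The short check below is only a sketch: the helper name `f1_score` is hypothetical, and the four rows are copied from the KNN, LR, DT and RF label-1 entries of the table. Small differences in the last digit are expected because the published precision and recall are already rounded.

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    if precision_pct + recall_pct == 0:
        return 0.0
    return 2 * precision_pct * recall_pct / (precision_pct + recall_pct)


# (method, label, precision, recall, value reported in the last column),
# copied from the table above.
rows = [
    ("KNN", 1, 88.04, 86.32, 87.17),
    ("LR", 1, 83.46, 88.4, 85.86),
    ("DT", 1, 91.85, 92.57, 92.21),
    ("RF", 1, 89.70, 93.31, 91.47),
]

for method, label, precision, recall, reported in rows:
    computed = f1_score(precision, recall)
    print(f"{method} label {label}: computed {computed:.2f}, reported {reported:.2f}")
```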
Fig. 1 ROC curves achieved by different models
Performance of COVID-19 classification achieved with data augmentation
| Augmentation | Label | Number | Precision (%) | Recall (%) | Accuracy (%) | F1-score (%) |
|---|---|---|---|---|---|---|
| With | 1 | 2637 | 93.17 | 96.85 | 92.95 | 94.97 |
| | 2 | 519 | 89.84 | 86.02 | 96.88 | 87.89 |
| | 3 | 103 | 89.47 | 77.27 | 99.16 | 82.93 |
| | 4 | 475 | 82.94 | 73.25 | 93.55 | 77.79 |
| Without | 1 | 2637 | 92.21 | 95.52 | 93.06 | 93.84 |
| | 2 | 1098 | 93.17 | 91.58 | 96.84 | 92.37 |
| | 3 | 386 | 95.14 | 95.8 | 99.58 | 95.47 |
| | 4 | 976 | 88.43 | 80.75 | 94.3 | 84.42 |
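In these tables the per-label accuracy is often higher than both precision and recall (for label 3, 99.16% accuracy against 89.47% precision and 77.27% recall). That pattern is what one would expect if each label is scored one-vs-rest, so that the many correctly rejected samples of other classes count as true negatives. The sketch below assumes that convention; the source does not state it. The function name `one_vs_rest_metrics` and the confusion matrix are made up for illustration and are not data from the study.

```python
from typing import Dict, List


def one_vs_rest_metrics(confusion: List[List[int]]) -> List[Dict[str, float]]:
    """Per-class precision, recall, accuracy and F1, all in percent.

    confusion[i][j] counts samples with true class i and predicted class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                         # class k, predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp    # other classes, predicted as k
        tn = total - tp - fn - fp                           # everything not involving class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        accuracy = (tp + tn) / total if total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({
            "precision": 100 * precision,
            "recall": 100 * recall,
            "accuracy": 100 * accuracy,
            "f1": 100 * f1,
        })
    return metrics


# Hypothetical, imbalanced 4-class confusion matrix (NOT taken from the paper).
toy_confusion = [
    [90, 5, 0, 5],
    [6, 30, 1, 3],
    [0, 1, 8, 1],
    [8, 4, 1, 27],
]

toy_metrics = one_vs_rest_metrics(toy_confusion)
for label, m in enumerate(toy_metrics, start=1):
    print(label, {name: round(value, 2) for name, value in m.items()})
```

In the toy matrix the rarest class (label 3) has the highest accuracy even though its precision and recall are lower, because true negatives dominate its one-vs-rest count. This is why the F1 column is the more informative figure for the small classes.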
Fig. 2 Scores of 189 features
The features after the process of HAFS
| Dimension of features | Kind of features | Characteristics |
|---|---|---|
| 2D | First order | Length, Mean, Max, Var, ASM, Energy |
| 3D | First order | Robust mean absolute deviation, Mean, Root mean squared, Range, Interquartile range, Skewness |
| | GLSZM | Gray-level variance, High gray-level zone emphasis, Zone percentage, Small area low gray-level emphasis |
| | GLRLM | Long-run high gray-level emphasis, Difference variance, Gray-level nonuniformity normalized, Run percentage |
| | GLCM | Sum squares, Id, Joint average |
| | GLDM | Dependence nonuniformity normalized, Dependence entropy, Dependence entropy |
| | Shape | Major axis length |
Performance of COVID-19 classification achieved by using 2D features and using 2D and 3D features
| Feature | Label | Precision (%) | Recall (%) | Accuracy (%) | F1-score (%) |
|---|---|---|---|---|---|
| 2D | 1 | 88.38 | 93.56 | 89.37 | 90.89 |
| | 2 | 84.18 | 81.9 | 93.12 | 83.02 |
| | 3 | 85.94 | 79.14 | 98.47 | 82.4 |
| | 4 | 79.84 | 69.29 | 91.2 | 74.19 |
| 2D and 3D | 1 | 92.21 | 95.52 | 93.06 | 93.84 |
| | 2 | 93.17 | 91.58 | 96.84 | 92.37 |
| | 3 | 95.14 | 95.8 | 99.58 | 95.47 |
| | 4 | 88.43 | 80.75 | 94.3 | 84.42 |
Performance of different feature selection algorithms (F-test, MIC, RFE, Lasso, HAFS) using Random Forest
| Method | Label | Precision (%) | Recall (%) | Accuracy (%) | F1-score (%) |
|---|---|---|---|---|---|
| F-test | 1 | 86.59 | 91.66 | 87.32 | 89.05 |
| | 2 | 78.32 | 71.99 | 90.51 | 75.02 |
| | 3 | 74.62 | 64.67 | 97.2 | 69.29 |
| | 4 | 71.43 | 67.52 | 88.66 | 69.42 |
| MIC | 1 | 87.4 | 93.09 | 88.69 | 90.16 |
| | 2 | 76.75 | 75.04 | 90.91 | 75.89 |
| | 3 | 88.0 | 70.06 | 97.98 | 78.01 |
| | 4 | 73.42 | 65.59 | 88.27 | 69.28 |
| RFE | 1 | 84.82 | 93.32 | 86.99 | 88.87 |
| | 2 | 81.29 | 77.26 | 92.28 | 79.23 |
| | 3 | 88.07 | 61.15 | 97.59 | 72.18 |
| | 4 | 75.43 | 63.97 | 88.53 | 69.23 |
| Lasso | 1 | 87.69 | 93.44 | 89.05 | 90.47 |
| | 2 | 77.84 | 73.85 | 91.0 | 75.79 |
| | 3 | 80.45 | 68.15 | 97.52 | 73.79 |
| | 4 | 74.69 | 67.69 | 88.85 | 71.02 |
| HAFS | 1 | **92.21** | **95.52** | **93.06** | **93.84** |
| | 2 | **93.17** | **91.58** | **96.84** | **92.37** |
| | 3 | **95.14** | **95.8** | **99.58** | **95.47** |
| | 4 | **88.43** | **80.75** | **94.3** | **84.42** |
Bold values indicate the maximum value of each type of lesion classification index
Performance of SVM, KNN, GaussianNB and QDA with and without HAFS
| Method | | Label | Precision (%) | Recall (%) | Accuracy (%) | F1-score (%) |
|---|---|---|---|---|---|---|
| SVM | | 1 | 76.34 | 99.42 | 82.3 | 86.37 |
| | | 2 | 99.48 | 62.07 | 92.31 | 76.45 |
| | | 3 | 100.0 | 57.75 | 98.04 | 73.21 |
| | | 4 | 96.84 | 58.2 | 91.75 | 72.71 |
| HAFS-SVM | 0.1 | 1 | 90.08 | 95.03 | 91.3 | 92.49 |
| | | 2 | 91.64 | 87.03 | 95.8 | 89.28 |
| | | 3 | 99.1 | 77.46 | 98.92 | 86.96 |
| | | 4 | 84.98 | 80.14 | 93.58 | 82.49 |
| KNN | | 1 | 88.04 | 86.32 | 85.66 | 87.17 |
| | | 2 | 83.09 | 83.23 | 93.19 | 83.16 |
| | | 3 | 78.05 | 86.49 | 98.17 | 82.05 |
| | | 4 | 65.98 | 67.96 | 87.58 | 66.96 |
| HAFS-KNN | 0.5 | 1 | 89.01 | 87.01 | 86.6 | 88.0 |
| | | 2 | 83.17 | 84.52 | 93.42 | 83.84 |
| | | 3 | 80.77 | 85.14 | 98.31 | 82.89 |
| | | 4 | 66.89 | 69.37 | 87.97 | 68.11 |
| GaussianNB | | 1 | 88.57 | 59.6 | 72.88 | 71.25 |
| | | 2 | 42.62 | 62.72 | 75.52 | 50.75 |
| | | 3 | 14.64 | 86.62 | 76.01 | 25.05 |
| | | 4 | 37.82 | 10.19 | 79.89 | 16.05 |
| HAFS-GaussianNB | 0.1 | 1 | 81.77 | 77.8 | 77.71 | 79.74 |
| | | 2 | 46.21 | 51.38 | 78.19 | 48.66 |
| | | 3 | 34.96 | 55.63 | 93.16 | 42.93 |
| | | 4 | 46.67 | 41.11 | 80.02 | 43.71 |
| QDA | | 1 | 95.14 | 36.67 | 63.72 | 52.94 |
| | | 2 | 39.1 | 97.27 | 66.82 | 55.78 |
| | | 3 | 100.0 | 99.3 | 99.97 | 99.65 |
| | | 4 | 43.38 | 48.66 | 79.07 | 45.87 |
| HAFS-QDA | 0.3 | 1 | 86.07 | 82.19 | 82.69 | 84.09 |
| | | 2 | 60.85 | 82.88 | 84.84 | 70.17 |
| | | 3 | 58.93 | 92.96 | 96.68 | 72.13 |
| | | 4 | 56.51 | 31.84 | 83.12 | 40.73 |
Fig. 3 Four typical lesion subtypes in CT images of COVID-19. The labels and regions are given by medical experts. The red area represents ground-glass opacity, the green area represents cord, the blue area represents solid and the yellow area represents subsolid
Lesion samples in the prepared COVID-19 dataset
| Label | Num | Statistics | | | | Volume |
|---|---|---|---|---|---|---|
| 1 | 2637 | Max | 247.0 | 312.0 | 399.0 | 20148480.0 |
| | | Mean | 46.78 | 49.2 | 24.84 | 395826.08 |
| | | Std | 42.2 | 50.68 | 39.63 | 1596465.23 |
| 2 | 519 | Max | 186.0 | 205.0 | 310.0 | 5142630.0 |
| | | Mean | 43.38 | 41.28 | 24.86 | 124041.76 |
| | | Std | 31.4 | 27.92 | 28.17 | 409901.04 |
| 3 | 103 | Max | 217.0 | 301.0 | 223.0 | 5878530.0 |
| | | Mean | 40.46 | 37.9 | 21.53 | 210307.5 |
| | | Std | 46.06 | 45.99 | 26.49 | 752977.45 |
| 4 | 475 | Max | 204.0 | 283.0 | 378.0 | 16873920.0 |
| | | Mean | 61.9 | 63.26 | 38.79 | 722428.14 |
| | | Std | 50.65 | 57.84 | 53.43 | 1979445.47 |
Fig. 4 The flowchart of our algorithm
Fig. 5 Overview of the COVID-19 classification using random forest based on hybrid adaptive feature selection
Fig. 6 A schematic illustration of the first stage of HAFS