Nasrin Amini, Ahmad Shalbaf.
Abstract
Assessing the severity of the novel coronavirus disease (COVID-19) from chest computed tomography (CT) scans is crucial for choosing appropriate therapeutic drugs and for monitoring disease progression. However, visual severity assessment requires a highly experienced radiologist and is time-consuming, tedious, and subjective. This article introduces a machine learning tool that grades COVID-19 severity as mild, moderate, or severe from lung CT images. We extract a set of quantitative first- and second-order statistical texture features from each image. The first-order features, computed from the image histogram, are variance, skewness, and kurtosis. The second-order features are derived from the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLMS), and gray-level size zone matrix (GLSZM). Finally, the CT images of each person are classified into four classes from the extracted features using random forest (RF), an ensemble method based on majority voting over the outputs of its decision trees. The dataset consists of CT scans labeled by expert radiologists as normal (231), mild (563), moderate (120), and severe (42). The experimental results indicate that combining all feature extraction methods with RF achieves the highest accuracy, 90.95%, among the strategies tested for detecting the four severity classes from CT images. The proposed system can serve as an assistant diagnostic tool for quantifying lung involvement in COVID-19 and monitoring disease progression.
Keywords: computed tomography; random forest; severity of COVID‐19; texture features
Year: 2021 PMID: 35464345 PMCID: PMC9015452 DOI: 10.1002/ima.22679
Source DB: PubMed Journal: Int J Imaging Syst Technol ISSN: 0899-9457 Impact factor: 2.177
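The three first-order features named in the abstract are moments of the gray-level histogram. The following is a minimal sketch of how they could be computed, not the authors' implementation: the function name `first_order_features`, the default of 256 gray levels, and the use of non-excess kurtosis (a normal distribution gives 3) are all assumptions made for illustration.

```python
import numpy as np

def first_order_features(image, levels=256):
    """Variance, skewness, and kurtosis of an image's gray-level histogram.

    Assumes integer gray levels in [0, levels); this range is an
    illustrative choice, not taken from the paper.
    """
    pixels = np.asarray(image, dtype=np.int64).ravel()
    # Normalized histogram: p[g] is the fraction of pixels with gray level g.
    counts = np.bincount(pixels, minlength=levels).astype(float)
    p = counts / counts.sum()
    g = np.arange(len(p))
    mean = np.sum(g * p)
    variance = np.sum((g - mean) ** 2 * p)
    if variance == 0:
        # A constant image has no spread, so higher moments are set to zero.
        return {"variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    std = np.sqrt(variance)
    skewness = np.sum((g - mean) ** 3 * p) / std ** 3
    kurtosis = np.sum((g - mean) ** 4 * p) / std ** 4  # non-excess kurtosis
    return {
        "variance": float(variance),
        "skewness": float(skewness),
        "kurtosis": float(kurtosis),
    }

# Tiny example: half the pixels at level 0, half at level 2.
print(first_order_features(np.array([[0, 0], [2, 2]]), levels=4))
```

Working from the normalized histogram rather than the raw pixels matches the abstract's description of features "extracted from the image histogram"; both routes give the same moments.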
Clinical information of 956 patients (males: 42%, females: 56%, other/unknown: 2%) aged 18 to 97 years (median: 47 years), with respiratory rate (RR), temperature (t), and oxygen saturation (SpO2)
| Severity | Clinical data |
|---|---|
| Class 1: normal range (231 cases) | Normal and no pulmonary parenchymal involvement |
| Class 2: mild (563 cases) | t < 38°C, RR < 20/min, SpO2 > 95, pulmonary parenchymal involvement < 25% |
| Class 3: moderate (120 cases) | t < 38.5°C, RR 20–30/min, SpO2: 95, pulmonary parenchymal involvement 25%–50% |
| Class 4: severe and critical (42 cases) | t > 38.5°C, RR > 30/min, SpO2 < 95 (signs of shock, multiple organ failure, respiratory failure), pulmonary parenchymal involvement ≥ 75% |
FIGURE 1. Lung computed tomography (CT) slices of normal, mild, moderate, and severe COVID-19 cases
FIGURE 2. Classification process for computed tomography (CT) images into normal, mild, moderate, and severe COVID-19
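As an illustration of one second-order step in this process, the sketch below builds a gray-level co-occurrence matrix (GLCM) for a single pixel offset and derives three common descriptors from it. The function names, the symmetric counting, the offset, and the choice of contrast, energy, and homogeneity are assumptions for illustration; the source does not state which GLCM descriptors or offsets were used.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    img = np.asarray(image, dtype=np.int64)
    rows, cols = img.shape
    matrix = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = img[r, c], img[r2, c2]
                matrix[i, j] += 1
                matrix[j, i] += 1  # count each pair in both directions
    return matrix / matrix.sum()

def glcm_features(p):
    """Illustrative texture descriptors from a normalized GLCM."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        # Sum of squared entries, also called the angular second moment.
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }

# Two gray levels, horizontal neighbor offset.
p = glcm([[0, 0, 1], [0, 0, 1]], levels=2)
print(glcm_features(p))
```

The explicit double loop keeps the pairing rule visible; for full-size CT slices a vectorized or library implementation would be the practical choice.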
Mean classification accuracy (%) for eightfold cross-validation
| Classification with feature extraction method | Normal | Mild | Moderate | Severe | Four classes |
|---|---|---|---|---|---|
| LDA with GLCM | 89.2 | 86.64 | 28.57 | 80 | 79.65 |
| LDA with GLRLMS | 93.23 | 92.85 | 33.84 | 100 | 85.81 |
| LDA with GLSZM | 84.87 | 86.96 | 33.77 | 80 | 79.50 |
| LDA with Global | 86.16 | 87.14 | 26.28 | 80 | 78.92 |
| KNN with GLCM | 86.2 | 85.76 | 28.87 | 80 | 78.45 |
| KNN with GLRLMS | 93.19 | 93.46 | 38.89 | 100 | 86.80 |
| KNN with GLSZM | 80.14 | 85.38 | 26.66 | 80 | 76.47 |
| KNN with Global | 86.54 | 84.64 | 26.32 | 80 | 77.55 |
| RF with GLCM | 86.33 | 87.88 | 26.22 | 80 | 79.39 |
| RF with GLRLMS | 95.1 | 98.2 | 28.57 | 100 | |
| RF with GLSZM | 79.31 | 81.79 | 24.32 | 80 | 73.86 |
| RF with Global | 89.95 | 85.97 | 28.47 | 80 | 79.44 |
| LDA with GLCM + GLRLMS + GLSZM + Global | 93.1 | 97.14 | 33.33 | 80 | 87.23 |
| KNN with GLCM + GLRLMS + GLSZM + Global | 95.49 | 92.98 | 26.98 | 80 | 84.73 |
| RF with GLCM + GLRLMS + GLSZM + Global | 96.6 | 98.88 | 40 | 100 | 90.95 |
Abbreviations: GLCM, gray-level co-occurrence matrix; GLRLMS, gray-level run length matrix; GLSZM, gray-level size zone matrix; KNN, k-nearest neighbor; LDA, linear discriminant analysis; RF, random forest.
Note: Bold values indicate the best results obtained.
Mean classification accuracy (%) for fivefold, eightfold, and 10-fold cross-validation
| Classification with feature extraction method | Fivefold | Eightfold | 10‐fold |
|---|---|---|---|
| LDA with GLCM | 61.34 | 79.65 | 79.28 |
| LDA with GLRLMS | 68.07 | 85.81 | 85.23 |
| LDA with GLSZM | 58.82 | 79.50 | 79.45 |
| LDA with Global | 57.48 | 78.92 | 79.03 |
| KNN with GLCM | 62.86 | 78.45 | 78.73 |
| KNN with GLRLMS | 67.65 | 86.80 | 86.48 |
| KNN with GLSZM | 61.18 | 76.47 | 76.28 |
| KNN with Global | 61.93 | 77.55 | 77.36 |
| RF with GLCM | 68.45 | 79.39 | 79.12 |
| RF with GLRLMS | | | |
| RF with GLSZM | 60.33 | 73.86 | 73.80 |
| RF with Global | 63.53 | 79.44 | 79.45 |
| LDA with GLCM + GLRLMS + GLSZM + Global | 70.98 | 87.23 | 87.01 |
| KNN with GLCM + GLRLMS + GLSZM + Global | 69.14 | 84.73 | 84.62 |
| RF with GLCM + GLRLMS + GLSZM + Global | | 90.95 | |
Abbreviations: GLCM, gray-level co-occurrence matrix; GLRLMS, gray-level run length matrix; GLSZM, gray-level size zone matrix; KNN, k-nearest neighbor; LDA, linear discriminant analysis; RF, random forest.
Note: Bold values indicate the best results obtained.
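The table above compares mean accuracy across different fold counts. The sketch below shows one way such a mean could be computed: shuffle the sample indices, deal them into k folds, hold each fold out in turn, and average the held-out accuracies. The helper names, the unstratified split, and the `fit`/`predict` callables are hypothetical; the source does not describe how the folds were drawn. A majority-class baseline stands in for a real classifier so the example stays self-contained.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def mean_cv_accuracy(features, labels, k, fit, predict, seed=0):
    """Average held-out accuracy over k folds.

    fit(train_x, train_y) -> model and predict(model, x) -> label are
    caller-supplied placeholders for any classifier.
    """
    n = len(labels)
    fold_accuracies = []
    for test_idx in k_fold_indices(n, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        fold_accuracies.append(hits / len(test_idx))
    return sum(fold_accuracies) / k

def fit_majority(train_x, train_y):
    """Baseline 'model': the most frequent training label."""
    return Counter(train_y).most_common(1)[0][0]

def predict_majority(model, x):
    return model

# Labels built from the class counts reported in the abstract; the
# features are dummies because the baseline ignores them.
labels = ["normal"] * 231 + ["mild"] * 563 + ["moderate"] * 120 + ["severe"] * 42
features = [[0.0]] * len(labels)
for k in (5, 8, 10):
    acc = mean_cv_accuracy(features, labels, k, fit_majority, predict_majority)
    print(f"{k}-fold baseline accuracy: {acc:.4f}")
```

Because the classes are imbalanced, a stratified split that preserves class proportions in each fold would be a reasonable alternative to the plain shuffle used here.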
FIGURE 3. Results of selecting the number of trees in the RF classifier with all features combined. The optimum number of trees is 40
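The abstract describes RF as an ensemble that combines its decision trees by majority voting. A minimal sketch of that voting step follows; the function name `majority_vote` and the example votes are hypothetical, and ties are resolved arbitrarily here.

```python
from collections import Counter

def majority_vote(tree_outputs):
    """Return the class label that the most trees voted for."""
    if not tree_outputs:
        raise ValueError("need at least one tree output")
    label, _count = Counter(tree_outputs).most_common(1)[0]
    return label

# Hypothetical votes from a 40-tree forest for one CT image.
votes = ["mild"] * 22 + ["moderate"] * 15 + ["normal"] * 3
print(majority_vote(votes))  # mild
```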
Comparison of the labels estimated by the proposed method (RF classifier with all features combined) against the reference labels assigned by a highly experienced radiologist
| Reference label | Estimated: Normal | Estimated: Mild | Estimated: Moderate | Estimated: Severe |
|---|---|---|---|---|
| Normal | 28 | 1 | 0 | 0 |
| Mild | 1 | 69 | 0 | 0 |
| Moderate | 0 | 9 | 6 | 0 |
| Severe | 0 | 0 | 0 | 5 |
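Per-class and overall accuracy can be read directly off the counts in the table above. The short sketch below does that arithmetic; the variable and function names are illustrative. The table covers 119 cases, so its figures need not match the cross-validated means reported elsewhere exactly.

```python
labels = ["Normal", "Mild", "Moderate", "Severe"]
# Rows are reference labels, columns are estimated labels.
confusion = [
    [28, 1, 0, 0],
    [1, 69, 0, 0],
    [0, 9, 6, 0],
    [0, 0, 0, 5],
]

def per_class_accuracy(matrix):
    """Fraction of each reference class that was labeled correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Correctly labeled cases divided by all cases."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

for name, acc in zip(labels, per_class_accuracy(confusion)):
    print(f"{name}: {acc:.2%}")
print(f"Overall: {overall_accuracy(confusion):.2%}")
# Normal: 96.55%
# Mild: 98.57%
# Moderate: 40.00%
# Severe: 100.00%
# Overall: 90.76%
```

The moderate class stands out: 9 of its 15 cases are assigned to mild, which is where most of the remaining error sits.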
Comparison of classification results of our work with other studies in the classification of severity of COVID‐19 patients from CT images
| Study | Methods or features | Results |
|---|---|---|
| Huang et al. | Affected lung percentage by 2D U‐Net deep learning | N/A |
| Shan et al. | Quantification of infection regions by deep learning | Accuracy = 85.1% |
| Tang et al. | Infection volume of the whole lung and the volume of GGO and RF probabilities | Accuracy = 87.5% |
| Shen et al. | Affected lung percentage by nontrainable CV | N/A |
| Our work | RF with GLCM + GLRLMS + GLSZM + Global | Accuracy = 90.95% |