Jiajun Qiu, Shaoliang Peng, Jin Yin, Junren Wang, Jingwen Jiang, Zhenlin Li, Huan Song, Wei Zhang.
Abstract
Assessing pulmonary lesions on computed tomography (CT) images is important for grading the severity of, and guiding treatment for, patients with coronavirus disease 2019 (COVID-19). Such assessment depends mainly on radiologists' subjective judgment, which is inefficient and difficult for less experienced readers, especially in rural areas. This work develops a CT-based radiomics signature to quantitatively assess whether COVID-19-infected pulmonary lesions are mild (Grade I) or moderate/severe (Grade II). We retrospectively analyzed 1160 COVID-19-infected pulmonary lesions from 16 hospitals. First, texture features were extracted from the pulmonary lesion regions of CT images. Then, feature preselection was performed and a radiomics signature was built using stepwise logistic regression, which also measured the correlation between the radiomics signature and the lesion grade. Finally, a logistic regression model was trained to classify lesion grades. At a significance level of α = 0.001, the stepwise logistic regression achieved a multiple correlation coefficient R of 0.70, well above the critical value Rα = 0.18. In classification, the logistic regression model achieved an AUC of 0.87 on an independent test set. Overall, the radiomics signature is significantly correlated with the grade of a pulmonary lesion in COVID-19 infection. The classification model calculates a radiomics score for each lesion, is interpretable and appropriate for clinical use, and can help radiologists grade pulmonary lesions quickly and efficiently.
Keywords: COVID-19; Pulmonary lesion; Quantitative assessment; Radiomics signature
Year: 2021 PMID: 33411162 PMCID: PMC7788548 DOI: 10.1007/s12539-020-00410-7
Source DB: PubMed Journal: Interdiscip Sci ISSN: 1867-1462 Impact factor: 2.233
Fig. 1 Framework of this work: steps A–E are described in detail in the subsections
Fig. 2 Inclusion and exclusion of patients and acquisition of ROIs. Multiple bounding boxes with overlapping areas were defined as one lesion region. For each lesion region, the bounding box with the largest area was selected and regarded as the ROI
Fig. 3 Examples of delineated bounding boxes. a A mild lesion (Grade I); the patient was a 43-year-old woman. b A moderate lesion (Grade II); the patient was a 36-year-old man. c A severe lesion (Grade II); the patient was a 57-year-old man
Features of the radiomics signature: COM (co-occurrence matrix); RLM (run-length matrix); CS (coefficient statistics)
| No. | Method | Component | Feature name |
|---|---|---|---|
| 1 | Wavelet | The approximate component in the 1st-level decomposition | Homogeneity in the COM at d = 1 |
| 2 | Wavelet | The horizontal component in the 1st-level decomposition | Correlation in the COM at |
| 3 | Wavelet | The horizontal component in the 2nd-level decomposition | Run percentage in the RLM |
| 4 | Contourlet | The approximate component | Mean in the CS |
| 5 | Contourlet | The approximate component | Contrast in the COM at |
| 6 | Contourlet | The 2nd component in the 2nd-level decomposition | Percentage of 0.01 in the histogram |
| 7 | Contourlet | The 1st component in the 1st-level decomposition | Percentage of 0.01 in the histogram |
| 8 | Contourlet | The 2nd component in the 1st-level decomposition | Kurtosis in the histogram |
| 9 | Contourlet | The 4th component in the 1st-level decomposition | Percentage of 0.01 in the histogram |
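As a rough illustration of how co-occurrence-matrix (COM) features such as homogeneity and contrast can be computed at d = 1, here is a minimal Python sketch. It is not the authors' implementation: it works directly on a tiny gray-level image rather than on wavelet or contourlet components, it uses horizontal neighbours only, and the function names (`cooccurrence`, `homogeneity`, `contrast`) and the squared-difference form of homogeneity are assumptions chosen for illustration.

```python
from collections import Counter


def cooccurrence(image, d=1):
    """Normalized, symmetric horizontal co-occurrence matrix.

    `image` is a list of rows of integer gray levels. The result maps each
    gray-level pair (i, j) to its relative frequency at horizontal offset d.
    """
    counts = Counter()
    for row in image:
        for a, b in zip(row, row[d:]):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # count both directions so the matrix is symmetric
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def homogeneity(p):
    """Weights each pair by 1 / (1 + (i - j)^2); equals 1 for a flat region."""
    return sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())


def contrast(p):
    """Weights each pair by (i - j)^2; equals 0 for a flat region."""
    return sum(v * (i - j) ** 2 for (i, j), v in p.items())


flat = cooccurrence([[0, 0, 0, 0]])      # uniform texture
striped = cooccurrence([[0, 1, 0, 1]])   # alternating texture
```

In this toy case the flat row gives homogeneity 1 and contrast 0, while the alternating row gives lower homogeneity and higher contrast, which is the kind of texture difference such features are meant to capture.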
Results of coefficient estimation: SE (standard error)
| Estimate | Confidence interval | SE | Statistic | p value |
|---|---|---|---|---|
| −30.40 | [−38.84, −21.95] | 4.30 | −7.07 | 1.58 × 10^−12 |
| 1.96 | [1.13, 2.80] | 0.43 | 4.60 | 4.19 × 10^−06 |
| −3.79 | [−6.65, −0.92] | 1.46 | −2.59 | 9.52 × 10^−03 |
| 20.519 | [7.34, 33.67] | 6.70 | 3.06 | 2.21 × 10^−03 |
| −1.21 × 10^−03 | [−1.70 × 10^−03, −7.00 × 10^−04] | 2.33 × 10^−04 | −5.18 | 2.24 × 10^−07 |
| 1.16 × 10^−03 | [8.00 × 10^−04, 1.50 × 10^−03] | 1.82 × 10^−04 | 6.34 | 2.29 × 10^−10 |
| 3.159 | [−0.66, 6.98] | 1.94 | 1.62 | 0.10 |
| 1.71 | [0.22, 3.20] | 0.76 | 2.25 | 0.02 |
| 0.02 | [3.70 × 10^−03, 0.04] | 9.64 × 10^−03 | 2.35 | 0.02 |
| 2.54 | [0.92, 4.16] | 0.83 | 3.07 | 2.00 × 10^−3 |
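The columns of the coefficient table are related in a way that can be checked by hand. The sketch below assumes that the fourth value in each row is a Wald-type statistic (estimate divided by SE) and that the interval is a 95% normal-approximation interval (estimate ± 1.96 × SE). Both are assumptions inferred from the numbers, since the table header is incomplete; the function name `wald_summary` is hypothetical.

```python
def wald_summary(estimate, se, z_crit=1.96):
    """Return (statistic, (lower, upper)) for one coefficient.

    Assumes a Wald statistic and a normal-approximation interval;
    z_crit = 1.96 corresponds to an assumed 95% level.
    """
    statistic = estimate / se
    interval = (estimate - z_crit * se, estimate + z_crit * se)
    return statistic, interval


# First row of the table: estimate -30.40, SE 4.30
stat, (low, high) = wald_summary(-30.40, 4.30)
```

For the first row this reproduces the reported statistic of about −7.07 and an interval close to [−38.84, −21.95]; small differences are expected because the published values are rounded.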
Fig. 4 ROC curves and AUCs in the classification. The validation ROC curve and its corresponding AUC value shown in the figure refer to the average performance
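For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen Grade II lesion receives a higher predicted probability than a randomly chosen Grade I lesion. A minimal sketch of that pairwise definition follows; the function name `auc`, the encoding of Grade II as label 1, and the toy scores are assumptions for illustration, not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (assumed here to be Grade II), 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_auc = auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])  # 0.75
```

The quadratic loop is fine for a sketch; library routines typically compute the same quantity from sorted ranks.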
Fig. 5 Sensitivity and specificity as the threshold varies. High sensitivity and high specificity occur together when the threshold is between 0.7 and 0.8
Classification results of the logistic regression model as the threshold varies
| Threshold | Accuracy (tenfold cross-validation) | Sensitivity (tenfold cross-validation) | Specificity (tenfold cross-validation) | Accuracy (test) | Sensitivity (test) | Specificity (test) |
|---|---|---|---|---|---|---|
| 0.50 | 0.879 | 0.953 | 0.611 | 0.839 | 0.916 | 0.560 |
| 0.55 | 0.877 | 0.940 | 0.646 | 0.842 | 0.912 | 0.587 |
| 0.60 | 0.872 | 0.926 | 0.674 | 0.845 | 0.897 | 0.653 |
| 0.65 | 0.877 | 0.923 | 0.709 | 0.836 | 0.879 | 0.680 |
| 0.70 | | | | | | |
| 0.75 | 0.853 | 0.874 | 0.777 | 0.813 | 0.832 | 0.747 |
| 0.80 | 0.814 | 0.816 | 0.806 | 0.799 | 0.799 | 0.800 |
The row with high accuracy, sensitivity, and specificity is shown in bold
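The trade-off in the table, where raising the threshold lowers sensitivity and raises specificity, follows directly from how the three metrics are defined. A minimal sketch, assuming Grade II is the positive class (label 1) and that a lesion is called Grade II when its predicted probability is at or above the threshold; the function name and the toy data are hypothetical.

```python
def metrics_at_threshold(probs, labels, threshold):
    """Return (accuracy, sensitivity, specificity) at a decision threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of true Grade II lesions detected
    specificity = tn / (tn + fp)   # fraction of true Grade I lesions detected
    return accuracy, sensitivity, specificity


# Toy data: four Grade II lesions (1) and two Grade I lesions (0).
probs = [0.95, 0.85, 0.72, 0.60, 0.55, 0.30]
labels = [1, 1, 1, 1, 0, 0]
low_cut = metrics_at_threshold(probs, labels, 0.50)
high_cut = metrics_at_threshold(probs, labels, 0.70)
```

On the toy data, moving the threshold from 0.50 to 0.70 lowers sensitivity from 1.0 to 0.75 and raises specificity from 0.5 to 1.0, the same direction of change seen in the table.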
Test results of the object detection as the threshold varies
| Threshold | One-class classification: accuracy | Grade classification: accuracy for Grade I | Grade classification: accuracy for Grade II |
|---|---|---|---|
| 0.50 | | | |
| 0.55 | 0.946 | 0.222 | 0.789 |
| 0.60 | 0.943 | 0.181 | 0.570 |
| 0.65 | 0.912 | 0.097 | 0.484 |
| 0.70 | 0.852 | 0.056 | 0.359 |
| 0.75 | 0.789 | 0.042 | 0.258 |
| 0.80 | 0.684 | 0 | 0.172 |
| 0.85 | 0.596 | 0 | 0.117 |
| 0.90 | 0.450 | 0 | 0.055 |
The bold row shows the best result
Fig. 6 Nomogram for classifying ROIs and its calibration curve. a Nomogram: for an unknown lesion, a vertical line is drawn upward from the value of a variable x to the “Points” axis to read off the score that variable contributes toward Grade II. The process is repeated for each variable (from x1 to x9), and the assigned scores are summed. The sum is located on the “Total Points” axis, and a vertical line is drawn downward to the “Risk” axis to find the lesion’s probability of Grade II. b Calibration curve of a: the x-axis represents the nomogram-estimated probabilities and the y-axis represents the observed probabilities. The diagonal dotted line represents an ideal model, for which the estimated outcome corresponds perfectly to the actual outcome. The solid line represents the performance of the nomogram in a; the closer it lies to the diagonal dotted line, the better the estimation
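A nomogram of this kind is a graphical way of evaluating a logistic regression: each variable contributes to a weighted sum, and the sum is mapped to a probability. The sketch below shows that calculation in its plain form. The function name `grade2_probability` and the example coefficients are hypothetical; they are not the fitted values from the study, because this record does not say which table row belongs to which variable.

```python
import math


def grade2_probability(intercept, coefs, features):
    """Probability of Grade II under a logistic regression model.

    score = intercept + sum(coef_i * x_i); probability = 1 / (1 + exp(-score)).
    """
    score = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical two-variable example (illustrative numbers only).
balanced = grade2_probability(0.0, [1.0, -1.0], [2.0, 2.0])  # score 0
likely = grade2_probability(-1.0, [2.0], [1.0])              # score 1
```

A score of 0 corresponds to a probability of 0.5, and larger scores push the probability toward 1, which is why summing points on the nomogram and reading the “Risk” axis gives the same answer as evaluating the model directly.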