Ayten Kayi Cangir, Kaan Orhan, Yusuf Kahya, Hilal Özakıncı, Betül Bahar Kazak, Buse Mine Konuk Balcı, Duru Karasoy, Çağlar Uzun.
Abstract
INTRODUCTION: Radiomics methods are used to analyze various medical images, including computed tomography (CT), magnetic resonance, and positron emission tomography, to provide information regarding the diagnosis, patient outcome, tumor phenotype, and gene-protein signatures of various diseases. In low-risk thymoma, complete surgical resection is typically sufficient, whereas in high-risk thymoma, adjuvant therapy is usually required. It is therefore important to distinguish between the two groups. This study evaluated the CT radiomics features of thymomas to discriminate between low- and high-risk thymoma groups.
Keywords: Diagnostic tool; Machine learning; Minimally invasive surgery; Radiomics; Thymoma
Year: 2021 PMID: 33975604 PMCID: PMC8114494 DOI: 10.1186/s12957-021-02259-6
Source DB: PubMed Journal: World J Surg Oncol ISSN: 1477-7819 Impact factor: 2.754
Clinical characteristics of patients in the low-risk and high-risk groups
| Patients (n = 83) | Low-risk group, n = 51 | High-risk group, n = 32 | p value* |
|---|---|---|---|
| Sex, n (%) | | | |
| Male | 25 (49) | 20 (62) | 0.23 |
| Female | 26 (51) | 12 (38) | |
| Age, median (range) (years) | 50 (21–73) | 50 (24–66) | 0.68 |
| Smoking status, n (%) | | | |
| Never | 34 (67) | 18 (56) | 0.34 |
| Current or past smoker | 17 (33) | 14 (44) | |
| Previous malignancy or synchronous tumor, n (%) | | | |
| Absent | 48 (94) | 27 (84) | 0.25 |
| Present | 3 (6) | 5 (16) | |
| Clinical presentation, n (%) | | | |
| Asymptomatic | 16 (31) | 10 (31) | 0.99 |
| Symptomatic | 35 (69) | 22 (69) | |
| Myasthenia gravis, n (%) | | | |
| Absent | 43 (84) | 22 (69) | 0.09 |
| Present | 8 (16) | 10 (31) | |
| LDH, median (range) (U/L) | 180 (113–338) | 179 (89–310) | 0.80 |
| ALP, median (range) (U/L) | 68 (29–124) | 67 (31–167) | 0.74 |
| CRP, median (range) (mg/L) | 4.1 (0.6–70.7) | 3.5 (0.1–25.8) | 0.67 |
| HGB, median (range) (g/dL) | 14 (9–16.6) | 13.6 (9–16.3) | 0.62 |
| WBC count, median (range) (×10⁹/L) | 7.7 (3.3–16.8) | 7.3 (3.9–13.9) | 0.97 |
| LYMP count, median (range) (×10⁹/L) | 2.1 (0.01–6.1) | 1.8 (0.5–3.8) | 0.10 |
| PLT count, median (range) (×10⁹/L) | 258 (121–454) | 233 (122–404) | 0.58 |
| Type of treatment, n (%) | | | |
| Surgery ± adjuvant treatment | 44 (86) | 28 (87) | 1.00 |
| Definitive chemotherapy ± radiotherapy | 7 (14) | 4 (13) | |
| Largest dimension of tumor on CT, mean ± SD (range) (mm) | 66.8 ± 2.1 (21–160) | 63 ± 2.2 (20–134) | 0.55 |
LDH lactate dehydrogenase, ALP alkaline phosphatase, CRP C-reactive protein, HGB hemoglobin, WBC white blood cell, LYMP lymphocyte, PLT platelet
*Differences were compared using the t-test/Mann-Whitney U test, Pearson Chi-square test/Fisher’s exact test
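As an illustration of the categorical comparisons named in the footnote, the sketch below applies a Pearson chi-square test to the sex-by-risk-group counts from the table. It assumes the uncorrected (no continuity correction) form of the test; the record does not say which variant was used for each row, and the function name `pearson_chi_square_2x2` is purely illustrative.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table[i][j] is the observed count in row i, column j.
    Returns (chi-square statistic, two-sided p value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column factors.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With 1 degree of freedom the chi-square tail probability
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the table: rows = male, female; columns = low-risk, high-risk.
sex_table = [[25, 20], [26, 12]]
chi2, p_value = pearson_chi_square_2x2(sex_table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

Under that assumption the p value rounds to 0.23, which agrees with the value reported for sex. Rows with small expected counts would more plausibly have been tested with Fisher's exact test, which this sketch does not cover.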
Fig. 1 Lung extraction and 3D representation of the tumor with lung structures
Fig. 2 3D representation of the regional segmentation of bronchus, artery, and vessels together with the tumor volume
Radiomics features selected for quantifying the heterogeneity differences
| Radiomics group | Associated filter | Number of features | Radiomics features |
|---|---|---|---|
| First-order statistics | None | 126 | Energy, total energy, entropy, minimum, 10th percentile, 90th percentile, maximum, mean, median, interquartile range, range, mean absolute deviation, robust mean absolute deviation, root mean square, standard deviation, skewness, kurtosis, variance |
| Shape | None | 14 | Volume, surface area, surface volume ratio, spherical disproportion, maximum 3D diameter, maximum 2D diameter column, maximum 2D diameter row, elongation |
| Texture features | GLCM | 525 | Autocorrelation, average intensity, cluster prominence, cluster shade, cluster tendency, contrast, difference average, difference entropy, difference variance, dissimilarity, entropy, sum average, sum entropy, sum variance, sum squares |
| Texture features | GLSZM | | Large area emphasis, gray-level non-uniformity, size zone non-uniformity, gray-level variance, zone entropy, high gray-level zone emphasis, small area high gray-level emphasis, large area high gray-level emphasis |
| Texture features | GLRLM | | Gray-level non-uniformity, run length non-uniformity, gray-level variance, run entropy, high gray-level run emphasis, short run high gray-level emphasis, long run high gray-level emphasis |
GLCM gray-level co-occurrence matrix, GLSZM gray-level size zone matrix, GLRLM gray-level run length matrix
Fig. 3 The SelectKBest method was used to further select the radiomics features; 30 features were selected
Fig. 4 Lasso algorithm for feature selection. a Lasso path, b MSE path, c coefficients in the Lasso model. The Lasso model was used to select four features that correspond to the optimal alpha value
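The two selection steps described in the captions for Figs. 3 and 4 can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the data are synthetic (generated with `make_classification`, not the study's radiomics matrix), the univariate score function `f_classif`, the standardization step, and the 5-fold cross-validation are guesses that the record does not confirm, and the number of features Lasso retains on synthetic data will generally differ from the four reported.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a radiomics feature matrix (NOT the study data):
# 83 cases, 200 candidate features, binary low/high-risk label.
X, y = make_classification(
    n_samples=83, n_features=200, n_informative=8, random_state=0
)

# Put features on a common scale so the Lasso penalty treats them evenly
# (assumed preprocessing step).
X_scaled = StandardScaler().fit_transform(X)

# Step 1: univariate filter keeping the 30 best-scoring features.
selector = SelectKBest(score_func=f_classif, k=30)
X_top30 = selector.fit_transform(X_scaled, y)

# Step 2: Lasso with cross-validated alpha; features whose coefficient
# is non-zero at the chosen alpha are the ones retained.
lasso = LassoCV(cv=5, random_state=0).fit(X_top30, y)
kept = np.flatnonzero(lasso.coef_)

# Mean cross-validated MSE per alpha, i.e. the quantity an MSE-path plot shows.
mse_path = lasso.mse_path_.mean(axis=1)

print("optimal alpha:", lasso.alpha_)
print("indices of retained features (within the 30):", kept)
```

Plotting `lasso.alphas_` against `mse_path` gives an MSE path of the kind shown in panel b, and the non-zero entries of `lasso.coef_` correspond to panel c.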
ROC results of six machine-learning classifiers on the validation set
| Risk groups | Statistical measures | KNN | SVM | XGBoost | RF | LR | DT |
|---|---|---|---|---|---|---|---|
| Low | AUC | 0.943 | 0.857 | 0.8 | 0.693 | 0.943 | 0.436 |
| | 95% CI | 0.85–1 | 0.66–1 | 0.58–1 | 0.45–0.93 | 0.74–1 | 0.18–0.69 |
| | Sensitivity | 0.9 | 0.8 | 0.7 | 0.8 | 0.8 | 0.3 |
| | Specificity | 0.86 | 0.86 | 0.71 | 0.43 | 0.86 | 0.57 |
| High | AUC | 0.943 | 0.857 | 0.8 | 0.693 | 0.943 | 0.436 |
| | 95% CI | 0.85–1 | 0.66–1 | 0.58–1 | 0.45–0.93 | 0.74–1 | 0.18–0.69 |
| | Sensitivity | 0.86 | 0.86 | 0.71 | 0.43 | 0.86 | 0.57 |
| | Specificity | 0.9 | 0.8 | 0.7 | 0.8 | 0.8 | 0.3 |
KNN k-nearest neighbor, SVM support vector machine, XGBoost eXtreme Gradient Boosting, RF random forest, LR logistic regression, DT decision tree
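The table reports identical AUCs for the low- and high-risk rows, with sensitivity and specificity swapped between them. The sketch below shows why this is expected in a two-class problem. The scores are hypothetical, chosen only so that thresholding at 0.5 gives one misclassified case per class among 10 low-risk and 7 high-risk cases; they are not the study's predictions, and the helper names are illustrative.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, preds, positive):
    """Sensitivity and specificity when `positive` is treated as the target class."""
    pairs = list(zip(labels, preds))
    tp = sum(y == positive and p == positive for y, p in pairs)
    fn = sum(y == positive and p != positive for y, p in pairs)
    tn = sum(y != positive and p != positive for y, p in pairs)
    fp = sum(y != positive and p == positive for y, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation set: 0 = low-risk (10 cases), 1 = high-risk (7 cases).
y_true = [0] * 10 + [1] * 7
# Hypothetical predicted probabilities of the high-risk class.
p_high = [0.10, 0.20, 0.15, 0.30, 0.25, 0.40, 0.35, 0.05, 0.45, 0.60,
          0.70, 0.80, 0.55, 0.90, 0.65, 0.30, 0.85]
y_pred = [1 if p >= 0.5 else 0 for p in p_high]

# Treating low-risk as the positive class flips both labels and scores.
y_true_low = [1 - y for y in y_true]
p_low = [1 - p for p in p_high]

auc_high = auc(y_true, p_high)
auc_low = auc(y_true_low, p_low)

sens_high, spec_high = sensitivity_specificity(y_true, y_pred, positive=1)
sens_low, spec_low = sensitivity_specificity(y_true, y_pred, positive=0)

print("AUC (high as positive):", auc_high, "| AUC (low as positive):", auc_low)
print("high-risk sens/spec:", round(sens_high, 2), round(spec_high, 2))
print("low-risk  sens/spec:", round(sens_low, 2), round(spec_low, 2))
```

Because flipping the positive class reverses both the labels and the score ordering, the AUC is unchanged, while the sensitivity of one class becomes the specificity of the other.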
Precision, recall, F1-score, and support for each classifier in the validation set
| Risk groups | Indicators | KNN | SVM | XGBoost | RF | LR | DT |
|---|---|---|---|---|---|---|---|
| Low | Precision | 0.9 | 0.89 | 0.78 | 0.67 | 0.89 | 0.5 |
| | Recall | 0.9 | 0.8 | 0.7 | 0.8 | 0.8 | 0.3 |
| | F1-score | 0.9 | 0.84 | 0.74 | 0.73 | 0.84 | 0.37 |
| | Support | 10 | 10 | 10 | 10 | 10 | 10 |
| High | Precision | 0.86 | 0.75 | 0.62 | 0.6 | 0.75 | 0.36 |
| | Recall | 0.86 | 0.86 | 0.71 | 0.43 | 0.86 | 0.57 |
| | F1-score | 0.86 | 0.8 | 0.67 | 0.5 | 0.8 | 0.44 |
| | Support | 7 | 7 | 7 | 7 | 7 | 7 |
KNN k-nearest neighbor, SVM support vector machine, XGBoost eXtreme Gradient Boosting, RF random forest, LR logistic regression, DT decision tree
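For reference, the four indicators follow directly from a confusion matrix. The minimal sketch below uses the KNN counts given in the confusion-matrix table of this record (9 and 1 for low-risk, 1 and 6 for high-risk) and assumes that rows are true classes and columns are predicted classes, with 0 = low-risk and 1 = high-risk; the record does not state the orientation, and `per_class_report` is an illustrative name.

```python
def per_class_report(confusion, class_names):
    """Precision, recall, F1-score, and support for each class.

    confusion[i][j] = number of cases whose true class is i
    and whose predicted class is j (assumed orientation).
    """
    report = {}
    for k, name in enumerate(class_names):
        tp = confusion[k][k]
        support = sum(confusion[k])                  # all cases truly in class k
        predicted = sum(row[k] for row in confusion)  # all cases predicted as class k
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return report


# KNN counts from the confusion-matrix table in this record.
knn_confusion = [[9, 1],
                 [1, 6]]
report = per_class_report(knn_confusion, ["Low", "High"])

for name, metrics in report.items():
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

Rounded to two decimals, this gives 0.9 for all three low-risk scores and 0.86 for all three high-risk scores, with supports of 10 and 7, matching the KNN column of the table.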
Fig. 5 ROC curves of machine-learning methods for classification. Green indicates low-risk, and red indicates high-risk thymomas. a ROC curve of the training dataset, b ROC curve of the validation dataset
Confusion matrix details for the low-risk and high-risk thymoma groups
| KNN | |||
|---|---|---|---|
| Risk groups | 0 | 1 | Accuracy (%) |
| Low | 9 | 1 | 100 |
| High | 1 | 6 | 88 |
| Accuracy (%) | 94.3 | ||
KNN k-nearest neighbor