Qi Wan1, Jiaxuan Zhou1, Xiaoying Xia1, Jianfeng Hu1, Peng Wang1, Yu Peng1, Tianjing Zhang2, Jianqing Sun2, Yang Song3, Guang Yang3, Xinchun Li1.
Abstract
OBJECTIVE: To evaluate the performance of 2D and 3D radiomics features, combined with different machine learning approaches, for classifying solitary pulmonary lesions (SPLs) on magnetic resonance (MR) T2-weighted imaging (T2WI).
Keywords: algorithms; area under the curve; lung neoplasms; machine learning; magnetic resonance imaging
Year: 2021 PMID: 34868905 PMCID: PMC8637439 DOI: 10.3389/fonc.2021.683587
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Segmentation of a nodule in the upper right lobe on T2-weighted images.
Figure 2. Flow chart for the data processing.
Figure 3. Combination of the pipelines for radiomics analysis.
Clinical features of training and test cohorts.
| Label | Training cohort: benign | Training cohort: malignant | Training cohort: P-value | Test cohort: benign | Test cohort: malignant | Test cohort: P-value |
|---|---|---|---|---|---|---|
| | 27 | 65 | | 12 | 28 | |
| | 48.22 ± 15.05 | 57.57 ± 10.71 | 0.001 | 51.50 ± 14.98 | 57.50 ± 9.26 | 0.13 |
| | 3.59 ± 2.54 | 4.51 ± 2.60 | 0.126 | 3.42 ± 2.28 | 4.60 ± 2.21 | 0.131 |
| Sex | | | 0.04 | | | 0.121 |
| Male | 15 (55.56%) | 50 (76.92%) | | 5 (41.67%) | 19 (67.86%) | |
| Female | 12 (44.44%) | 15 (23.08%) | | 7 (58.33%) | 9 (32.14%) | |
| Location | | | 0.039 | | | 0.677 |
| Other lobes | 20 (74.07%) | 33 (50.77%) | | 6 (50.00%) | 12 (42.86%) | |
| Upper lobe | 7 (25.93%) | 32 (49.23%) | | 6 (50.00%) | 16 (57.14%) | |
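The excerpt does not say which statistical test produced the P-values for the categorical rows. As a hedged sketch, assuming an uncorrected Pearson chi-square test on each 2x2 table (an assumption, not something the source confirms), the counts for sex and lobe location can be checked as follows. The function name `chi_square_2x2` is introduced here for illustration only.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the chi-square statistic and its P-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows are the two categories, columns are (benign, malignant); counts from the table.
_, p_sex_train = chi_square_2x2(15, 50, 12, 15)   # male / female, training cohort
_, p_loc_train = chi_square_2x2(20, 33, 7, 32)    # other lobes / upper lobe, training cohort
_, p_sex_test = chi_square_2x2(5, 19, 7, 9)       # male / female, test cohort
_, p_loc_test = chi_square_2x2(6, 12, 6, 16)      # other lobes / upper lobe, test cohort

for name, p in [("sex, training", p_sex_train), ("location, training", p_loc_train),
                ("sex, test", p_sex_test), ("location, test", p_loc_test)]:
    print(f"{name}: P = {p:.3f}")
```

Under that assumption the computed values agree with the reported 0.04, 0.039, 0.121, and 0.677 to the printed precision, which suggests (but does not prove) that an uncorrected chi-square test was used for these rows.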
Figure 4. AUC heat maps for each dataset show the performance of 2D and 3D features combined with different machine learning methods in distinguishing solitary pulmonary lesions. In the test dataset, far more machine learning combinations reach a high AUC with the 3D feature group than with the 2D feature group. AB, Adaboost; AE, auto-encoder; ANOVA, analysis of variance; DT, decision tree; FN, feature numbers; GP, Gaussian process; LASSO, least absolute shrinkage and selection operator; LDA, linear discriminant analysis; LR, logistic regression; NB, naive Bayes; PCA, principal component analysis; PCC, Pearson correlation coefficient; RF, random forest; RFE, recursive feature elimination; SVM, support vector machine; Unitnorm, min-max normalization; Unit with Zerocenter, mean normalization; Zscorenorm, Z-score normalization.
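The caption names three normalization schemes without defining them. The sketch below uses the common textbook definitions of min-max, mean, and Z-score normalization; whether the authors' software computes them in exactly this way (for example, population versus sample standard deviation) is an assumption. The function names and the sample feature values are hypothetical.

```python
from statistics import mean, pstdev

def min_max_norm(values):
    """Rescale to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant feature cannot be rescaled")
    return [(v - lo) / (hi - lo) for v in values]

def mean_norm(values):
    """Center on the mean and scale by the range: (x - mean) / (max - min)."""
    lo, hi, mu = min(values), max(values), mean(values)
    if hi == lo:
        raise ValueError("constant feature cannot be rescaled")
    return [(v - mu) / (hi - lo) for v in values]

def z_score_norm(values):
    """Center on the mean and scale by the population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("constant feature cannot be standardized")
    return [(v - mu) / sd for v in values]

# Hypothetical values of one radiomics feature across five lesions.
feature = [2.0, 4.0, 6.0, 8.0, 10.0]
print(min_max_norm(feature))
print(mean_norm(feature))
print(z_score_norm(feature))
```

In a train/test setting the minimum, maximum, mean, and standard deviation would normally be estimated on the training cohort only and then reused for the test cohort, so that no test information leaks into the model.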
The number of models with AUC greater than 0.7 in both validation and test groups.
| | 3D features (AUCval > 0.7 & AUCtest > 0.7) | 2D features (AUCval > 0.7 & AUCtest > 0.7) | 3D features (AUCval > 0.7 & AUCtest > 0.8) | 2D features (AUCval > 0.7 & AUCtest > 0.8) |
|---|---|---|---|---|
| Dimension reduction | | | | |
| PCA | 101 | 10 | 22 | 0 |
| PCC | 28 | 1 | 1 | 0 |
| Normalization | | | | |
| Min-max | 36 | 0 | 13 | 0 |
| Z-score | 45 | 11 | 1 | 0 |
| Mean | 48 | 0 | 9 | 0 |
| Feature selection | | | | |
| ANOVA | 75 | 5 | 17 | 0 |
| RFE | 54 | 6 | 6 | 0 |
| Relief | 0 | 0 | 0 | 0 |
| Classifier | | | | |
| SVM | 24 | 0 | 5 | 0 |
| AE | 0 | 0 | 0 | 0 |
| LDA | 33 | 0 | 3 | 0 |
| RF | 0 | 5 | 0 | 0 |
| LR | 41 | 0 | 3 | 0 |
| LR + LASSO | 0 | 0 | 0 | 0 |
| AB | 0 | 4 | 0 | 0 |
| DT | 0 | 0 | 0 | 0 |
| GP | 23 | 2 | 12 | 0 |
| NB | 8 | 0 | 0 | 0 |
AUC, area under the curve; PCA, principal component analysis; PCC, Pearson Correlation Coefficients; ANOVA, analysis of variance; RFE, recursive feature elimination; SVM, support vector machine; AE, auto-encoder; LDA, linear discriminant analysis; RF, Random forest; LR, logistic regression; LASSO, least absolute shrinkage and selection operator; AB, Adaboost; DT, Decision Tree; GP, Gaussian Process; NB, Naive Bayes.
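Because every model uses exactly one method from each pipeline stage, the counts within each stage should add up to the same total of qualifying models. The sketch below checks that consistency using the numbers from the table. The stage names used as dictionary keys are descriptive labels chosen here, not headers taken from the source, and `stage_totals` is a hypothetical helper.

```python
# Each tuple holds the four table columns, in order:
# (3D, test AUC > 0.7), (2D, test AUC > 0.7), (3D, test AUC > 0.8), (2D, test AUC > 0.8)
MODEL_COUNTS = {
    "dimension_reduction": {"PCA": (101, 10, 22, 0), "PCC": (28, 1, 1, 0)},
    "normalization": {
        "Min-max": (36, 0, 13, 0), "Z-score": (45, 11, 1, 0), "Mean": (48, 0, 9, 0),
    },
    "feature_selection": {
        "ANOVA": (75, 5, 17, 0), "RFE": (54, 6, 6, 0), "Relief": (0, 0, 0, 0),
    },
    "classifier": {
        "SVM": (24, 0, 5, 0), "AE": (0, 0, 0, 0), "LDA": (33, 0, 3, 0),
        "RF": (0, 5, 0, 0), "LR": (41, 0, 3, 0), "LR + LASSO": (0, 0, 0, 0),
        "AB": (0, 4, 0, 0), "DT": (0, 0, 0, 0), "GP": (23, 2, 12, 0), "NB": (8, 0, 0, 0),
    },
}

def stage_totals(counts):
    """Sum each of the four columns over the methods within every stage."""
    return {
        stage: tuple(sum(column) for column in zip(*methods.values()))
        for stage, methods in counts.items()
    }

totals = stage_totals(MODEL_COUNTS)
for stage, total in totals.items():
    print(stage, total)

# All stages should report identical column totals if the table is self-consistent.
print("consistent:", len(set(totals.values())) == 1)
```

Read this way, the table implies 129 qualifying 3D models versus 11 qualifying 2D models at the 0.7 threshold, and 23 versus 0 at the 0.8 test threshold.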
Figure 5. Receiver operating characteristic curves for 2D features, 2D + clinical features, 2D + 3D features, 3D features, and 3D + clinical features in distinguishing malignant from benign solitary pulmonary lesions.
Figure 6. The precision-recall plots of optimal models based on different features.
Clinical statistics in the independent test dataset.
| | AUC-ROC | 95% CI | AUC-PR | MCC | Sen | Spe | PPV | NPV | P-value |
|---|---|---|---|---|---|---|---|---|---|
| 2D features | 0.740 | 0.716-0.800 | 0.846 | 0.404 | 0.607 | 0.833 | 0.895 | 0.476 | <0.001 |
| 2D features + Cli | 0.780 | 0.763-0.81 | 0.900 | 0.574 | 0.893 | 0.667 | 0.862 | 0.727 | <0.001 |
| 3D features | 0.824 | 0.808-0.851 | 0.927 | 0.514 | 0.643 | 0.917 | 0.947 | 0.524 | <0.001 |
| 3D features + Cli | 0.836 | 0.821-0.883 | 0.918 | 0.620 | 0.821 | 0.833 | 0.920 | 0.667 | <0.001 |
| Joint 2D&3D features | 0.813 | 0.796-0.833 | 0.926 | 0.563 | 0.607 | 1.000 | 1.000 | 0.522 | <0.001 |
AUC, area under the curve; ROC, receiver operating characteristic curve; Cli, clinical features; CI, confidence interval; PR, precision-recall plot; MCC, Matthews correlation coefficient; Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value.
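The threshold-dependent metrics in this table all derive from a single confusion matrix. As an illustration, the sketch below computes them for the 2D-features row. The confusion-matrix counts are not given in the source: they are inferred here from the reported rates and the test cohort sizes (28 malignant, 12 benign), with malignant treated as the positive class, and should be read as an assumption. `binary_metrics` is a hypothetical helper.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, NPV, and MCC from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sen": tp / (tp + fn),   # malignant lesions correctly called malignant
        "Spe": tn / (tn + fp),   # benign lesions correctly called benign
        "PPV": tp / (tp + fp),   # malignant calls that are truly malignant
        "NPV": tn / (tn + fn),   # benign calls that are truly benign
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
    }

# Inferred (assumed) counts for the 2D-features model on the 40 test lesions:
# 17 of 28 malignant lesions detected, 10 of 12 benign lesions correctly rejected.
metrics_2d = binary_metrics(tp=17, fn=11, tn=10, fp=2)
for name, value in metrics_2d.items():
    print(f"{name}: {value:.3f}")
```

These inferred counts give values that round to the reported 0.607, 0.833, 0.895, 0.476, and 0.404 for the 2D-features row. AUC-ROC and AUC-PR cannot be recovered this way, since they depend on the full ranking of predicted scores rather than on one threshold.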