| Literature DB >> 34966664 |
Chen Chen, Yuhui Qin, Junying Cheng, Fabao Gao, Xiaoyue Zhou.
Abstract
OBJECTIVE: We used texture analysis and machine learning (ML) to classify small round cell malignant tumors (SRCMTs) and Non-SRCMTs of the nasal cavity and paranasal sinuses on fat-suppressed T2-weighted imaging (Fs-T2WI). MATERIALS: Preoperative MRI scans of 164 patients diagnosed with SRCMTs and Non-SRCMTs between 1 January 2018 and 1 January 2021 were included in this study. A total of 271 features were extracted from each region of interest. The dataset was randomly divided into a training set (∼70%) and a test set (∼30%). The Pearson correlation coefficient (PCC) and principal component analysis (PCA) methods were used to reduce dimensionality, and Analysis of Variance (ANOVA), Kruskal-Wallis (KW), Recursive Feature Elimination (RFE), and Relief were used for feature selection. Classifications were performed using 10 ML classifiers, and results were evaluated with leave-one-out cross-validation.
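The workflow in the abstract (∼70/30 split, PCC-based redundancy reduction, univariate feature selection, a classifier evaluated with leave-one-out cross-validation) can be sketched as follows. This is a hypothetical reconstruction on synthetic data: the correlation threshold (|r| > 0.9) and the feature count (k = 13, echoing the 13-feature model of Figure 1) are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split, LeaveOneOut, cross_val_score
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(164, 271))      # 164 patients x 271 texture features (synthetic)
y = rng.integers(0, 2, size=164)     # 1 = SRCMT, 0 = Non-SRCMT (synthetic labels)

# ~70/30 train/test split, as described in the abstract
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# PCC-based dimension reduction: drop one feature of each highly
# correlated pair (the |r| > 0.9 threshold is an assumption)
corr = np.abs(np.corrcoef(X_tr, rowvar=False))
keep = np.ones(X_tr.shape[1], dtype=bool)
for i in range(corr.shape[0]):
    if keep[i]:
        dup = np.where(corr[i, i + 1:] > 0.9)[0] + i + 1
        keep[dup] = False
X_tr, X_te = X_tr[:, keep], X_te[:, keep]

# ANOVA feature selection + Gaussian process classifier,
# scored with leave-one-out cross-validation on the training set
model = make_pipeline(StandardScaler(),
                      SelectKBest(f_classif, k=13),
                      GaussianProcessClassifier(random_state=0))
scores = cross_val_score(model, X_tr, y_tr, cv=LeaveOneOut())
print(f"LOOCV accuracy: {scores.mean():.3f}")
```

With random features the accuracy hovers near chance; on real radiomic data the same scaffold yields the per-dataset AUCs compared in the figures below.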
Keywords: Fs-T2WI; artificial intelligence; machine learning; radiomics; small round cell malignant tumors; texture analysis
Year: 2021 PMID: 34966664 PMCID: PMC8710453 DOI: 10.3389/fonc.2021.701289
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Texture analysis methods and the corresponding texture features.
| Method | Texture feature parameters |
|---|---|
| Histogram(9) | mean, variance, skewness, kurtosis, and percentiles (1%, 10%, 50%, 90% and 99%) |
| Grey-level co-occurrence matrix (GLCM) (220) | angular second moment (AngScMom), contrast, inverse difference moment (IDM), entropy (Ent), correlation (Correlat), sum of squares (SumOfSqs), sum average (SumAverg), sum variance (SumVarnc), sum entropy (SumEntrp), difference variance (DifVarnc), difference entropy (DifEntrp), along the 0°, 45°, 90°, 135° and z‐axis directions at inter-pixel distances of 1, 2, 3 and 4 pixels |
| Grey‐level run‐length matrix (GLRLM) (20) | run length nonuniformity (RLNonUni), grey level nonuniformity (GLevNonU), long run emphasis (LngREmph), short run emphasis (ShrtREmp), fraction of image in runs (Fraction) along four different angles (horizontal, vertical, diagonal 45°, and diagonal 135°) |
| Auto‐regressive model(ARM)(5) | Teta1, Teta2, Teta3, Teta4, Sigma |
| Wavelets transform(WAV)(12) | energy computed from the low–low frequency band within the first image scale(WavEnLL_s-1), WavEnLH_s-1, WavEnHL_s-1, WavEnHH_s-1, WavEnLL_s-2, WavEnLH_s-2, WavEnHL_s-2, WavEnHH_s-2, WavEnLL_s-3, WavEnLH_s-3, WavEnHL_s-3, WavEnHH_s-3 |
| Absolute gradient statistics (AGS) (5) | absolute gradient mean (GrMean), variance (GrVariance), skewness (GrSkewness), kurtosis (GrKurtosis), nonzeros (GrNonZeros) |
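To illustrate the GLCM row of the table, here is a minimal NumPy sketch that builds a co-occurrence matrix for a single offset and computes four of the listed parameters (AngScMom, Contrast, IDM, Entropy). The grey-level quantization (8 levels) and the synthetic ROI are assumptions for demonstration; a real radiomics pipeline would use a dedicated texture package over all listed directions and distances.

```python
import numpy as np

def glcm_features(roi, levels=8, dx=1, dy=0):
    """Symmetric, normalized GLCM for one (dx, dy) offset, plus four
    of the table's features: AngScMom, Contrast, IDM, Entropy."""
    # quantize grey values into `levels` bins
    q = (roi * levels // (roi.max() + 1)).astype(int)
    P = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[q[y, x], q[y + dy, x + dx]] += 1
    P = P + P.T                  # make the matrix symmetric
    P /= P.sum()                 # normalize to joint probabilities
    i, j = np.indices(P.shape)
    nz = P[P > 0]
    return {
        "AngScMom": np.sum(P ** 2),                    # angular second moment
        "Contrast": np.sum((i - j) ** 2 * P),
        "IDM":      np.sum(P / (1 + (i - j) ** 2)),    # inverse difference moment
        "Entropy":  -np.sum(nz * np.log(nz)),
    }

rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(32, 32))   # stand-in for a tumor ROI
# 0 deg direction at distance 1 corresponds to (dx, dy) = (1, 0);
# 90 deg would be (0, 1), 45 deg (1, 1), and so on
print(glcm_features(roi))
```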
The parameters of the algorithms.
| Algorithms | Parameters |
|---|---|
| SVM | C=1.0, kernel='rbf', degree=3, gamma='scale', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', break_ties=False, random_state=None |
| AE | hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08, n_iter_no_change=10, max_fun=15000 |
| LDA | solver='svd', shrinkage=None, priors=None, n_components=None, store_covariance=False, tol=0.0001 |
| RF | n_estimators=100, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None, ccp_alpha=0.0, max_samples=None |
| LR | penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=100, multi_class='auto', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None |
| LRLasso | alpha=1.0, fit_intercept=True, normalize=False, precompute=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic' |
| AB | base_estimator=None, n_estimators=50, learning_rate=1.0, algorithm='SAMME.R', random_state=None |
| DT | criterion='gini', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, class_weight=None, ccp_alpha=0.0 |
| GP | kernel=None, optimizer='fmin_l_bfgs_b', n_restarts_optimizer=0, max_iter_predict=100, warm_start=False, copy_X_train=True, random_state=None, multi_class='one_vs_rest', n_jobs=None |
| NB | alpha=1.0, binarize=0.0, fit_prior=True, class_prior=None |
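The parameter listings above correspond to scikit-learn default signatures. A hedged sketch of how the table's abbreviations might map onto scikit-learn estimators follows; the class choices are inferred from the listed parameters (e.g. "AE" is given MLP parameters, "NB" Bernoulli naive Bayes parameters, and "LRLasso" the parameters of `Lasso`), not confirmed by the paper, and non-default arguments are omitted where the table shows defaults.

```python
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression, Lasso
from sklearn.tree import DecisionTreeClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.naive_bayes import BernoulliNB

# assumed mapping of the table's abbreviations to estimators
classifiers = {
    "SVM":     SVC(C=1.0, kernel="rbf", gamma="scale"),
    "AE":      MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                             solver="adam", alpha=1e-4, max_iter=200),
    "LDA":     LinearDiscriminantAnalysis(solver="svd"),
    "RF":      RandomForestClassifier(n_estimators=100, criterion="gini"),
    "LR":      LogisticRegression(penalty="l2", C=1.0, solver="lbfgs"),
    "LRLasso": Lasso(alpha=1.0),   # the table lists Lasso (regressor) parameters
    "AB":      AdaBoostClassifier(n_estimators=50, learning_rate=1.0),
    "DT":      DecisionTreeClassifier(criterion="gini"),
    "GP":      GaussianProcessClassifier(),
    "NB":      BernoulliNB(alpha=1.0, binarize=0.0),
}
for name, clf in classifiers.items():
    print(name, "->", type(clf).__name__)
```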
Figure 1. Performance of models generated using Pearson correlation coefficient (PCC) analysis and the Relief and Gaussian process (GP) algorithms. (A) Receiver operating characteristic (ROC) curves of this model on different datasets; (B) FeAture Explorer software suggested a candidate 13-feature model according to the "one-standard-error" rule; (C) contributions of features in the final model.
Figure 2. Areas under the curve (AUCs) of the different datasets using the Pearson correlation coefficient (PCC) and principal component analysis (PCA) methods with the Gaussian process (GP) algorithm. (A) Analysis of Variance (ANOVA); (B) Kruskal-Wallis (KW); (C) Relief; (D) Recursive Feature Elimination (RFE).
Figure 3. Areas under the curve (AUCs) on different datasets using Analysis of Variance (ANOVA), Kruskal-Wallis (KW), Recursive Feature Elimination (RFE), and Relief with the Gaussian process (GP) classifier. (A) Pearson correlation coefficient (PCC); (B) principal component analysis (PCA).
Figure 4. Areas under the curve (AUCs) of different datasets using 10 machine learning algorithms. (A) Pearson correlation coefficient (PCC); (B) principal component analysis (PCA). SVM, support vector machine; LDA, linear discriminant analysis; AE, auto-encoder; RF, random forest; LR, logistic regression; LRLasso, logistic regression via Lasso; AB, AdaBoost; DT, decision tree; GP, Gaussian process; NB, naive Bayes.
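The AUC values compared across Figures 2-4 can be computed from a classifier's predicted scores with scikit-learn; a minimal sketch on synthetic labels and scores (not the study's data) is:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=50)                 # 1 = SRCMT, 0 = Non-SRCMT
# synthetic scores loosely correlated with the labels, clipped to [0, 1]
y_score = np.clip(y_true * 0.6 + rng.normal(0, 0.3, size=50), 0, 1)

auc = roc_auc_score(y_true, y_score)                 # area under the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)    # points for an ROC plot
print(f"AUC = {auc:.3f}")
```

Plotting `tpr` against `fpr` for each dataset gives ROC curves like those in Figure 1A, with the AUC summarizing each curve as a single number.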