Sahan M Vijithananda1, Mohan L Jayatilake2, Badra Hewavithana1, Teresa Gonçalves3, Luis M Rato3, Bimali S Weerakoon4, Tharindu D Kalupahana5, Anil D Silva6, Karuna D Dissanayake7.
Abstract
BACKGROUND: Diffusion-weighted (DW) imaging is a well-recognized magnetic resonance imaging (MRI) technique that is routinely used in brain examinations in modern clinical radiology practice. This study focuses on extracting demographic and texture features from MRI Apparent Diffusion Coefficient (ADC) images of human brain tumors, identifying the distribution patterns of each feature, and applying Machine Learning (ML) techniques to differentiate malignant from benign brain tumors.
Keywords: ANOVA f-test feature selection; Apparent diffusion coefficient; Brain tumor classification; Diffusion weighted imaging; Machine learning; Magnetic resonance imaging; Random forest
Year: 2022 PMID: 35915448 PMCID: PMC9344709 DOI: 10.1186/s12938-022-01022-6
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 3.903
Fig. 1 Supervised learning method applied to tumor classification. The flow chart illustrates the steps of building a classification model to differentiate brain neoplasms using a supervised learning technique. First, the problem was identified as a classification problem; second, the necessary data were collected. Data pre-processing was the third step, and in the fourth step the data set was split into training and testing sets. A suitable ML algorithm for the collected data was selected in the fifth step, and the selected algorithm was trained with the training data in the sixth step. Finally, the developed model was evaluated with the test data and its hyperparameters were tuned to reach the optimum accuracy level
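The splitting step of this workflow can be sketched as follows. The reported test support of 480 slices out of 1,599 is about 30%, with class proportions close to those of the full data set, which suggests a stratified split, but that is an inference rather than something the source states. The function name `stratified_split`, the seed, and the toy labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx) with roughly equal class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class before cutting
        n_test = int(round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy labels for illustration only, not the study data.
labels = ["malignant"] * 4 + ["benign"] * 4
train_idx, test_idx = stratified_split(labels, test_fraction=0.5)
```

Splitting per class keeps the benign-to-malignant ratio similar in both subsets, which matters when one class is noticeably larger than the other.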
ANOVA f-test feature selection
| Feature | ANOVA |
|---|---|
| Mean pixel value of ADC | 32.3343 |
| Skewness | 3.3444 |
| Kurtosis | 9.6250 |
| GLCM mean 1 | 32.6372 |
| GLCM mean 2 | 29.1327 |
| GLCM Var 1 | 14.0761 |
| GLCM Var 2 | 27.5219 |
| GLCM energy | 33.9675 |
| GLCM entropy | 4.989 |
| GLCM contrast | 47.9462 |
| GLCM homogeneity | 3.4572 |
| GLCM correlation | 48.6392 |
| GLCM prominence | 15.4134 |
| GLCM shade | 17.1677 |
| Patient age | 9.4337 |
| Patient gender | 73.7926 |
The table shows the score of each feature in the ANOVA f-test feature selection process. The data set was passed through the ANOVA f-test feature selection process 5 times and the mean values were calculated. The values differed slightly between runs due to the stochastic nature of the algorithm, or differences in numerical precision or evaluation procedure
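For orientation, the score behind such a ranking is the one-way ANOVA F-statistic of each feature across the two classes: the ratio of between-class to within-class variance. The sketch below computes it from scratch for a single feature; the function name `anova_f` and the toy values are assumptions, and the source does not say which implementation was used.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature, given its values per class."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual samples around their own class mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy feature values for two classes (hypothetical, not from the study).
f_score = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])  # 1.5
```

A larger F-score means the class means are further apart relative to the spread inside each class, so the feature separates the classes better on its own.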
Fig. 4 ANOVA f-test results chart. ANOVA f-test scores for attributes 0 to 15 are illustrated in the graph: mean pixel value of ADC 32.3343, skewness 3.3444, kurtosis 9.6250, GLCM mean 1 32.6372, GLCM mean 2 29.1327, GLCM variance 1 14.0761, GLCM variance 2 27.5219, GLCM energy 33.9675, GLCM entropy 4.989, GLCM contrast 47.9462, GLCM homogeneity 3.4572, GLCM correlation 48.6392, GLCM prominence 15.4134, GLCM shade 17.1677, patient age 9.4337 and patient gender 73.7926
Results of the cross validation experiment
| Algorithm | Mean accuracy | Accuracy as percentage (%) | Standard deviation (SD) |
|---|---|---|---|
| Logistic regression | 0.753378 | 75.33 | 0.034451 |
| Linear discriminant analysis | 0.748898 | 74.89 | 0.036810 |
| k-Nearest neighbors classifier | 0.828459 | 82.84 | 0.030710 |
| Decision tree classifier | 0.800764 | 80.07 | 0.045553 |
| GaussianNB | 0.748922 | 74.89 | 0.052582 |
| SVC | 0.815082 | 81.50 | 0.043396 |
| Random forest classifier | 0.843629 | 84.36 | 0.042054 |
The table shows the performance of each machine learning algorithm in the cross-validation experiment over the training data set, together with the standard deviation of each result
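The mean accuracy and standard deviation in such a table come from scoring a model on each held-out fold in turn. A minimal sketch of that loop is given here with a 1-nearest-neighbor classifier written from scratch; the number of folds, the helper names, and the toy data are assumptions, since the source does not report them.

```python
import statistics

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds (no shuffling)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def one_nn_predict(train_X, train_y, x):
    """Label of the single nearest training sample (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

def cross_val_accuracy(X, y, k):
    """Mean and sample standard deviation of per-fold accuracy."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(one_nn_predict(train_X, train_y, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return statistics.mean(scores), statistics.stdev(scores)

# Toy, well-separated data (hypothetical).
X = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
mean_acc, sd_acc = cross_val_accuracy(X, y, k=4)
```

On real data the samples would normally be shuffled or stratified before folding; contiguous folds are used here only to keep the sketch short.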
Classification report (before model optimization): binary classification of the data set with the Random Forest classifier
| Tumor type | Precision (%) | Recall (%) | F1-score (%) | Support |
|---|---|---|---|---|
| Malignant | 85 | 92 | 89 | 299 |
| Benign | 85 | 73 | 79 | 181 |
| Accuracy | | | 85 | 480 |
| Macro average | 85 | 83 | 84 | 480 |
| Weighted average | 85 | 85 | 85 | 480 |
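The columns of a classification report such as this one can all be derived from a confusion matrix. The sketch below shows one way to do so; the function name `classification_summary` and the toy counts are hypothetical and are not the study's confusion matrix.

```python
def classification_summary(confusion):
    """Summarize a square confusion matrix indexed as confusion[true][predicted]."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    rows = []
    for c in range(n_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        fn = sum(confusion[c]) - tp
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        rows.append((precision, recall, f1, tp + fn))  # last item is the support
    accuracy = sum(confusion[c][c] for c in range(n_classes)) / total
    # Macro average: plain mean over classes; weighted: mean weighted by support.
    macro = [sum(r[i] for r in rows) / n_classes for i in range(3)]
    weighted = [sum(r[i] * r[3] for r in rows) / total for i in range(3)]
    return rows, accuracy, macro, weighted

# Hypothetical counts: rows are true classes, columns are predicted classes.
toy = [[8, 2], [1, 9]]
rows, accuracy, macro, weighted = classification_summary(toy)
```

The sketch does not guard against classes with no predicted or no true samples, which would cause a division by zero.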
Classification report: performance of Random Forest after hyperparameter optimization for the best precision score
| Tumor type | Precision (%) | Recall (%) | F1-score (%) | Support |
|---|---|---|---|---|
| Malignant | 89 | 94 | 92 | 299 |
| Benign | 90 | 81 | 85 | 181 |
| Accuracy | | | 89 | 480 |
| Macro average | 89 | 87 | 88 | 480 |
| Weighted average | 89 | 89 | 89 | 480 |
Optimum hyperparameter values for the maximum precision score and the maximum recall score for the selected features, where n estimators is the number of trees in the random forest, min sample split is the minimum number of samples required to split a node and max depth is the maximum number of levels in a tree
| Hyper parameter | Best condition for Precision | Best condition for Recall |
|---|---|---|
| n estimators | 500 | 300 |
| Min sample split | 2 | 2 |
| Max features | 10 | 10 |
| Max depth | 70 | 30 |
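One common way to arrive at such values is an exhaustive search over a grid of candidate settings, keeping the combination with the best score. The source does not say which search strategy or candidate grid was used, so the sketch below is only an assumption: `grid_search`, the candidate lists, and `toy_score` are all hypothetical, and `toy_score` stands in for a cross-validated precision or recall score.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Score every parameter combination and return the best (params, score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # ties keep the first combination seen
            best_params, best_score = params, score
    return best_params, best_score

# Illustrative candidate values; the searched ranges are not reported in the source.
param_grid = {
    "n_estimators": [300, 500],
    "min_samples_split": [2, 5],
    "max_features": [5, 10],
    "max_depth": [30, 70],
}

def toy_score(params):
    # Placeholder objective; a real run would train and cross-validate a model here.
    return -abs(params["n_estimators"] - 500) - abs(params["max_depth"] - 70)

best_params, best_score = grid_search(param_grid, toy_score)
```

Running the search once with precision and once with recall as the objective would yield two parameter sets, which matches the two columns of the table.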
Classification report: performance of Random Forest after hyperparameter optimization for the best recall score
| Tumor type | Precision (%) | Recall (%) | F1-score (%) | Support |
|---|---|---|---|---|
| Malignant | 91 | 93 | 92 | 299 |
| Benign | 88 | 85 | 86 | 181 |
| Accuracy | | | 90 | 480 |
| Macro average | 87 | 85 | 86 | 480 |
| Weighted average | 87 | 87 | 87 | 480 |
Fig. 5 Precision–recall curve. The curve visualizes the sensitivity–specificity trade-off of the classifier; the information it provides was used to set the decision threshold of the model to maximize sensitivity and specificity
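A precision–recall curve is obtained by sweeping the decision threshold over the classifier's predicted scores and recomputing precision and recall at each step. The following is a minimal sketch under that reading; the function name and the toy labels and scores are hypothetical.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall), one tuple per distinct score."""
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(p and t == 1 for p, t in zip(predicted, y_true))
        fp = sum(p and t == 0 for p, t in zip(predicted, y_true))
        fn = sum((not p) and t == 1 for p, t in zip(predicted, y_true))
        points.append((threshold, tp / (tp + fp), tp / (tp + fn)))
    return points

# Toy labels (1 = positive class) and scores, for illustration only.
y_true = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
points = precision_recall_points(y_true, scores)
```

Lowering the threshold can only keep or raise recall, while precision may move in either direction, which is why a threshold is usually picked by inspecting the whole curve. The sketch assumes at least one positive sample is present.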
Fig. 2 Final confusion matrix. The confusion matrix expresses the performance of the optimized benign versus malignant brain tumor classification model over the test set
Fig. 6 Receiver operating characteristic curve (ROC curve). The curve illustrates the behaviour of the false positive rate (x-axis) and the true positive rate (y-axis) for a series of decision threshold values between 1.00 and 0.00. Smaller values on the x-axis represent a lower false positive rate and a higher true negative rate. Larger values on the y-axis represent a lower false negative rate and a higher true positive rate
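The construction described in this caption can be sketched as a threshold sweep that records the false positive rate and true positive rate at each step, with the area under the curve estimated by the trapezoidal rule. The function names and toy data below are hypothetical and are not taken from the study.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs, sweeping the threshold from high to low."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and t == 1 for s, t in zip(scores, y_true))
        fp = sum(s >= threshold and t == 0 for s, t in zip(scores, y_true))
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy labels (1 = positive class) and scores, for illustration only.
y_true = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
```

The sketch assumes both classes are present in `y_true`; otherwise one of the rates is undefined.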
Tumor types and percentages belonging to each benign and malignant categories
| Category | WHO grading | Tumor type | Image slices | Percentage (%) |
|---|---|---|---|---|
| Benign | WHO I | Meningioma | 262 | 43.38 |
| | | Schwannoma | 135 | 22.35 |
| | | Pilocytic astrocytoma | 13 | 2.15 |
| | | Hemangioblastoma | 16 | 2.65 |
| | | Craniopharyngioma | 11 | 1.82 |
| | | Dermoid cyst | 13 | 2.15 |
| | WHO II | Low grade gliomas | 112 | 18.54 |
| | | Meningioma | 21 | 3.48 |
| | | Astrocytoma | 10 | 1.67 |
| | | Ependymoma | 7 | 1.16 |
| | | Frontal cavernoma | 4 | 0.66 |
| Malignant | WHO III | High grade gliomas | 147 | 14.77 |
| | | Anaplastic astrocytomas | 22 | 2.21 |
| | | Anaplastic meningioma | 11 | 1.10 |
| | | Anaplastic oligodendroglioma | 29 | 2.91 |
| | | Central astrocytomas | 65 | 6.53 |
| | WHO IV | Glioblastomas | 442 | 44.42 |
| | | Medulloblastoma | 109 | 10.95 |
| | | Metastasis | 170 | 17.08 |
According to the radiological and histopathological reports, the population contained 995 malignant brain tumor image slices. These comprised WHO (World Health Organization) Grade IV tumors (442 glioblastomas, 109 medulloblastomas, 170 metastases/residual malignancies) and WHO Grade III tumors (147 high grade gliomas, 22 anaplastic astrocytomas, 11 anaplastic meningiomas, 29 anaplastic oligodendrogliomas, 65 central astrocytomas). There were also 604 benign brain tumor image slices, comprising WHO Grade I tumors (13 pilocytic astrocytomas, 262 meningiomas, 135 schwannomas, 16 hemangioblastomas, 11 craniopharyngiomas, 13 dermoid cysts) and WHO Grade II tumors (10 astrocytomas, 21 meningiomas, 112 low grade gliomas, 7 ependymomas, 4 frontal cavernomas)
Fig. 3 MRI ADC brain image of a 14-year-old female patient diagnosed with pilocytic astrocytoma, which was radiologically and histopathologically identified as a benign tumor. The tumor area is enclosed by the ROI. The texture features were extracted from the selected area
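Texture features of the kind listed in the ANOVA table are typically computed from a gray-level co-occurrence matrix (GLCM) built over the ROI pixels. The sketch below shows one possible computation for a single pixel offset; the offset, the number of gray levels, the exact feature definitions (energy is taken here as the angular second moment), and the toy ROI are all assumptions, since the source does not specify them.

```python
import math

def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for one pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """A few common texture descriptors of a normalized GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),  # angular second moment
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }

# Toy 2x2 region with two gray levels (hypothetical, not an ADC image).
roi = [[0, 0], [1, 1]]
features = glcm_features(glcm(roi, levels=2))
```

This version treats the ROI as a rectangular array; an irregular ROI would additionally need a mask so that only pixel pairs inside the outlined region are counted.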