
Optimization of Brain Tumor MR Image Classification Accuracy Using Optimal Threshold, PCA and Training ANFIS with Different Repetitions.

M J Tahmasebi Birgani¹, N Chegeni², F Farhadi Birgani², D Fatehi³, Gh Akbarizadeh⁴, A Shams⁵.

Abstract

BACKGROUND: Brain tumors are one of the leading causes of death. Accurate tumor classification supports appropriate decision making and the most efficient treatment for patients. This study aims to optimize the classification accuracy of brain tumor MR images using an optimal threshold, PCA and training of an Adaptive Neuro Fuzzy Inference System (ANFIS) with different numbers of repetitions.
MATERIAL AND METHODS: The procedure used in this study consists of five steps: (1) collection of T1- and T2-weighted images, (2) tumor separation with different threshold levels, (3) feature extraction, (4) feature reduction by principal component analysis (PCA), either applied or omitted, and (5) ANFIS classification with 0, 20 and 200 training repetitions.
RESULTS: ANFIS accuracy was 40%, 80% and 97% with all features and 97%, 98.5% and 100% with the 6 features selected by PCA, for 0, 20 and 200 training repetitions, respectively.
CONCLUSION: The findings of the present study demonstrate that accuracy can be raised to 100% by using an optimized threshold method, PCA and more training repetitions.


Keywords: ANFIS; Brain Tumor Detection; PCA; Training Repetition; MRI

Year: 2019 PMID: 31214524 PMCID: PMC6538907

Source DB: PubMed Journal: J Biomed Phys Eng ISSN: 2251-7200

Introduction

Computer-aided Diagnosis (CAD) has shown significant capability to improve the accuracy and reliability of tumor diagnosis [1,2]. It provides detailed information about brain abnormalities and helps physicians plan the best treatment [3,4]. Through machine learning, CAD can classify normal and abnormal brains from MR images fully automatically. Classification is used to find patterns in large datasets and to assign the data to class labels according to the trend of the input data [5]. Numerous techniques have been reported for the classification of brain tumors in MR images, such as Artificial Neural Network (ANN), Fuzzy, Support Vector Machine (SVM), knowledge-based techniques, k-nearest neighbors (kNN), the Expectation Maximization (EM) algorithm and clustering [6-14]. Another common classifier is the Adaptive Neuro Fuzzy Inference System (ANFIS), which combines ANN and fuzzy logic in a single framework, overcomes their individual weaknesses and offers additional advantages [15,16]. The ANFIS classifier can also remove inaccurate information present in the image, which leads to high interpretability and a good degree of accuracy [17,18]. One of the main issues in classification is the large number of variables. Feature reduction (FR) is a process that selects an optimum subset of variables according to a certain criterion [19]. Reasons for performing FR include eliminating irrelevant data, increasing the predictive accuracy of learned models, decreasing storage requirements, computational cost and run time, and improving the understanding of the data and the model [20-23]. Over the last decade, numerous methods and algorithms have been proposed to reduce features in MR brain images, such as principal component analysis (PCA), independent component analysis (ICA), linear discriminant analysis (LDA) and Genetic Algorithm (GA) [24-28].
PCA, a linear method, is the most common technique for FR in MR image classification [29,30]. This study aims to optimize ANFIS classification accuracy by: 1) using an optimized thresholding method to detect tumors in images with different intensities, 2) applying the PCA algorithm and 3) training the ANFIS classifier with different numbers of repetitions. The novelty of this study lies in examining the effect of training repetitions on accuracy, which has not been addressed in recent literature.

Material and Methods

The steps of the proposed method for MR image classification are illustrated in the flowchart in Figure 1 and explained below. The method involves five steps: (1) image collection, (2) image preprocessing, (3) feature extraction, (4) feature reduction, either applied or omitted, and (5) classification.

Figure 1

Steps of the proposed methodology for classification of brain tumor.

Dataset

The dataset employed in this study consists of T1 and T2-weighted, 256×256 pixel MR brain images. The images were downloaded from Harvard Medical School website (http://med.harvard.edu/AANLIB/). The dataset included 140 images in which 100 images were abnormal showing a tumor, and 40 images were normal. The images used in the dataset were obtained in axial plane.

Image Preprocessing

Image preprocessing is the initial step in the brain tumor detection and diagnosis process. The tumor separation steps are illustrated in Figure 2.a-h. Improving the image quality in this step is essential to the quality of the whole system. First, noise is removed from the original image (Figure 2.a) with a Gaussian filter (Figure 2.b). Dilation and erosion are the two fundamental operations in morphological image processing. Dilation takes the maximum value in the window, so the image becomes brighter after dilation; it expands objects and is mostly used to fill gaps. Erosion is the opposite of dilation: it takes the minimum value in the window, so the image becomes darker and objects shrink. The dilation and erosion of a binary image A by a structuring element B are denoted A⊕B and A⊖B, respectively. Because the original images were gray scale and morphological operations are originally defined for binary images, the filtered image was first converted to a binary image by thresholding (Figure 2.c). Then, dilation was applied to the binary image with a disc-shaped mask of 4-pixel radius, and the holes (empty spaces) were filled (Figure 2.d). To separate the brain from the image background, the original image was multiplied by the binary image, which yielded a gray scale image of the brain (Figure 2.f). This gray scale image was converted to a binary image, and erosion was applied with a disc-shaped mask of 4-pixel radius. The image was then labeled to determine the number and location of objects. The area of each object and the mean area of all objects were calculated, and objects with an area below the mean were deleted (Figure 2.g). The final binary image was multiplied by the gray scale image (Figure 2.h).
The result of this step was a gray scale image of the tumor, which was used to calculate morphological and statistical features.
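The tumor separation steps described above can be sketched roughly as follows. This is a minimal illustration using NumPy and SciPy, not the authors' implementation: the function names `disk` and `separate_tumor`, the Gaussian width `sigma` and the brain-mask level `brain_level` are assumptions, since the text does not report them. Only the 4-pixel disc radius and the mean-area rule come from the description.

```python
import numpy as np
from scipy import ndimage


def disk(radius):
    """Disc-shaped structuring element with the given pixel radius."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius


def separate_tumor(image, tumor_level, brain_level=0.1, sigma=1.0, radius=4):
    """Return a gray-scale image that keeps only the candidate tumor region.

    image       : 2-D float array scaled to [0, 1]
    tumor_level : threshold used for the second binarization
    brain_level, sigma : assumed values, not taken from the paper
    """
    selem = disk(radius)

    # Noise removal, primary binary image, dilation and hole filling.
    smoothed = ndimage.gaussian_filter(image, sigma=sigma)
    mask = smoothed > brain_level
    mask = ndimage.binary_dilation(mask, structure=selem)
    mask = ndimage.binary_fill_holes(mask)

    # Multiply the original image by the mask to drop the background.
    brain = image * mask

    # Second binarization followed by erosion with the same disc.
    bright = ndimage.binary_erosion(brain > tumor_level, structure=selem)

    # Label the objects and delete those smaller than the mean area.
    labels, count = ndimage.label(bright)
    if count == 0:
        return np.zeros_like(image)
    areas = np.bincount(labels.ravel())[1:]          # areas of objects 1..count
    keep = np.flatnonzero(areas >= areas.mean()) + 1
    tumor_mask = np.isin(labels, keep)

    # Multiply the final binary image by the gray-scale image.
    return image * tumor_mask
```

Because the final mask has been eroded, it is slightly smaller than the bright region it came from; the sketch leaves out the final hole-filling step listed in the Figure 2 caption.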

Figure 2

Tumor separation steps: (a) Original image, (b) filtering, (c) primary binary image, (d) dilation and filling holes, (e) multiplying the gray in binary image, (f) ultimate binary image, (g) erosion and removing low areas, (h) filling the holes of the ultimate object.

Feature Extraction

The objective of image analysis is to extract useful information for solving application-based problems. Features of an image are the properties that describe it. In this study, morphological and statistical features of all images were extracted and stored in an Excel file. The morphological features were Perimeter, Area, Extent, Major Axis Length, Minor Axis Length, Equivalent Diameter, Convex Hull Area and Compactness. The first-order statistical features were mean, standard deviation and entropy. The mean is the average value of the image intensity and reflects the general brightness of the image: a bright image has a high mean. The standard deviation, the square root of the variance, describes the spread of the data and therefore the contrast: an image with high contrast has a high standard deviation. Entropy represents the uniformity of the histogram and measures the number of bits required to code the image data. Second-order statistical (structural) features were obtained from the gray-level co-occurrence matrix (GLCM), also known as the gray-level spatial dependence matrix, which characterizes texture through the spatial relationship of pixels. The features extracted from the GLCM were contrast (Con), homogeneity (HOM), energy (E) and entropy (EN), each calculated in four directions: 0, 45, 90 and 135 degrees (Eq. 1). Contrast measures the intensity difference between a pixel and its neighbor over the whole image and is 0 for a constant image. Homogeneity measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal. Energy is the sum of the squared elements of the GLCM and is 1 for a constant image. Entropy is a measure of randomness. In Eq. 1, (i,j) denotes the row and column number and Pd(i,j) is the GLCM entry at (i,j).
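As an illustration of the statistical features described above, the following NumPy sketch computes the first-order features and the four GLCM features for one direction. It assumes the commonly used definitions (contrast as the sum of (i-j)^2 P(i,j), homogeneity as the sum of P(i,j)/(1+|i-j|), energy as the sum of P(i,j)^2, entropy as the negative sum of P(i,j) log2 P(i,j)), a pixel distance of 1 and a non-symmetric GLCM; the paper's own Eq. 1 may differ in detail. All function names are hypothetical.

```python
import numpy as np

# (row, column) offsets for the four directions at an assumed distance of 1 pixel.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def first_order_features(img, levels):
    """Mean, standard deviation and histogram entropy (bits) of an integer image."""
    hist = np.bincount(img.ravel(), minlength=levels) / img.size
    nz = hist[hist > 0]
    return img.mean(), img.std(), -np.sum(nz * np.log2(nz))


def glcm(img, offset, levels):
    """Normalized (non-symmetric) co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = img.shape
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    a = img[r0:r1, c0:c1]                          # reference pixels
    b = img[r0 + dr:r1 + dr, c0 + dc:c1 + dc]      # their neighbors
    P = np.zeros((levels, levels))
    np.add.at(P, (a.ravel(), b.ravel()), 1)        # count each gray-level pair
    return P / P.sum()


def glcm_features(P):
    """Contrast, homogeneity, energy and entropy of a normalized GLCM."""
    i, j = np.indices(P.shape)
    nz = P[P > 0]
    return {
        "contrast": np.sum((i - j) ** 2 * P),
        "homogeneity": np.sum(P / (1 + np.abs(i - j))),
        "energy": np.sum(P ** 2),
        "entropy": -np.sum(nz * np.log2(nz)),
    }
```

Calling `glcm_features(glcm(img, OFFSETS[angle], levels))` for each of the four angles gives 16 structural values per image, which matches the count of structural features reported in the feature reduction step.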

Feature Reduction

In this study, PCA was used for feature reduction. The principal components are the projections of the original features onto the eigenvectors that correspond to the largest eigenvalues of the covariance matrix of the original feature set. There were 8 morphological and 19 statistical features in total; 3 of the statistical features were first-order and 16 were structural. The coefficients obtained by PCA formed a 27×27 matrix. The largest coefficients were in the first column, and their magnitude decreased gradually. Finally, the six best features were selected using PCA and used for training and testing the designed ANFIS model. As shown in Figure 3, the coefficients in order from high to low are Area, Entropy at 135 degrees, Homogeneity at 45 degrees, Equivalent Diameter, Entropy at 45 degrees and Convex Hull Area.
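A minimal sketch of this feature reduction step is given below, assuming that the "coefficients" are the eigenvectors of the feature covariance matrix and that features are ranked by the magnitude of their coefficient in the first column. That ranking rule is an interpretation of the description, not a confirmed detail, and the function names are hypothetical. The features are mean-centered but not rescaled.

```python
import numpy as np


def pca_coefficients(X):
    """Eigenvalues (descending) and coefficient matrix of the feature covariance.

    X has one row per image and one column per feature (e.g. 140 x 27).
    Column k of the returned matrix holds the coefficients of component k.
    """
    Xc = X - X.mean(axis=0)                    # center each feature
    cov = np.cov(Xc, rowvar=False)             # 27 x 27 for 27 features
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]


def top_features(coeff, k=6):
    """Indices of the k features with the largest |coefficient| in column 0."""
    return np.argsort(np.abs(coeff[:, 0]))[::-1][:k]
```

Without rescaling, features with large numeric ranges (such as areas in pixels) tend to dominate the first component, so standardizing the columns first would be a reasonable variation to compare against.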

Figure 3

PCA coefficients graph to determine the effective features for the classification of ANFIS.

ANFIS Classification

Initially, input and output data were determined to design the ANFIS model. Then, the system was trained with training data and checked with a test dataset. To protect the classifier from over-fitting, 5-fold cross validation was applied to set the training and test images. Training consisted of specifying the number of membership functions, selecting the type of training, setting the intended error rate (Error Tolerance), determining the number of repetitions and starting the training. If the error fell below the Error Tolerance, the training phase was finished. In the next step, the system was checked with the test dataset. The ANFIS model was designed and analyzed in the following seven steps: 1. Retrieving the Excel file containing all extracted features of normal and abnormal images, 2. Dividing the data into two parts, training and testing, 3. Creating an ANFIS model based on the input data and analyzing the results, 4. Comparing the results with the actual values of the target and estimating the accuracy, 5. Plotting charts of the original values (‘Observed’), error (‘Error’) and the values predicted (‘Predicted’) by ANFIS, 6. Training the ANFIS model with 0, 20 and 200 repetitions and analyzing the results, 7. Plotting charts of the observed, error and predicted values after training with 0, 20 and 200 repetitions. All of these steps were then applied to the Excel file containing the features selected by PCA. Finally, the results of the two runs were compared.
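The 5-fold cross validation mentioned above can be illustrated with a short NumPy sketch. It shows only how the image indices might be divided into training and test sets; the shuffling seed and the function name `k_fold_indices` are assumptions, the split is not stratified by class, and the ANFIS training itself is not reproduced here.

```python
import numpy as np


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test


# With the 140 images of this dataset, each fold tests on 28 images
# and trains on the remaining 112.
for train_idx, test_idx in k_fold_indices(140, k=5):
    pass  # train the classifier on train_idx, evaluate it on test_idx
```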

Results

Thresholding is a common method for image segmentation. With this method, the image is partitioned directly into regions based on intensity values so that the tumor can be detected. Depending on imaging conditions, images have different intensities; thus, tumor segmentation requires a suitable threshold. In this study, three thresholds of 0.4, 0.6 and 0.8 were applied to each image automatically. The optimal threshold is the one that leads to identifying only one object. As Figure 4 shows, high intensity images require higher threshold values (Figure 4.I-d) than low intensity images (Figure 4.II-b).
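The selection rule described above (keep the threshold that leaves exactly one object) can be sketched as follows. This simplified version counts connected components directly after thresholding, whereas the full procedure also applies the morphological cleanup from the preprocessing step. The function names and the choice to try the levels in ascending order are assumptions.

```python
import numpy as np
from scipy import ndimage


def count_objects(image, level):
    """Number of connected bright regions after thresholding at `level`."""
    _, count = ndimage.label(image > level)
    return count


def optimal_threshold(image, levels=(0.4, 0.6, 0.8)):
    """Return the first candidate level that yields exactly one object.

    `image` is a 2-D float array scaled to [0, 1]. Returns None when no
    candidate level isolates a single object.
    """
    for level in levels:
        if count_objects(image, level) == 1:
            return level
    return None
```

For a low intensity image the lowest level may already isolate a single region, while a high intensity image with several bright structures may need the highest level before only one object remains.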

Figure 4

Tumor detection in (I) a high intensity image and (II) a low intensity image: (a) original image, and thresholds of (b) 0.4, (c) 0.6 and (d) 0.8.

The parameters used in the ANFIS model for data training, i.e. the number of linear, nonlinear and total parameters and of membership functions with and without PCA, are listed in Table 1. Classification was performed in two modes: with all features (without PCA) and with the features selected by PCA. Applying the PCA algorithm reduces the number of membership functions from 45 to 3, which reduces the run time from 200 minutes to 1 minute. Figure 5 shows the membership functions used for data training with PCA. The accuracy of the ANFIS classifier is the probability that a diagnostic test is properly performed and is calculated as follows:

Table 1

Parameters used in the ANFIS model for training data.

	groups	Linear parameters	Nonlinear parameters	Total parameters	Membership functions
Without PCA	1000	200	730	1230	45
With PCA	51	21	36	57	3

Figure 5

Charts of membership functions used for data training using PCA.

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2} \]

where TP (True Positive) is the number of correctly classified positive cases, TN (True Negative) is the number of correctly classified negative cases, FP (False Positive) is the number of incorrectly classified negative cases, and FN (False Negative) is the number of incorrectly classified positive cases. The accuracy of the ANFIS model is shown in Table 2; with no training repetitions, PCA raised it from 40% to 97%. The RMSE increased with PCA, which is reasonable because classification was then performed with only 6 features. An important finding of this study was that training repetition plays a substantial role in enhancing classification accuracy. As seen in Table 2, ANFIS accuracy is 40%, 80% and 97% without PCA and 97%, 98.5% and 100% with PCA for 0, 20 and 200 training repetitions, respectively. Accuracy thus improved from 40% to 97% by applying PCA and from 97% to 100% by increasing the number of repetitions. To further examine the effects of training repetition and PCA, charts of the observed (blue), error (green) and predicted (red) values in 200 repetitions with and without PCA were plotted (Figure 6). Compared with the other charts, the error is smaller, which indicates that the accuracy of ANFIS is improved by PCA and by more training repetitions. The error values ranged from -5 to 10 and from -0.5 to 0.3 for training with and without PCA, respectively. Figure 7 shows changes in the features selected by PCA. The values range from 0 to 3300 and from 0 to 1 for the input and output data, respectively.
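For illustration, the accuracy defined in terms of TP, TN, FP and FN, and the RMSE values reported in Table 2, can be computed with the plain-Python sketch below. The 0.5 cut used to turn a continuous ANFIS output into a class label is an assumption, and the function names and example counts are hypothetical rather than taken from the study.

```python
import math


def confusion_counts(observed, predicted, cut=0.5):
    """Return (TP, TN, FP, FN) for 0/1 targets and continuous outputs."""
    tp = tn = fp = fn = 0
    for target, output in zip(observed, predicted):
        label = 1 if output >= cut else 0      # assumed decision rule
        if target == 1 and label == 1:
            tp += 1
        elif target == 0 and label == 0:
            tn += 1
        elif target == 0 and label == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def rmse(observed, predicted):
    """Root mean square error between targets and model outputs."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical counts: 70 of 80 cases correct.
print(accuracy(tp=50, tn=20, fp=5, fn=5))  # 0.875
```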

Table 2

Accuracy of the results obtained from the ANFIS model.

	All features			features obtained from PCA
repetition	0	20	200	0	20	200
Accuracy	40%	80%	97%	97%	98.5%	100%
RMSE of training	0.0012	0.0176	0.016	0.2147	0.2083	0.1945
RMSE of test	0.1165	0.1121	0.1020	0.1532	0.1774	0.1502

Figure 6

Charts of the observed (blue), error (green) and predicted (red) in 200 repetitions with and without PCA.

Figure 7

Changes in the features obtained from PCA.

Figure 8 shows how the error changes with increasing repetitions after using PCA. With PCA, the training error decreased steadily, while the test error first increased and then decreased. Accuracy improved with more repetitions, and the best accuracy was obtained at 200 repetitions. For further evaluation, 500 repetitions were investigated; however, the change in error relative to 200 repetitions was negligible. As seen in Figure 8, without PCA the error of both training and test data did not follow a regular trend.

Figure 8

The trend of changes and fluctuations of error in different repetitions (epochs) for training and test (checking) data: (a) with PCA and (b) without PCA.

Discussion

Classification accuracy is one of the basic challenges in classifying brain tumors at early stages. Bhardwaj confirmed that an ANFIS classifier with accuracy greater than 90% has the potential to detect tumors [18]. Table 3 compares the accuracy of ANFIS with that of other common classifiers. ANFIS combines neural and fuzzy classifiers to achieve a more accurate classification [31]. For instance, the accuracy of ANN, Fuzzy and ANFIS classifiers is 90% [11], 98.35% [20] and 99.4% [32], respectively. The classification accuracy of KNN and SVM classifiers has been reported as 98.6% [33] and 96% [34], respectively, both lower than the accuracy of ANFIS obtained in our study. Lakshymy proposed an algorithm to segment a tumor from a given brain MR image using an ANFIS classifier and showed that it could detect tumors with an accuracy of about 99.4% [32]. Our study revealed that the classification accuracy of ANFIS could be increased up to 100% using 1) an optimum threshold method to obtain the morphological and structural features, 2) GLCM to obtain structural characteristics, 3) PCA to reduce the features, and 4) more training repetitions.

Table 3

Comparing the accuracy obtained in this study and that reported in recent works.

Method	Accuracy
KNN [33]	98.6%
ANN [11]	90%
ANFIS [32]	99.4%
SVM [34]	96%
Fuzzy [20]	98.35%
Proposed (ANFIS)	100%

The efficiency of PCA as a feature reduction method for increasing accuracy has been confirmed by others [3,24]. This study also confirms the capability of PCA to increase classification accuracy. As shown in Table 4, Rathi found that classification accuracy with feature selection using PCA was higher than without PCA, and also reported that accuracy with PCA is better than with Linear Discriminant Analysis (LDA) [20]. Abdullah found that using PCA reduced the number of feature vectors and improved the accuracy [35]. We conclude that applying the PCA algorithm reduces the number of membership functions from 45 to 3, which decreases the run time from 200 to 1 min, and increases the accuracy from 97% to 100% in 200 training repetitions.

Table 4

Comparing accuracy with and without PCA of recent works with proposed.

Classifier	With PCA	Without PCA
KNN [20]	98.48%	95.47%
SVM [35]	85%	65%
ANFIS (Proposed)	100%	97%


Conclusion

Our results show that ANFIS has a high capacity to increase the classification accuracy of brain tumors, up to 100%. In general, selecting a suitable feature reduction method and an optimized ANFIS classifier is very effective for classification accuracy. Further investigation is required to improve classification accuracy by optimizing feature reduction methods and classifier algorithms.

