Fuzzy-based Medical X-Ray Image Classification.

Fatemeh Ghofrani, Mohammad Sadegh Helfroush, Mahmoud Rashidpour, Kamran Kazemi.

Abstract

In this paper, a novel fuzzy scheme for medical X-ray image classification is presented. In this method, each image is partitioned into 25 overlapping subimages, and shape-texture features are extracted from the shape and directional information of each subimage. In the classification step, a fuzzy membership is assigned to each subimage based on the Euclidean distance between the feature vector of the subimage and the average of the feature vectors of the training subimages. Finally, a hard classification of the test image is obtained by performing a max operation on the summation of the fuzzy memberships. The proposed method is evaluated on 2655 radiographic images from the IRMA dataset, with 300 training samples and 2355 test samples. The classification accuracy rates obtained by the fuzzy classifier are higher than those obtained by a multilayer perceptron or even an SVM classifier.

Keywords: Fuzzy classifier; medical X-ray images; multilayer perceptron; shape-texture features; support vector machines

Year: 2012 PMID: 23626942 PMCID: PMC3632044

Source DB: PubMed Journal: J Med Signals Sens ISSN: 2228-7477

INTRODUCTION

With the increasing use of medical X-ray images for diagnosis, efficient medical image retrieval has become a necessity. Classification is one of the most important stages of image retrieval, so an accurate classifier plays a fundamental role in the development of image retrieval systems. Several medical X-ray image classification schemes have been presented in the literature.[1-5] In 2003, Keysers et al.[6] developed a content-based medical image retrieval scheme in which images are classified based on image modality, body orientation, anatomic region, and biological system. The classifier was evaluated on 1617 training images and 332 test images from six classes of the IRMA dataset, and an error rate of 8% was obtained. In 2007, Mueen et al.[7] proposed an image classification method in which multilevel features are extracted from 9000 training images and 1000 test images, and Support Vector Machines (SVM) are used in the classification stage. Note that a support vector machine learns the decision surface from two distinct classes of input points.[8] The accuracy rate of this method for 57 classes was 89%. In 2009, Iakovidis et al.[5] presented a content-based medical image retrieval scheme that uses similarity measures defined over higher-level patterns associated with clusters of low-level image feature spaces. The scheme was evaluated on 9000 training images and 1000 test images from 116 classes of the IRMA dataset. Its accuracy rate on a subset of the available data, generated according to the guidelines provided in [1], was 78%. In this method, the training images were registered in the database, so additional computation for registration was necessary, which makes the method less attractive. It also relied on a large number of training images.
Labeling a large number of samples is expensive and time consuming, and most of the proposed methods for X-ray image classification rely on a large number of training samples. Classifying medical X-ray images with a small number of training samples therefore remains a challenge, and it is the basic motivation of this work. Fuzzy set theory has recently been used in many applications[9], particularly in pattern clustering and classification tasks.[10-12] Recent work on fuzzy-based medical image classification has focused on extracting fuzzy features. For example, Iakovidis et al.[13] developed an approach for thyroid ultrasound pattern representation in which feature extraction is based on the fusion of a fuzzy distribution of local binary patterns and ultrasound echogenicity represented by the fuzzy gray-level histogram. A method for content-based retrieval of radiology images has also been presented.[14] In that work, the description of images relies on a fuzzy rule-based Compact Composite Descriptor (CCD), which includes global image features capturing both brightness and texture characteristics in a 1D histogram. These approaches do not take advantage of fuzzy set theory in the classification stage. In this paper, in order to obtain good performance, we propose a classifier based on fuzzy set theory. The aim is to introduce a new, efficient scheme for medical X-ray image classification. A typical characteristic of medical X-ray images is their large variation within a class and strong visual similarity across different classes.[1] Hence, determining the category of an image is, in general, a difficult task. Fuzzy set theory is well suited to this case because it allows the degree of membership of each image in the different categories to be determined easily.
Considering these points, a fuzzy-based scheme for medical X-ray image categorization is proposed in this paper. The results show that the fuzzy classifier performs better than well-known classifiers such as the multilayer perceptron (MLP) and even support vector machines (SVM). The paper is organized as follows. Section II presents the feature extraction procedure. Section III describes our fuzzy classification scheme. Experimental results are shown in Section IV. Finally, conclusions are summarized in Section V.

FEATURE EXTRACTION

Texture, shape, color, and spatial location features are often used in pattern classification. Because X-ray images are gray level and their texture characteristics are very similar, color and texture features by themselves may not be suitable for medical X-ray image classification. In this paper, a combination of shape and texture features is used. The feature extraction algorithm is shown in Figure 1, and the procedure is described as follows:

Figure 1

Feature extraction algorithm diagram

Histogram Adjustment

In this stage, the contrast of each input image is improved by mapping its intensity values to new values such that 1% of the data is saturated at the low and high intensities of the original image.

Image Denoising

When the brightness gradient generated by noise is greater than that of the edges, and the noise level varies significantly across the image, a global noise estimate does not provide an accurate local estimate, and the local gradient value alone carries too little information to distinguish noise-related from edge-related gradients.[15] Hence, to remove noise from the images, the anisotropic diffusion filter proposed by Perona et al.[16] is applied. They put forward an anisotropic diffusion (AD) equation to smooth a noisy image, given by

$$\frac{\partial u(x, y, t)}{\partial t} = \operatorname{div}\big(g(|\nabla u|)\,\nabla u\big), \qquad u(x, y, 0) = u_0(x, y)$$

where $u(x, y, t): \Omega \times [0, +\infty) \to \mathbb{R}$ is a scale image obtained by convolving the original image $u_0(x, y)$ with a Gaussian kernel $G(x, y; t)$ of variance $t$, and $g(|\nabla u|)$ is a decreasing function of the gradient.
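A minimal sketch of an explicit discretization of anisotropic diffusion on a 4-neighbour grid is given below. The exponential form of g, the edge threshold `kappa`, the time step `dt`, and the iteration count are illustrative assumptions; the paper does not state the settings it used.

```python
import math

def perona_malik(image, iterations=10, kappa=30.0, dt=0.2):
    """Explicit 4-neighbour anisotropic diffusion on a 2D list of numbers.

    kappa, dt and the exponential g are assumptions made for this sketch.
    dt <= 0.25 keeps the explicit scheme stable with four neighbours.
    """
    def g(d):
        # Decreasing function of the gradient magnitude.
        return math.exp(-(d / kappa) ** 2)

    rows, cols = len(image), len(image[0])
    u = [[float(v) for v in row] for row in image]
    for _ in range(iterations):
        nxt = [row[:] for row in u]
        for i in range(rows):
            for j in range(cols):
                flux = 0.0
                # Differences to the four neighbours; borders are replicated.
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni = min(max(i + di, 0), rows - 1)
                    nj = min(max(j + dj, 0), cols - 1)
                    d = u[ni][nj] - u[i][j]
                    flux += g(abs(d)) * d
                nxt[i][j] = u[i][j] + dt * flux
        u = nxt
    return u
```

Small intensity differences (noise) are smoothed because g is close to 1 there, while large differences (edges) are left almost untouched because g is close to 0.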

Detecting Edge Using Canny Edge Detection

The Canny method[17] finds edges by looking for local maxima of the gradient of each image. The gradient is obtained using the derivative of a Gaussian filter. The method uses two thresholds to detect strong and weak edges, and includes the weak edges in the output only if they are connected to strong edges. This method is therefore less likely than other edge detectors to be fooled by noise, and more likely to detect true weak edges.
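The two-threshold rule can be illustrated with the sketch below. It covers only the hysteresis step, not the Gaussian-derivative gradient or the local-maximum search, and it assumes 8-connectivity. The function name `hysteresis` and the threshold arguments are hypothetical.

```python
def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) that are
    8-connected to a strong pixel; returns a 2D list of booleans."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    # Seed the search with every strong pixel.
    stack = [(i, j) for i in range(rows) for j in range(cols)
             if magnitude[i][j] >= high]
    for i, j in stack:
        edges[i][j] = True
    # Grow edges into neighbouring weak pixels.
    while stack:
        i, j = stack.pop()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < rows and 0 <= nj < cols
                        and not edges[ni][nj]
                        and magnitude[ni][nj] >= low):
                    edges[ni][nj] = True
                    stack.append((ni, nj))
    return edges
```

An isolated weak response is discarded as probable noise, while a weak response that touches a strong edge is kept as part of that edge.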

Calculating Phase Congruency

As shown in Figure 2, to remove unnecessary edges and boundaries and to increase the accuracy of the edge features, the phase congruency of both the edge (shape) image from the previous step and the original image is computed.[18] The two resulting images are then multiplied. The measure of phase congruency proposed by Morrone et al.[19] is

Figure 2

Phase congruency computation results; (a) Computing phase congruency of edge images; (b) Computing phase congruency of original images; (c) Multiplication results of edge images[20]

$$PC(x) = \frac{|E(x)|}{\sum_{n} A_n(x)}$$

where $|E(x)|$ (the local energy) is the magnitude of the vector from the origin to the end point, and $A_n(x)$ is the amplitude of the n-th local, complex-valued Fourier component at location $x$ in the signal.

Partitioning Image into 25 Subimages

In this stage, each image is partitioned into 25 subimages. To preserve the information at subimage boundaries, the image is partitioned into overlapping subimages. With the overlapping scheme, each image can be partitioned into $(2n+1)^2$ subimages for n = 1, 2, 3, …. In this paper, partitioning into 25 subimages (n = 2) is proposed, because 9 subimages (n = 1) cannot capture details clearly, while 49 subimages (n = 3) increase the number of extracted features.
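One way to obtain $(2n+1)^2$ overlapping subimages is sketched below. The paper does not specify the window size or the amount of overlap, so this sketch assumes that each window spans 1/(n+1) of each image dimension and is shifted by half a window (50% overlap). The function name `overlapping_windows` is hypothetical.

```python
def overlapping_windows(height, width, n=2):
    """Return (top, left, bottom, right) boxes for (2n+1)^2 subimages.

    Assumption: a window covers 1/(n+1) of each dimension and neighbouring
    windows are shifted by half a window.
    """
    per_axis = 2 * n + 1

    def spans(length):
        win = length / (n + 1)
        step = win / 2
        return [(round(k * step), round(k * step + win))
                for k in range(per_axis)]

    return [(top, left, bottom, right)
            for top, bottom in spans(height)
            for left, right in spans(width)]
```

For a 300×300 image and n = 2 this yields 25 windows of 100×100 pixels; a subimage can then be cut out with `[row[left:right] for row in image[top:bottom]]`.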

Computing Discrete Gabor Transform

After the preprocessing stages, we compute the Gabor transform of each subimage. Gabor filters are a group of wavelets, with each wavelet capturing energy at a specific frequency and a specific direction.[20] The frequencies and directions used in this paper are shown in Figure 3. For a given image I(x, y) of size P×Q, the discrete Gabor transform[20] is obtained using the following expressions:

Figure 3

Frequencies and directions of Gabor filters used in this work

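For the image $I(x, y)$, the discrete Gabor transform is

$$G_{mn}(x, y) = \sum_{s}\sum_{t} I(x - s,\, y - t)\,\varphi_{mn}^{*}(s, t)$$

where s and t are the filter mask size variables, and $\varphi_{mn}^{*}$ is the complex conjugate of $\varphi_{mn}$, which is a class of self-similar functions generated from expansion and rotation of the following mother wavelet:

$$\varphi(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)\right]\exp(j 2\pi\omega x)$$

where ω is the modulation frequency. The self-similar Gabor wavelets are obtained through the generating function

$$\varphi_{mn}(x, y) = a^{-m}\,\varphi(\tilde{x}, \tilde{y})$$

where m and n indicate the scale and direction of the wavelet respectively, with m = 0, 1, …, M − 1, n = 0, 1, …, N − 1, and

$$\tilde{x} = a^{-m}(x\cos\theta + y\sin\theta), \qquad \tilde{y} = a^{-m}(-x\sin\theta + y\cos\theta)$$

where $a > 1$ and $\theta = n\pi/N$. The scale factor $a$, the modulation frequency, and the Gaussian widths $\sigma_x$ and $\sigma_y$ are determined by the lower and upper center frequencies $U_l$ and $U_h$, as defined in [20]. In our implementation, the following parameter values, commonly used in the literature, are adopted: $U_l = 0.05$, $U_h = 0.4$, M = 3, N = 6, and s and t range from 0 to 33, i.e., the filter mask size is 33×33.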

Extracting Shape-Texture Features

Finally, in the last stage of feature extraction, two features are extracted from each filtered subimage: the log-energy entropy and the standard deviation. These features are calculated as in [18], where I(i, j) is the result of applying the Gabor transform to a subimage of size P×Q. A discrete wavelet transform is applied to I(i, j), and A(i, j) are the coefficients of the low-frequency subband in an orthonormal basis.

PROPOSED CLASSIFICATION METHOD

A new scheme for the fuzzy classification of medical X-ray images is proposed in the present paper. The proposed scheme extracts, from the different parts of an input pattern, information relating to each class. Since not all parts of an image are equally important in discriminating between classes, the partwise membership is expected to improve classification performance. An illustration of the proposed classification method is shown in Figure 4.

Figure 4

Proposed classification method

The proposed scheme works in two steps. In the first step, the system takes the feature vector of an input pattern, fuzzifies its different parts using a membership function, and provides the membership of the individual parts in the different classes. The resulting membership matrix has as many rows as there are subimages (parts) and as many columns as there are classes. In the present study, we have used the membership function shown in Figure 5 to fuzzify an input pattern. Thus, the first step of the proposed fuzzy classification scheme extracts, for every class, the hidden information of the different subimages, which may help achieve better classification results. The advantage of this type of membership function is that it has two fuzzification parameters, which can be tuned to produce the best classification results.

Figure 5

Membership function as a function of Di

The second step of the proposed fuzzy classifier is a hard classification that applies a max operation to defuzzify the output. An image is assigned to the class with the highest membership value.

Fuzzification

The membership function generates a partwise degree of similarity of an image to the different classes by fuzzification. Here we use a membership function that models a class according to a Euclidean distance, namely the distance between the feature vector of a subimage and the average of the feature vectors of the training subimages. By varying the values of the fuzzification parameters (F1, F2), the steepness of the membership function can be controlled. This function is defined as shown in Figure 5, where ε is a very small positive value. To calculate D, suppose $x_1$ and $x_2$ are the feature vector of subimage i and the average of the feature vectors of the training subimages in class k, respectively, both of fixed size N. Then the Euclidean distance D between $x_1$ and $x_2$ is defined as

$$D = \sqrt{\sum_{j=1}^{N} \left(x_{1j} - x_{2j}\right)^2}$$

Finally, the membership of each subimage is normalized. The fuzzification parameters are tuned to achieve the best performance: we have selected F1 = 15 and F2 = 850, and $\varepsilon = 10^{-4}$ is used.

Defuzzification

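The last step of the proposed fuzzy classification model is a hard classification that performs a max operation to defuzzify the output. The image is assigned to the class k with the highest membership value. Mathematically, the class number of a test image is determined by maximizing the summation of the subimage memberships over the classes:

$$\hat{k} = \arg\max_{k} \sum_{i=1}^{25} \mu_{ik}$$

where $\mu_{ik}$ denotes the normalized membership of subimage i in class k.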
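The two steps (fuzzification, then a max over summed memberships) can be sketched as follows. The exact membership function is given only in Figure 5, so the `membership` function below is a hypothetical stand-in that merely decreases with distance and uses F1, F2, and ε as parameters. Normalizing each subimage's memberships across classes, and averaging training features per subimage position, are also assumptions of this sketch.

```python
import math

def euclidean(x1, x2):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def membership(d, f1=15.0, f2=850.0, eps=1e-4):
    # HYPOTHETICAL shape, not the paper's formula: decreases with distance d,
    # f2 sets the crossover distance, f1 the steepness, and eps keeps the
    # value strictly positive.
    return 1.0 / (1.0 + (d / f2) ** f1) + eps

def classify(sub_features, class_means):
    """sub_features[i]: feature vector of subimage i of the test image.
    class_means[k][i]: mean training feature vector of subimage i, class k.
    Returns (predicted class index, summed memberships per class)."""
    totals = [0.0] * len(class_means)
    for i, x in enumerate(sub_features):
        # One row of the membership matrix: subimage i versus every class.
        row = [membership(euclidean(x, means[i])) for means in class_means]
        norm = sum(row)
        for k, mu in enumerate(row):
            totals[k] += mu / norm  # normalised across classes (assumption)
    # Defuzzification: max operation on the summed memberships.
    best = max(range(len(totals)), key=totals.__getitem__)
    return best, totals
```

Because every subimage contributes its own membership row, a class can still win when a few subimages are ambiguous, as long as most parts of the image are close to that class's training averages.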

EXPERIMENTAL RESULTS

In this section, the results of implementing the proposed method and a comparison with some other classification methods are presented. The images used include 2655 radiographic images of different sizes (example images are shown in Figure 6). These images come from 20 classes of the IRMA database, which are listed in Table 1. We have taken 300 and 200 images as training samples, i.e., 15 and 10 samples per class, and 2355 test samples. Using a small number of training samples is the most important point of this work.

Figure 6

(a) Some images with different sizes from the images archives. (b) Some images with bad conditions from the images archives

Table 1

X-ray image classes

Here, we present our results as a global accuracy, meaning the performance on the complete dataset, and also at a class-specific level, where we average the classification rates of all classes evaluated separately. We follow this procedure because the database is unbalanced.[2] After some preprocessing, shape-texture features are extracted using Gabor filters. Gabor filters extract features from the mid-frequency or higher bands. Hence, features extracted directly from the filtered X-ray images may not be rich in information content. On the other hand, the spectrum of the extracted edges of an image is more spread out than that of the original image, so features obtained from the filtered edge images are more effective. First, only log-energy entropy features were extracted and used for classification. Then, we extracted another feature, the standard deviation, to improve performance. Tables 2 and 3 show the classification results for 300 training images using the log-energy entropy feature and its combination with the standard deviation feature, respectively. The performance comparison for 200 and 100 training samples using the combination of features is shown in Tables 4 and 5. In each case, Principal Component Analysis (PCA)[21] is used to reduce the dimensionality of the feature space and the computational complexity and to increase the capability of the features. To achieve the best classification results, the optimum length of the feature vector is selected over several iterations.
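The difference between the global and the class-specific (classwise) accuracy can be made concrete with a short sketch. The function name and the label encoding are illustrative choices.

```python
from collections import defaultdict

def global_and_classwise_accuracy(y_true, y_pred):
    """Global accuracy: fraction of all samples classified correctly.
    Classwise accuracy: mean of the per-class accuracies."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    global_acc = sum(correct.values()) / sum(total.values())
    classwise_acc = sum(correct[c] / total[c] for c in total) / len(total)
    return global_acc, classwise_acc
```

With eight samples of one class all classified correctly and two samples of another class of which one is wrong, the global accuracy is 0.9 while the classwise accuracy is 0.75, which shows why both figures are reported for an unbalanced database.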

Table 2

Classification accuracy rates obtained by four different classifiers using log-energy entropy features for 300 training images

Table 3

Classification accuracy rates obtained by four different classifiers using combination of log-energy entropy and standard deviation features for 300 training images

Table 4

Classification accuracy rates obtained by four different classifiers using combination of log-energy entropy and standard deviation features for 200 training images

Table 5

Classification accuracy rates obtained by four different classifiers using combination of log-energy entropy and standard deviation features for 100 training images

To evaluate the proposed classification method, three other classifiers are used for comparison: polynomial-kernel and Gaussian-kernel support vector machines and an MLP. As the SVM is basically a binary classifier, it can be extended to multi-class problems in different ways, such as one-against-all, one-against-one, and hierarchical schemes. In this work, the one-against-one method[22] is used. Furthermore, the SVM classifier has two parameters for the polynomial and Gaussian kernels: C and γ. For a given problem, it is not known beforehand which C and γ are best, so these parameters must be selected carefully. A common strategy is to separate the data set into two parts, one of which is considered unknown. The prediction accuracy obtained on the unknown set more precisely reflects the performance on classifying an independent data set. An improved version of this procedure is known as cross-validation. In v-fold cross-validation, the training set is first divided into v subsets of equal size. Sequentially, each subset is tested using the classifier trained on the remaining v − 1 subsets. Thus, each instance of the whole training set is predicted once, and the cross-validation accuracy is the percentage of data that are correctly classified. We applied a grid search on C and γ using cross-validation.
Various pairs of (C, γ) values are tried, and the one with the best cross-validation accuracy is picked. Trying exponentially growing sequences of C and γ is a practical method for identifying good parameters (for example, $C = 2^{-5}, 2^{-3}, \ldots, 2^{15}$ and $\gamma = 2^{-15}, 2^{-13}, \ldots, 2^{3}$).[23] Hence, in this paper, we use a grid search with five-fold cross-validation to find the best (C, γ) pairs for the SVM classifier. These values are listed in Table 6.
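The grid search with v-fold cross-validation can be sketched as follows, assuming the exponent ranges $2^{-5}, \ldots, 2^{15}$ for C and $2^{-15}, \ldots, 2^{3}$ for γ in steps of two. The `fit_predict` callback is a hypothetical, caller-supplied wrapper around an SVM implementation, and the interleaved fold assignment is an illustrative choice.

```python
def exponential_grid(start, stop, step=2):
    """Powers of two: 2^start, 2^(start+step), ..., 2^stop."""
    return [2.0 ** e for e in range(start, stop + 1, step)]

def v_fold_indices(n_samples, v):
    """Split indices 0..n_samples-1 into v folds of (nearly) equal size."""
    return [list(range(k, n_samples, v)) for k in range(v)]

def grid_search(samples, labels, fit_predict, v=5):
    """Return (best cross-validation accuracy, C, gamma).

    fit_predict(train_x, train_y, test_x, C, gamma) -> predicted labels
    is a hypothetical wrapper supplied by the caller.
    """
    folds = v_fold_indices(len(samples), v)
    best = (-1.0, None, None)
    for C in exponential_grid(-5, 15):
        for gamma in exponential_grid(-15, 3):
            hits = 0
            for fold in folds:
                held_out = set(fold)
                train = [i for i in range(len(samples)) if i not in held_out]
                preds = fit_predict([samples[i] for i in train],
                                    [labels[i] for i in train],
                                    [samples[i] for i in fold], C, gamma)
                hits += sum(p == labels[i] for p, i in zip(preds, fold))
            # Every sample is predicted exactly once across the v folds.
            accuracy = hits / len(samples)
            if accuracy > best[0]:
                best = (accuracy, C, gamma)
    return best
```

Each (C, γ) pair is scored on data the classifier did not see during training, so the selected pair is less likely to reflect overfitting to the training set.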

Table 6

The best selected parameters for SVM classifier using the grid search

Table 2 shows that, for 300 training images and using the log-energy entropy features, the MLP provides a classwise accuracy of 56.1% and a global accuracy of 56.22%. These values are 75.42% and 70.1% with the polynomial kernel SVM and 75.6% and 70.23% with the Gaussian kernel SVM. The proposed fuzzy classifier improves both the classwise and global accuracy, to 78.61% and 80.38%, respectively. Compared with the MLP, this is an increase of nearly 22% in classwise accuracy and 24% in global accuracy. Similarly, increases of nearly 3% and 10% in classwise and global accuracy, respectively, are obtained compared with the SVM classifier. An improvement with the proposed fuzzy classifier can also be observed with the combination of log-energy entropy and standard deviation features for 300 training images: the classwise accuracies of the proposed classifier, the polynomial kernel SVM, the Gaussian kernel SVM, and the MLP are 82.32%, 82.15%, 82.08%, and 52.24%, respectively. There is also an increase in global accuracy of 3% with the proposed method compared with the SVM and of 27% compared with the MLP. From these values, it is clear that the proposed classifier provides better global and classwise accuracy than the SVM and MLP [Table 3]. The superiority of the proposed fuzzy classifier is also confirmed with 200 and 100 training images, as shown in Tables 4 and 5, respectively, where the same four classifiers are compared. Table 4 shows that the classwise accuracy of the proposed classifier using the combination of features and 200 training images is 76.6%, which is higher than the 76.01%, 75.9%, and 37.7% obtained using the Gaussian kernel SVM, polynomial kernel SVM, and MLP, respectively. The proposed classifier also improves the global accuracy.
The corresponding global accuracy values are 77.02%, 75.49%, 75.28%, and 31.54%, respectively. In addition, Tables 4 and 5 reveal that the global accuracy obtained with the proposed classifier for 100 training images is nearly the same as that of the SVM with 200 training images. This is particularly important when training images are scarce. From the classification results of the four classifiers, it is observed that in all cases the proposed fuzzy classifier performed better than the MLP and even the SVM with optimum parameters. The summarized results for 300 training images using the combination of features are shown in Table 7.

Table 7

Summarized results obtained by four different classifiers for 300 training images

CONCLUSION

A novel fuzzy scheme for medical X-ray image classification has been presented. To improve the performance of medical X-ray image classification, an effective fuzzy classifier was introduced. The experimental results demonstrated the efficiency of the proposed scheme. Future work will involve evaluating the performance of fuzzy SVM and fuzzy neural network classifiers for medical X-ray image classification.

BIOGRAPHIES

Fatemeh Ghofrani received the B.S. degree in Electrical Engineering in 2009 from Shiraz University, Iran, and the M.S. degree in Communication Engineering from Shiraz University of Technology, Iran, in 2011 with First Class Honours. Her research interests include image processing, pattern recognition, and machine learning. E-mail: f.ghofrani@sutech.ac.ir

Mohammad Sadegh Helfroush received the B.S. and M.S. degrees in Electrical Engineering from Shiraz University, Shiraz, and Sharif University of Technology, Tehran, in 1993 and 1995, respectively. He received his Ph.D. degree in Electrical Engineering from Tarbiat Modarres University, Tehran, Iran. He is currently an associate professor in the Department of Electrical and Electronics Engineering, Shiraz University of Technology, Shiraz, Iran. E-mail: ms_helfroush@sutech.ac.ir

Mahmoud Rashidpour received the B.S. degree in Electrical Engineering and the M.S. degree in Communication Engineering from Shiraz University, Iran. He received his Ph.D. degree in Communication Engineering from the University of Tehran, Iran, in 2004. He is currently an assistant professor in the Department of Electrical and Electronics Engineering, Shiraz University of Technology, Shiraz, Iran. E-mail: rashidpour@sutech.ac.ir

Kamran Kazemi received the B.S. and M.S. degrees in Electrical Engineering from Ferdowsi University of Mashhad and K. N. Toosi University of Technology, Tehran, in 2000 and 2002, respectively. He received his Ph.D. degree in Electrical Engineering and Biomedical Engineering through a cooperation between K. N. Toosi University of Technology, Tehran, Iran, and the University of Picardie Jules Verne (UPJV), Amiens, France. He is currently an assistant professor in the Department of Electrical and Electronics Engineering, Shiraz University of Technology, Shiraz, Iran, and an invited researcher in the GRAMFC (Groupe de Recherche sur l’Analyse Multimodale de la Fonction Cérébrale) lab at the University of Picardie Jules Verne (UPJV), Amiens, France. E-mail: kazemi@sutech.ac.ir
