Local Binary Patterns Descriptor Based on Sparse Curvelet Coefficients for False-Positive Reduction in Mammograms.

Meenakshi M Pawar, Sanjay N Talbar, Akshay Dudhane.

Abstract

Breast cancer is the most prevalent cancer among women across the globe. Automatic detection of breast cancer using a computer-aided diagnosis (CAD) system suffers from false positives (FPs). Thus, reduction of FPs is one of the challenging tasks in improving the performance of diagnosis systems. In the present work, a new FP reduction technique is proposed for breast cancer diagnosis. It is based on an appropriate integration of preprocessing, self-organizing map (SOM) clustering, region of interest (ROI) extraction, and FP reduction. In preprocessing, contrast enhancement of mammograms is achieved using a local entropy maximization algorithm. The unsupervised SOM clusters an image into a number of segments to identify the cancerous region and extracts tumor regions (i.e., ROIs). However, it also detects some FPs, which affects the efficiency of the algorithm. Therefore, to reduce the FPs, the output of the SOM is passed to the FP reduction step, which classifies the extracted ROIs into normal and abnormal classes. FP reduction consists of feature extraction from the ROIs using the proposed local sparse curvelet coefficients, followed by classification using an artificial neural network (ANN). The performance of the proposed algorithm has been validated using the local TMCH (Tata Memorial Cancer Hospital) dataset and the publicly available MIAS (Suckling et al., 1994) and DDSM (Heath et al., 2000) databases. The proposed technique reduces FPs from 0.85 to 0.02 FP/image for MIAS, 4.81 to 0.16 FP/image for DDSM, and 2.32 to 0.05 FP/image for TMCH, reflecting a large improvement in the classification of mammograms.

Year:  2018        PMID: 30356422      PMCID: PMC6178513          DOI: 10.1155/2018/5940436

Source DB:  PubMed          Journal:  J Healthc Eng        ISSN: 2040-2295            Impact factor:   2.682


1. Introduction

Breast cancer is the most common cancer among women worldwide. It is the leading cause of death among women suffering from cancer in India. It is estimated that breast cancer cases in India would reach as high as 1,797,900 by 2020 [1]. The rising incidence can cause high mortality, owing to lack of awareness about breast screening, late reporting, and insufficient medical access [2]. This makes screening for breast cancer at an early stage necessary to ensure longer survival. Among all techniques, namely, mammography, tomosynthesis, ultrasonography, computed tomography, and magnetic resonance, mammography is the most reliable and the modality most accepted by radiologists for preliminary examination of breast cancer because of its cost benefits and accessibility [3-5]. The diagnosis of breast cancer from mammograms varies from expert to expert, as symptoms are misinterpreted or overlooked owing to the tedious task of screening mammograms. Studies reveal that 10% to 30% of the cancers visible on mammograms are overlooked, and only 20% to 30% of biopsies are positive [6-8]. Biopsies are traumatic and costly; therefore, computer-aided detection and diagnosis (CAD) systems combined with expert radiologists' experience would provide a more comprehensive diagnosis [9]. A detailed survey of research on the design of CAD systems is given in the next section.

2. Literature Survey

The design and development of CAD systems is an active area of research covering contrast enhancement for better visualization and clarification [10-12], pectoral muscle removal, segmentation for better delineation of the region of interest (ROI), feature extraction, and classification [13, 14]. Segmentation methods are classified as region-based, contour-based, and clustering methods [15]. Region- and contour-based methods are popular among researchers. Görgel et al. [16] developed the Local Seed Region Growing-Spherical Wavelet Transform (LSRG–SWT) algorithm and reported classification accuracies of 94% and 91.67% on a local dataset and MIAS [17], respectively. Pereira et al. [18] presented segmentation and detection of masses in mammograms using the wavelet transform and a genetic algorithm, obtaining an FP rate of 1.35 FP/image and a sensitivity of 95% on DDSM [19]. Rouhi et al. [20] studied segmentation using region growing, a Cellular Neural Network (CNN), and an ANN. The classification results varied from 80% to 96%, which is the main weakness of their study. Berber et al. [21] proposed the Breast Mass Contour Segmentation (BMCS) approach and reported 6 FPR on a local dataset. A hybrid level set segmentation method [22], based on a combination of region growing and level sets, was used to segment tumors. The results showed that the sensitivity varied from 78% to 100% owing to the presence of artifacts in the MIAS database. The difficulty with region- and contour-based segmentation methods is the appropriate initialization of the seed point and contour position. Several researchers have implemented clustering methods such as K-means and Fuzzy C-means (FCM) for breast abnormality segmentation [3, 23]. However, these methods have limited learning abilities. Learning-based techniques such as the self-organizing map (SOM) [24] have been used successfully in medical image segmentation [25].
The success of SOM in medical image segmentation motivated its choice here for mammogram segmentation. Often, the segmented tumor regions are not abnormal tissue (cancerous regions); such regions are known as false positives (FPs). FPs consume much of the radiologist's time and result in unnecessary biopsies. Thus, reducing FPs is an open research problem, and various researchers have proposed FP reduction algorithms to improve the specificity of CAD systems [5, 9, 23, 26–31]. Usually, the FP reduction algorithm is a postprocessing step of a CAD system with two stages, namely, feature extraction and classification. Various methods have been developed for feature extraction based on wavelets [8, 18, 32], curvelets [33, 34], Gabor [35, 36], morphological descriptors [20], textural analysis [26, 27, 30, 32], histograms [4, 5, 7, 29, 37–40], etc. Segmentation errors can reduce the performance of morphological descriptors. When the Gray Level Co-occurrence Matrix (GLCM) of the normal and abnormal regions in a dense mammogram is the same, the texture descriptors overlap, which leads to a larger number of FPs [37]. Ojala et al. proposed local binary patterns (LBPs) [41] for textural feature extraction, which work well compared with morphological descriptors and GLCM-based textural descriptors. The LBP descriptor captures local microstructures, namely, edges, flat areas, spots, etc. Variants of LBP have been proposed by various researchers to achieve rotation- and intensity-invariant features. Also, LBP is computationally efficient and extracts robust features; therefore, LBP descriptors have been widely applied in FP reduction and classification methods for mammogram images [29, 37, 39, 40]. However, the LBP descriptor does not provide the directional information of local micropatterns. Therefore, a transform technique such as the curvelet transform is combined with LBP to extract features.
Various curvelet-based approaches have been proposed in the literature [8, 33, 34, 42], which conclude that the curvelet transform outperforms the wavelet transform. In this work, a novel method is presented that extracts sparse curvelet subband coefficients by incorporating knowledge of the irregular shape of masses, as they appear in a sparse matrix, and then calculates LBP features. This paper therefore presents the following scheme:
(i) Preprocessing of the mammogram image for contrast enhancement using a local entropy maximization-based image fusion algorithm, and removal of background noise
(ii) Cluster-based segmentation of mammograms using SOM and extraction of tumor regions (i.e., ROIs)
(iii) FP reduction: extraction of sparse curvelet subband coefficients and computation of the LBP descriptor to classify true positives and false positives, improving the performance of the CAD system on the MIAS [17], DDSM [19], and Tata Memorial Cancer Hospital (TMCH) datasets
The paper is organized as follows: Sections 1 and 2 give the introduction and the literature review on automatic segmentation and extraction of abnormal masses (i.e., tumor regions) as well as FP reduction methods. Section 3 presents in detail the proposed methodology for SOM-based segmentation of mammograms followed by the novel false-positive reduction. Section 4 presents the experimental results and discussion on three benchmark datasets. Finally, Section 5 concludes the proposed approach for accurate extraction of abnormal masses (i.e., tumor regions) by excluding the FPs.

3. Methodology

The block schematic of the proposed integrated method for automatic detection of breast cancer using the sparse curvelet coefficient-based LBP descriptor is shown in Figure 1.
Figure 1

Schematic architecture for automatic breast cancer detection.

3.1. Preprocessing

Mammograms are low-dose x-ray images, so they have poor contrast and suffer from noise. Figures 2(a)–2(d) show the preprocessing of a mammogram, and Figures 2(e)–2(g) show SOM clustering and ROI extraction.
Figure 2

Steps for mammogram processing (a) enhanced mammogram, (b) binary mask, (c) pectoral removal, (d) pectoral removed mammogram, (e) clustered image, (f) cluster of interest, and (g) ROI extraction.

3.1.1. Local Entropy Maximization-Based Image Fusion: Contrast Enhancement

Contrast enhancement of the mammogram is performed using local entropy maximization [12] for better segmentation. The original image is first passed to the contrast limited adaptive histogram equalization (CLAHE) algorithm to obtain the second input to the image fusion algorithm. The original image and its CLAHE output are then given to the image fusion algorithm, whose procedure is given in Algorithm 1. Local entropy is used as the fusion rule: ENT denotes the local entropy, and p_org(k) and p_CLAHE(k) are the probabilities of the kth pixel within a 5 × 5 sliding window [12]. The high-frequency components of the original mammogram and the CLAHE mammogram are fused using the maximum entropy criterion. Figure 3(b) presents the contrast-enhanced mammogram obtained using local entropy maximization-based image fusion.
Algorithm 1

Image fusion for contrast enhancement.
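
The fusion rule described above can be sketched in a few lines. This is only an illustrative sketch, not Algorithm 1 itself: it assumes that the local entropy is the Shannon entropy (base 2) of the grey levels in the 5 × 5 window and that fusion keeps, pixel by pixel, the input whose window has the higher entropy. The function names `local_entropy` and `fuse_max_entropy` are hypothetical, and images are plain lists of lists so that the sketch needs only the standard library.

```python
import math
from collections import Counter


def local_entropy(img, r, c, half=2):
    """Shannon entropy (bits) of the grey levels in the window centred at (r, c).

    half=2 gives the 5 x 5 window; the window is clipped at the image border.
    """
    rows, cols = len(img), len(img[0])
    window = [img[i][j]
              for i in range(max(0, r - half), min(rows, r + half + 1))
              for j in range(max(0, c - half), min(cols, c + half + 1))]
    n = len(window)
    return -sum((k / n) * math.log2(k / n) for k in Counter(window).values())


def fuse_max_entropy(original, clahe):
    """Assumed fusion rule: keep the pixel whose neighbourhood has higher entropy."""
    fused = []
    for r in range(len(original)):
        row = []
        for c in range(len(original[0])):
            e_org = local_entropy(original, r, c)
            e_clahe = local_entropy(clahe, r, c)
            row.append(clahe[r][c] if e_clahe > e_org else original[r][c])
        fused.append(row)
    return fused
```

With a flat original and a textured CLAHE image, every window of the CLAHE image has the higher entropy, so the fused result equals the CLAHE image.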

Figure 3

Preprocessing. (a) Original image from MIAS database. (b) Contrast-enhanced mammogram using local entropy maximization. (c) Process of pectoral muscle removal. (d) Pectoral muscle removed mammogram.

3.1.2. Pectoral Muscle Removal

Pectoral muscle suppression is performed by defining a rectangle as suggested in [14] (Figure 3(c)). The figure illustrates the rectangle (ABDC); points showing an intensity variation (such as G) are fixed and joined for pectoral muscle suppression. Figure 3(d) shows the image with the pectoral muscle removed, which avoids discrepancies in the algorithm caused by the similar intensities of the pectoral muscle and masses.

3.2. SOM Clustering

SOM is a special type of neural network designed to map an input image of size N × N to M clusters based on its characteristic features [25]. For SOM, the image (I) is converted into a feature vector f = {f1, f2, …, fm}, where m is the number of features. In this experiment, we trained the SOM with M = 4 clusters using p = 9 neighbourhood features: given a centre pixel (gc) in the image, the neighbourhood features are computed as

$$F(g_c) = \{g_1, g_2, \ldots, g_n\},$$

where n is the number of neighbours (3 × 3 window), g_i are the neighbourhood pixels, and F is the feature vector corresponding to the centre pixel gc. The 3 × 3 window is chosen following [43] to capture local details. At the start, the weight vector Wi = {wi1, wi2, …, wim} is random; it is updated as the network learns. The node with the minimum Euclidean distance ‖f − Wi‖ is the best matching component, or winner node c:

$$\|f - W_c\| = \min_i \|f - W_i\|.$$

The weight vectors of the winning output neuron and its neighbouring neurons are updated as

$$W_i(t+1) = W_i(t) + N_{ci}(t)\,[f(t) - W_i(t)],$$

where t = 1, 2, … is the time coordinate. The function N_ci(t) is the neighbourhood kernel function, expressed as

$$N_{ci}(t) = \eta(t)\exp\left(-\frac{\|m_c - m_i\|^2}{2\sigma^2(t)}\right),$$

where η(t) is the learning rate, σ(t) is the width of the kernel, which corresponds to the neighbourhood of neurons around node c, and m_c and m_i are the location vectors of nodes c and i. Figures 4(a) and 4(b) show the cluster map and the cluster boundaries marked on the mammogram. After several observations of known areas, it was empirically noticed that regions containing an abnormality have pixel counts (the pixel level threshold, PLT, based on the pixel count in TPs) in the range 450 to 31,500; 16,000 to 2,00,000; and 4,000 to 2,00,000 for the MIAS, DDSM, and TMCH databases, respectively, which was verified by the expert. The tumor size varies because the mammogram sizes differ: 1024 × 1024 pixels for MIAS, 2728 × 3920 pixels to 4608 × 6048 pixels for DDSM, and 2294 × 1914 or 4096 × 3328 pixels for TMCH.
Therefore, cluster regions below or above the specified threshold are discarded and the remaining region is marked as true positive (TP) as shown in Figure 4. Figure 4(a) shows the clustered image using SOM; Figure 4(b) shows the cluster boundaries marked on original image.
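
A minimal sketch of the SOM steps described above (3 × 3 neighbourhood features, winner selection by Euclidean distance, and the kernel-weighted update) may help make them concrete. Several details are assumptions made for illustration and are not stated in the text: the M = 4 nodes are arranged on a 1-D map, the kernel is Gaussian, the learning rate and kernel width decay linearly, and border pixels are handled by clamping. All function names are hypothetical.

```python
import math
import random


def neighbourhood_features(img, r, c):
    """Return the 9 grey levels of the 3 x 3 window centred on (r, c), edges clamped."""
    rows, cols = len(img), len(img[0])
    return [float(img[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)])
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def winner(weights, f):
    """Index of the node whose weight vector is closest to f (Euclidean distance)."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], f))


def som_step(weights, f, eta, sigma):
    """One update W_i <- W_i + N_ci * (f - W_i); nodes sit at positions 0..M-1 (assumed)."""
    c = winner(weights, f)
    for i, w in enumerate(weights):
        n_ci = eta * math.exp(-((c - i) ** 2) / (2.0 * sigma ** 2))
        weights[i] = [wj + n_ci * (fj - wj) for wj, fj in zip(w, f)]
    return c


def train_som(features, m=4, epochs=20, eta0=0.5, sigma0=1.0, seed=0):
    """Train an m-node SOM; the linear decay schedule is an assumption."""
    rng = random.Random(seed)
    weights = [[rng.uniform(0, 255) for _ in features[0]] for _ in range(m)]
    for t in range(epochs):
        decay = 1.0 - t / epochs
        for f in features:
            som_step(weights, f, eta0 * decay, max(sigma0 * decay, 0.1))
    return weights
```

After training, each pixel would be assigned the cluster index returned by `winner(weights, neighbourhood_features(img, r, c))`.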
Figure 4

FP reduction by thresholding (a) clustered image, (b) clusters boundaries marked on original image, and (c) clusters after thresholding.

Many FPs appear along with the TP (marked in pink); they are reduced using the pixel level threshold (PLT, based on the pixel count in TPs) as explained above. Figure 4(c) shows the result filtered using the PLT.
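
The PLT filtering step can be sketched as a simple range check on the pixel count of each clustered region. The ranges are the ones quoted in Section 3.2; treating both bounds as inclusive is an assumption, and the names `PLT_RANGE` and `filter_regions_by_size` are hypothetical.

```python
# Pixel-count ranges quoted in the text for regions that contain an abnormality.
PLT_RANGE = {
    "MIAS": (450, 31_500),
    "DDSM": (16_000, 200_000),
    "TMCH": (4_000, 200_000),
}


def filter_regions_by_size(region_sizes, dataset):
    """Keep the labels of regions whose pixel count lies inside the dataset's range.

    region_sizes maps a region label to its pixel count; bounds are inclusive
    (an assumption, the text does not say).
    """
    low, high = PLT_RANGE[dataset]
    return [label for label, count in region_sizes.items() if low <= count <= high]
```

For example, `filter_regions_by_size({1: 120, 2: 5000, 3: 40000}, "MIAS")` keeps only region 2.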

3.3. ROI Extraction

After SOM clustering (initial segmentation), the next step is to classify the detected regions into TP and FP using the proposed local sparse curvelet features (LSCF) followed by an ANN classifier. To do so, we first extracted ROIs from the regions detected by SOM clustering and manually categorized them into TP and FP. We collected these ROIs from the three datasets according to their maximum height and maximum width using connected components, e.g., the region marked in Figure 4(c). Their patch sizes therefore differ, as shown in Figure 5 for the MIAS, DDSM, and TMCH datasets. These extracted patches were then used to train the ANN for the task of FP reduction.
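
As an illustration of collecting ROIs by maximum height and width with connected components, the sketch below labels the regions of a binary mask and returns one bounding box per region. It is a standard-library sketch under stated assumptions: 8-connectivity is assumed (the text does not specify it), and the helper names `extract_roi_boxes` and `crop` are hypothetical.

```python
from collections import deque


def extract_roi_boxes(mask):
    """Return (top, left, bottom, right) inclusive boxes, one per 8-connected region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one region, tracking its extent.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(img, box):
    """Cut the variable-size patch described by an inclusive bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]
```

Because each box follows the extent of its own region, the resulting patches have different sizes, which matches the variable ROI sizes shown in Figure 5.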
Figure 5

Variable sizes ROIs from MIAS, DDSM, and TMCH datasets.

3.4. False-Positive (FP) Reduction

After ROI extraction, FP reduction algorithm performs computation of proposed local sparse curvelet features (LSCF) followed by ANN classifier.

3.4.1. Proposed Algorithm

LBP [43] computes the descriptor over a circular neighbourhood; the variant used here is the uniform LBP (ULBP) descriptor. Computation of LBP based on the actual shape of the mass according to a sparse matrix is shown in Figure 6: it takes the pixels related to the shape of the mass, called foreground pixels, and rejects the other pixels, called background pixels. The proposed algorithm uses only foreground pixels for LBP computation, which reduces the number of pixels involved in LBP computations. Identification of foreground and background pixels is therefore an important step, and it is performed using a lookup table approach. The identification is based on the number of nonzero pixels in the lookup table: if the count of nonzero pixels in the sliding window is greater than 2, count(p(i, j)) > 2, the pixel is identified as foreground and LBP is estimated. On the other hand, if the count is less than 2, count(p(i, j)) < 2, the pixel is identified as background; LBP is not estimated and the entry is rejected from the lookup table. Nonzero pixels provide the actual shape of the mass and are taken for LBP computations. A graphical representation of the proposed algorithm for LBP descriptor computation using foreground pixels is given in Figure 7, and the algorithm is described in Algorithm 2.
Figure 6

Lookup table approach for LBP computation from shape of mass in ROI.

Figure 7

Process for computation of LBP descriptor from shape of mass in ROI. (a) Original image, (b) 3 × 3 window for selection of foreground pixels, (c) lookup table, (d) decision making process, (e) LBP computation from selected foreground pixels.

Algorithm 2

Algorithm for LBP feature computation based on shape of mass in ROI.
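
The foreground test and uniform LBP histogram can be sketched as follows. This is an illustration under stated assumptions rather than Algorithm 2 itself: the nonzero count is taken over the whole 3 × 3 window including the centre, a count of exactly 2 (which the text leaves unspecified) is treated as background, the standard thresholded LBP code with P = 8 and R = 1 is used, and non-uniform codes are dropped so that the histogram has 58 bins. The names `transitions`, `UNIFORM_CODES`, and `sparse_lbp_histogram` are hypothetical.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes around the circular bit pattern."""
    rotated = ((code << 1) | (code >> (bits - 1))) & ((1 << bits) - 1)
    return bin(code ^ rotated).count("1")


# Uniform patterns have at most two transitions; for 8 bits there are 58 of them.
UNIFORM_CODES = [code for code in range(256) if transitions(code) <= 2]
BIN_OF = {code: i for i, code in enumerate(UNIFORM_CODES)}

# The 8 neighbours at radius 1, clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def sparse_lbp_histogram(img, min_nonzero=3):
    """58-bin uniform LBP histogram computed on foreground positions only."""
    hist = [0] * len(UNIFORM_CODES)
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            if sum(1 for v in window if v != 0) < min_nonzero:
                continue  # background: too few nonzero entries, LBP not estimated
            centre = img[r][c]
            code = sum(1 << p for p, (dr, dc) in enumerate(OFFSETS)
                       if img[r + dr][c + dc] >= centre)
            if code in BIN_OF:  # non-uniform codes are dropped (assumption)
                hist[BIN_OF[code]] += 1
    return hist
```

Skipping background windows is what reduces the number of positions at which LBP is evaluated; an all-zero patch yields an empty histogram.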

3.4.2. The Fast Discrete Curvelet Transform (FDCT)

The authors of [44] introduced the computationally simple and efficient Fast Discrete Curvelet Transform (FDCT). We preferred the wrapping-based FDCT approach in the proposed work, as it is faster. The curvelet coefficients CD(j, l, k) are indexed by scale j, angle l, and spatial location k. Figure 8 illustrates LBP code computation based on sparse curvelet coefficients. Each ROI is decomposed using the curvelet transform with 16 orientations (angles l) and 2 scales, as the minimum ROI size in the database is 25 × 22 pixels. This decomposition produces 1 + 16 = 17 different subbands. Further, the coefficients of each curvelet subband are represented in a lookup table using a 3 × 3 sliding window; if a row in the lookup table identifies a foreground coefficient, LBP is computed with radius R = 1 and P = 8 neighbouring pixels as shown in Algorithm 2, giving a total of 58 LBP features from the foreground coefficients of each subband. Therefore, a total of 17 × 58 = 986 LBP features are extracted from the 17 curvelet subbands. As can be observed from Figure 8, the curvelet subbands also provide the shape of the mass in 16 different directions, so directional information can be associated with the LBP features. Kanadam et al. [3] used the concept of a sparse ROI; we have extended it to sparse curvelet subbands and LBP feature computation.
Figure 8

LBP code computation using sparse curvelet subband coefficients.
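
The feature-vector length quoted in Section 3.4.2 follows from simple bookkeeping, sketched below: one coarse subband plus 16 directional subbands, each contributing a 58-bin LBP histogram. The curvelet decomposition itself is not implemented here; the per-subband histograms are placeholders, and the function names are hypothetical.

```python
def count_subbands(n_orientations=16):
    """One coarse subband plus one subband per orientation (2-scale decomposition)."""
    return 1 + n_orientations


def concatenate_subband_features(subband_histograms):
    """Join the per-subband LBP histograms into a single feature vector."""
    feature_vector = []
    for hist in subband_histograms:
        feature_vector.extend(hist)
    return feature_vector
```

With 17 placeholder histograms of 58 bins each, the concatenated vector has 986 entries, which matches the size of the ANN input layer mentioned in Section 3.5.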

3.5. Classification

In this work, we analyzed the ROIs extracted from mammograms using the normal-abnormal, benign-malignant, and normal-malignant classes with ANN, SVM, and KNN classifiers. A detailed description of the ANN classifier is given in [45, 46]. To evaluate the performance of the proposed system, we used 3-fold cross validation, where the database is randomly divided into three sets and the accuracy is calculated for each set. The final accuracy of the system is the average of the accuracies of the three sets. However, it would not be fair to compare the 3-fold cross validation results of the SVM and KNN classifiers with the ANN, because the ANN classifier is tested on only one set of images (33% for training, 33% for testing, and 33% for validation). Thus, for a fair comparison, we trained the ANN, with an input layer of 986 neurons, over the three different sets (the same sets considered for SVM and KNN) and calculated its average accuracy. The proposed false-positive reduction algorithm is illustrated in Figures 9(a)–9(c). Algorithm 3 summarizes the flow of the proposed method for FP reduction in mammograms.
Figure 9

(a) FP reduction by clusters marked on original image, (b) FP reduction by thresholding, (c) FP reduction by sparse curvelet coefficient-based LBP, and ANN.

Algorithm 3

Summary of proposed method for FP reduction in mammograms.
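
The 3-fold evaluation protocol described in Section 3.5 can be sketched as below. The classifier is deliberately left abstract: `train_and_score` is a hypothetical callback that trains on one index list, tests on the other, and returns an accuracy, so any of the ANN, SVM, or KNN classifiers could be plugged in. The shuffling seed and the way indices are dealt into folds are assumptions.

```python
import random


def three_fold_indices(n_samples, seed=0):
    """Randomly split sample indices 0..n_samples-1 into three disjoint sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::3] for i in range(3)]


def cross_validate(n_samples, train_and_score, seed=0):
    """Average the score over three runs, each testing on a different set."""
    folds = three_fold_indices(n_samples, seed)
    scores = []
    for k in range(3):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)
```

Using the same `seed` for every classifier gives all of them identical splits, which is the condition the text asks for when comparing ANN with SVM and KNN.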

4. Experimental Results and Discussions

The proposed method has been tested and validated using three classifiers and three clinical mammographic image datasets.

4.1. Data Sets

4.1.1. Mammographic Image Analysis Society (MIAS) Database

The mini-MIAS [17] database consists of 322 mammograms, each of 1024 × 1024 pixels, annotated with background tissue character, class, severity, center of abnormality, and radius of the circle enclosing the abnormality. The database includes 64 benign, 51 malignant, and 207 normal cases, all of which were taken for experimentation.

4.1.2. Digital Database for Screening Mammography (DDSM)

The DDSM [19] dataset consists of 2500 studies and is composed of cranial-caudal (CC) and mediolateral-oblique (MLO) views of the left and right breast, annotated with ACR breast density, type of abnormality, and ground truth. A random selection of 150 abnormal and 100 normal cases, from both the HOWTEK and LUMISYS scanners at 12 bits per pixel, was used for experimentation.

4.1.3. The Tata Memorial Cancer Hospital (TMCH)

This dataset [47] contains 360 full-field digital mammograms (FFDMs) comprising 180 CC views and 180 MLO views of the right and left breast, acquired from 90 randomly selected patients. It is composed of 180 verified malignant and 180 normal breast images. It uses the pathological data of biopsy-proven breast cancer patients, approved by the Institutional Research Ethics Committee of Tata Memorial Centre Hospital (TMCH), Mumbai, India. The ground truth on each abnormal mammogram was marked manually using the Histopathological Reports (HPR) of the respective patients and an expert radiologist from TMCH, Mumbai. Approximately 35 patients were examined using the “Hologic Selenia System” (Scanner1), which gives 16-bit images. The remaining 55 patients were examined with the “GE Medical Senograph System” (Scanner2), which provides 8-bit true color mammogram images in DICOM format of 4096 × 3328 or 2294 × 1914 pixels, with a pixel size of 50 × 50 μm².

4.2. Segmentation Evaluation and ROI Extraction

Regions detected by SOM segmentation that correspond to suspicious masses are considered TPs, whereas detected nonmass regions are taken as FPs. From Table 1, a total of 381 suspicious ROIs (TP and FP) for MIAS, 1343 for DDSM, and 1009 for TMCH were taken to evaluate the proposed FP reduction algorithm.
Table 1

Result of SOM segmentation.

Dataset used | Mass segmented (TP) | Mass lost | Segmented nonmass (FP) | Total (#) images | TPR (true-positive rate) = TP/#lesions | FPPI (false-positive per image) = FP/#images
MIAS | 108 | 7 | 273 | 322 | (108/115) = 0.94 | (273/322) = 0.85
DDSM | 140 | 10 | 1203 | 250 | (140/150) = 0.93 | (1203/250) = 4.81
TMCH | 172 | 8 | 837 | 360 | (172/180) = 0.95 | (837/360) = 2.32
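
The two rates in Table 1 can be recomputed directly from the counts, as in the short sketch below. The dictionary layout and the function names `tpr` and `fppi` are illustrative only; the counts are those listed in Table 1.

```python
# Counts from Table 1: segmented masses (TP), lost masses, nonmass regions (FP), images.
TABLE1 = {
    "MIAS": {"tp": 108, "lost": 7, "fp": 273, "images": 322},
    "DDSM": {"tp": 140, "lost": 10, "fp": 1203, "images": 250},
    "TMCH": {"tp": 172, "lost": 8, "fp": 837, "images": 360},
}


def tpr(row):
    """True-positive rate: detected lesions over all lesions (detected + lost)."""
    return row["tp"] / (row["tp"] + row["lost"])


def fppi(row):
    """False positives per image."""
    return row["fp"] / row["images"]


for name, row in TABLE1.items():
    print(f"{name}: TPR = {tpr(row):.4f}, FPPI = {fppi(row):.4f}")
```

The unrounded TMCH values are 172/180 ≈ 0.9556 and 837/360 = 2.325, so the 0.95 and 2.32 listed in Table 1 appear to be truncated rather than rounded.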
Among the extracted ROIs, the minimum patch size is 25 × 22 pixels, whereas the maximum size is 1152 × 1356 pixels. Tables 2 and 3 list the curvelet coefficients from the 17 subbands and the reduced coefficients, selected by the lookup table approach, that are used to calculate the LBP features. During experimentation, the curvelet coefficients used for sparse LBP were reduced on average by 14%, 32%, 33%, and 34% for MIAS, DDSM, TMCH: Scanner1, and TMCH: Scanner2, respectively. The reduction in curvelet coefficients is not fixed for every ROI; it depends entirely on the shape of the ROI as captured by the sparse matrix. Tables 2 and 3 do not give the exact reduction for the complete database; they show the reduction for sample mammograms.
Table 2

Reduction in curvelet coefficients for sample mammograms from MIAS and DDSM dataset.

Sr. No. | MIAS: ROI size | MIAS: total number of curvelet coefficients from subbands | MIAS: total number of selected curvelet coefficients from subbands | MIAS: % reduction in curvelet coefficients | DDSM: ROI size | DDSM: total number of curvelet coefficients from subbands | DDSM: total number of selected curvelet coefficients from subbands | DDSM: % reduction in curvelet coefficients
1 | 124 × 138 | 1,03,911 | 77,133 | 25.77 | 192 × 187 | 2,16,729 | 1,68,333 | 22.33
2 | 179 × 138 | 1,50,123 | 1,26,142 | 15.97 | 294 × 291 | 5,18,267 | 3,66,680 | 29.25
3 | 51 × 116 | 36,815 | 33,421 | 9.22 | 145 × 207 | 1,81,663 | 1,48,765 | 18.11
4 | 83 × 83 | 42,449 | 36,653 | 13.65 | 169 × 168 | 1,71,873 | 1,54,752 | 9.96
5 | 84 × 76 | 39,115 | 35,815 | 8.44 | 182 × 248 | 2,72,517 | 2,19,822 | 19.34
6 | 74 × 83 | 37,767 | 34,448 | 8.79 | 213 × 349 | 4,49,783 | 3,01,359 | 33.00
7 | 53 × 64 | 20,969 | 18,621 | 11.20 | 578 × 412 | 14,33,195 | 6,62,072 | 53.80
8 | 70 × 44 | 18,899 | 16,610 | 12.11 | 215 × 219 | 2,86,429 | 2,42,935 | 15.18
9 | 80 × 66 | 32,409 | 29,454 | 9.12 | 420 × 428 | 10,82,461 | 5,01,209 | 53.70
10 | 69 × 86 | 36,783 | 33,552 | 8.78 | 203 × 307 | 3,75,871 | 2,66,829 | 29.01
11 | 59 × 116 | 42,019 | 38,442 | 8.51 | 226 × 262 | 3,57,209 | 2,76,763 | 22.52
12 | 81 × 101 | 50,637 | 46,122 | 8.92 | 159 × 194 | 1,87,563 | 1,48,741 | 20.70
13 | 41 × 84 | 21,427 | 18,907 | 11.76 | 718 × 686 | 29,61,127 | 7,62,561 | 74.25
14 | 69 × 141 | 60,641 | 52,427 | 13.54 | 409 × 550 | 13,52,439 | 9,04,829 | 33.10
15 | 60 × 62 | 22,925 | 20,475 | 10.69 | 524 × 375 | 11,84,671 | 7,66,522 | 35.30
16 | 96 × 101 | 59,647 | 54,235 | 9.07 | 311 × 275 | 5,17,433 | 3,97,737 | 23.13
17 | 136 × 139 | 1,13,727 | 66,352 | 41.56 | 319 × 320 | 6,14,129 | 4,12,606 | 32.81
18 | 55 × 94 | 31,359 | 28,305 | 9.74 | 313 × 447 | 8,42,855 | 5,10,482 | 39.43
19 | 157 × 140 | 1,32,373 | 1,07,000 | 19.17 | 291 × 517 | 9,07,903 | 6,37,412 | 29.79
20 | 156 × 130 | 1,23,007 | 90,834 | 26.15 | 370 × 837 | 18,64,889 | 8,98,236 | 51.83
Average | | 58,850 | 48,247 | 14 | | 7,88,950 | 4,37,432 | 32
Table 3

Reduction in curvelet coefficients for sample mammograms from TMCH Scanner1 and Scanner2 dataset.

Sr. no. | TMCH: Scanner 1 (“GE Medical Senograph System”): ROI size | Scanner 1: total number of curvelet coefficients from subbands | Scanner 1: total number of selected curvelet coefficients from subbands | Scanner 1: % reduction in curvelet coefficients | TMCH: Scanner 2 (“Hologic Selenia System”): ROI size | Scanner 2: total number of curvelet coefficients from subbands | Scanner 2: total number of selected curvelet coefficients from subbands | Scanner 2: % reduction in curvelet coefficients
1 | 459 × 412 | 11,40,617 | 7,94,885 | 30.31 | 291 × 278 | 4,89,243 | 3,17,014 | 35.20
2 | 548 × 513 | 16,93,403 | 11,17,873 | 33.99 | 545 × 246 | 8,09,627 | 5,31,604 | 34.34
3 | 415 × 303 | 7,58,323 | 4,45,585 | 41.24 | 560 × 483 | 16,30,583 | 10,68,439 | 34.47
4 | 645 × 495 | 19,51,443 | 11,45,580 | 41.29 | 782 × 510 | 24,00,137 | 12,75,073 | 46.87
5 | 437 × 691 | 18,16,651 | 9,85,120 | 45.77 | 87 × 141 | 75,773 | 67,871 | 10.43
6 | 812 × 500 | 24,39,937 | 12,87,065 | 47.25 | 311 × 185 | 3,48,565 | 2,40,546 | 30.99
7 | 468 × 379 | 10,66,333 | 7,10,242 | 33.40 | 262 × 348 | 5,50,303 | 2,82,821 | 48.61
8 | 673 × 582 | 23,55,589 | 17,30,915 | 26.52 | 610 × 440 | 16,14,515 | 8,42,876 | 47.79
9 | 250 × 201 | 3,04,513 | 2,35,670 | 22.61 | 949 × 391 | 22,27,209 | 13,59,338 | 38.97
10 | 525 × 488 | 15,44,691 | 11,61,942 | 24.78 | 365 × 385 | 8,46,523 | 6,04,473 | 28.59
11 | 488 × 779 | 22,87,547 | 16,11,385 | 29.56 | 393 × 247 | 5,85,063 | 4,90,474 | 16.17
12 | 1434 × 966 | 83,26,581 | 37,42,277 | 55.06 | 341 × 301 | 6,18,111 | 4,10,542 | 33.58
13 | 348 × 421 | 8,81,701 | 6,16,501 | 30.08 | 523 × 702 | 22,06,097 | 8,00,057 | 63.73
14 | 460 × 530 | 14,67,227 | 7,99,885 | 45.48 | 370 × 284 | 6,32,539 | 4,23,727 | 33.01
15 | 398 × 450 | 10,78,441 | 8,28,064 | 23.22 | 344 × 202 | 4,18,955 | 3,02,427 | 27.81
16 | 247 × 272 | 4,04,401 | 3,44,822 | 14.73 | 264 × 188 | 2,99,997 | 2,31,926 | 22.69
17 | 411 × 305 | 7,57,657 | 4,05,919 | 46.42 | 233 × 247 | 3,46,983 | 2,67,686 | 22.85
18 | 286 × 344 | 5,93,155 | 4,48,566 | 24.38 | 370 × 291 | 6,50,701 | 4,86,125 | 25.29
19 | 417 × 207 | 5,23,477 | 4,29,755 | 17.90 | 680 × 483 | 19,79,543 | 10,52,793 | 46.82
20 | 463 × 458 | 12,75,021 | 8,57,608 | 32.74 | 202 × 266 | 3,24,295 | 2,15,042 | 33.69
Average | | 16,33,335 | 9,84,983 | 33 | | 9,52,738 | 5,63,543 | 34
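
The "% reduction" columns of Tables 2 and 3 are the share of curvelet coefficients discarded by the lookup table step. A one-function sketch (the name `percent_reduction` is illustrative) reproduces the first rows of both tables:

```python
def percent_reduction(total, selected):
    """Percentage of curvelet coefficients removed by foreground selection."""
    return 100.0 * (total - selected) / total


# First sample row of Table 2 (MIAS) and of Table 3 (TMCH: Scanner 1).
print(round(percent_reduction(103911, 77133), 2))    # 25.77
print(round(percent_reduction(1140617, 794885), 2))  # 30.31
```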

4.3. Classifier Evaluation and False-Positive Reduction

From Figures 10–13, the best classification accuracy of 98.57% was obtained for MIAS in benign versus malignant classification, whereas accuracies of 98.70% for DDSM, 98.30% for TMCH: Scanner1, and 100% for TMCH: Scanner2 were obtained in normal versus malignant classification. The classification performance of the ANN is 6% to 43% better than that of the KNN classifier across the databases, whereas the improvement over the SVM classifier is small, about 7%. The performance of the proposed sparse LBP and of LBP computed on the full curvelet subbands is nearly the same; therefore, the proposed algorithm can be implemented efficiently in a CAD system with fewer curvelet coefficients.
Figure 10

Average classification rate for TMCH dataset.

Figure 11

Average classification rate for MIAS and DDSM dataset.

Figure 12

Average classification rate for MIAS and DDSM dataset.

Figure 13

Average classification rate for MIAS and DDSM dataset.

Data augmentation has been used for some classes to maintain balance between the two classes, to improve performance, and to learn a more powerful model. Table 4 shows the FP reduction obtained with curvelet-based LBP features and ANN. FPs were reduced from 0.85 to 0.02 FP/image in MIAS, 4.81 to 0.02 FP/image in DDSM, and 2.32 to 0.13 FP/image in TMCH.
Table 4

Number of ROIs resulted in FP reduction using curvelet-based LBP (without sparse) & ANN classification at training and validation stage.

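Class | Dataset used | Benign/malignant mass: previous stage | Benign/malignant mass: selected (TP) | Benign/malignant mass: lost (FN) | Nonmass/benign mass: previous stage | Nonmass/benign mass: selected (TN) | Nonmass/benign mass: lost (FP) | Total (#) images | TPR (true-positive rate) = TP/#lesions | FPPI (false-positive per image) = FP/#images
Normal vs abnormal | MIAS | 108 ∗ 2 = 216 | 203 | 13 | 273 | 257 | 16 | 315 | (203/216) = 0.94 | (16/315) = 0.05
Normal vs abnormal | DDSM | 140 ∗ 4 = 560 | 465 | 95 | 1203 | 1095 | 108 | 240 | (465/560) = 0.83 | (108/240) = 0.45
Benign vs malignant | MIAS | 49 | 49 | 0 | 59 | 57 | 2 | 108 | (49/49) = 1.00 | (2/108) = 0.02
Benign vs malignant | DDSM | 46 ∗ 2 = 92 | 91 | 1 | 94 | 91 | 3 | 140 | (91/92) = 0.99 | (3/140) = 0.02
Normal vs malignant | MIAS | 49 ∗ 4 = 196 | 184 | 12 | 273 | 254 | 19 | 256 | (184/196) = 0.94 | (19/256) = 0.07
Normal vs malignant | DDSM | 46 ∗ 4 = 184 | 180 | 4 | 1203 | 1143 | 60 | 146 | (180/184) = 0.98 | (60/146) = 0.41
Normal vs malignant | TMCH: Scanner1 | 107 ∗ 4 = 428 | 416 | 12 | 605 | 551 | 54 | 217 | (416/428) = 0.97 | (54/217) = 0.25
Normal vs malignant | TMCH: Scanner2 | 65 ∗ 4 = 260 | 255 | 5 | 232 | 214 | 18 | 135 | (255/260) = 0.98 | (18/135) = 0.13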

∗Augmentation of image.

Similarly, Table 5 shows the reduction in FPs as 0.85 to 0.01 FP/image for MIAS, 4.81 to 0.03 FP/image for DDSM, and 2.32 to 0.00 FP/image for TMCH using sparse curvelet coefficient-based LBP features. The results show the effectiveness of sparse curvelet coefficient-based LBP and ANN. From Table 6, the best value of AUC = 0.99 is obtained in benign versus malignant classification for MIAS, AUC = 0.98 in benign versus malignant in case of DDSM, AUC = 0.94 in normal versus malignant in case of TMCH: Scanner1, and AUC = 0.96 in normal versus malignant classification in TMCH: Scanner2 using ANN and curvelet subband-based LBP features. The worst performance of AUC = 0.53 for MIAS is obtained with the proposed algorithm using KNN classifier as shown in Table 7. Similarly, from Table 7, the best value of AUC = 0.98 is obtained in TMCH: Scanner1, AUC = 1 is obtained in TMCH: Scanner2 database for normal versus malignant classification, AUC = 0.98 in benign versus malignant classification is attained in MIAS database, and AUC = 0.98 is achieved for normal versus malignant classification in DDSM database using ANN classifier for sparse curvelet subband-based LBP features.
Table 5

Number of ROIs resulted in FP reduction using sparse curvelet coefficient-based LBP & ANN classification at training and validation stage.

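Class | Dataset used | Benign/malignant mass: previous stage | Benign/malignant mass: selected (TP) | Benign/malignant mass: lost (FN) | Nonmass/benign mass: previous stage | Nonmass/benign mass: selected (TN) | Nonmass/benign mass: lost (FP) | Total (#) images | TPR (true-positive rate) = TP/#lesions | FPPI (false-positive per image) = FP/#images
Normal vs abnormal | MIAS | 108 ∗ 2 = 216 | 201 | 15 | 273 | 265 | 8 | 315 | (201/216) = 0.93 | (8/315) = 0.02
Normal vs abnormal | DDSM | 140 ∗ 4 = 560 | 516 | 44 | 1203 | 1155 | 48 | 240 | (516/560) = 0.92 | (48/240) = 0.2
Benign vs malignant | MIAS | 49 | 48 | 1 | 59 | 59 | 1 | 108 | (48/49) = 0.98 | (1/108) = 0.01
Benign vs malignant | DDSM | 46 ∗ 2 = 92 | 89 | 3 | 94 | 89 | 5 | 140 | (89/92) = 0.97 | (5/140) = 0.03
Normal vs malignant | MIAS | 49 ∗ 4 = 196 | 192 | 4 | 273 | 259 | 14 | 256 | (192/196) = 0.98 | (14/256) = 0.05
Normal vs malignant | DDSM | 46 ∗ 4 = 184 | 182 | 2 | 1203 | 1167 | 36 | 146 | (182/184) = 0.99 | (36/146) = 0.25
Normal vs malignant | TMCH: Scanner1 | 107 ∗ 4 = 428 | 424 | 4 | 605 | 593 | 12 | 217 | (424/428) = 0.99 | (12/217) = 0.05
Normal vs malignant | TMCH: Scanner2 | 65 ∗ 4 = 260 | 260 | 0 | 232 | 232 | 0 | 135 | (260/260) = 1.00 | (0/135) = 0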

∗Augmentation of image.
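
Sensitivity and specificity follow from the TP/FN and TN/FP counts of the kind listed in Table 5. The sketch below uses the standard definitions, which are assumed here to be the ones behind the reported values; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of true masses that are kept: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of nonmass regions that are rejected: TN / (TN + FP)."""
    return tn / (tn + fp)


# TMCH: Scanner1, normal vs malignant, counts from Table 5.
print(round(sensitivity(424, 4), 2), round(specificity(593, 12), 2))  # 0.99 0.98
```

For this row the results, 0.99 and 0.98, agree with the ANN sensitivity and specificity reported for TMCH: Scanner1 in Table 7.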

Table 6

Performance evaluation of curvelet-based LBP descriptor algorithm.

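Dataset | Classifier | Normal-malignant: sensitivity | Normal-malignant: specificity | Normal-malignant: AUC | Normal-abnormal: sensitivity | Normal-abnormal: specificity | Normal-abnormal: AUC | Benign-malignant: sensitivity | Benign-malignant: specificity | Benign-malignant: AUC
MIAS | ANN | 0.94 | 0.93 | 0.94 | 0.94 | 0.94 | 0.94 | 1.00 | 0.97 | 0.99
MIAS | SVM | 0.85 | 0.85 | 0.85 | 0.83 | 0.86 | 0.85 | 0.88 | 0.84 | 0.86
MIAS | KNN | 0.67 | 0.63 | 0.65 | 0.58 | 0.57 | 0.58 | 0.62 | 0.68 | 0.63
DDSM | ANN | 0.98 | 0.95 | 0.95 | 0.83 | 0.91 | 0.85 | 0.99 | 0.97 | 0.98
DDSM | SVM | 0.97 | 0.88 | 0.92 | 0.71 | 0.91 | 0.83 | 0.94 | 0.89 | 0.92
DDSM | KNN | 0.96 | 0.64 | 0.87 | 0.67 | 0.90 | 0.80 | 0.87 | 0.73 | 0.79
TMCH: Scanner1 | ANN | 0.97 | 0.91 | 0.94 | | | | | |
TMCH: Scanner1 | SVM | 0.96 | 0.91 | 0.94 | | | | | |
TMCH: Scanner1 | KNN | 0.98 | 0.82 | 0.89 | | | | | |
TMCH: Scanner2 | ANN | 0.98 | 0.92 | 0.96 | | | | | |
TMCH: Scanner2 | SVM | 0.97 | 0.90 | 0.94 | | | | | |
TMCH: Scanner2 | KNN | 0.92 | 0.83 | 0.88 | | | | | |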
Table 7

Performance evaluation of proposed algorithm.

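Dataset | Classifier | Normal-malignant: sensitivity | Normal-malignant: specificity | Normal-malignant: AUC | Normal-abnormal: sensitivity | Normal-abnormal: specificity | Normal-abnormal: AUC | Benign-malignant: sensitivity | Benign-malignant: specificity | Benign-malignant: AUC
MIAS | ANN | 0.98 | 0.95 | 0.96 | 0.93 | 0.97 | 0.95 | 0.97 | 1.00 | 0.98
MIAS | SVM | 0.88 | 0.83 | 0.85 | 0.85 | 0.82 | 0.84 | 0.84 | 0.92 | 0.87
MIAS | KNN | 0.55 | 0.51 | 0.53 | 0.55 | 0.63 | 0.56 | 0.61 | 0.67 | 0.61
DDSM | ANN | 0.99 | 0.97 | 0.98 | 0.92 | 0.96 | 0.93 | 0.97 | 0.95 | 0.96
DDSM | SVM | 0.99 | 0.92 | 0.96 | 0.89 | 0.96 | 0.92 | 0.94 | 0.92 | 0.93
DDSM | KNN | 0.98 | 0.73 | 0.92 | 0.74 | 0.90 | 0.82 | 0.89 | 0.77 | 0.83
TMCH: Scanner1 | ANN | 0.99 | 0.98 | 0.98 | | | | | |
TMCH: Scanner1 | SVM | 0.98 | 0.96 | 0.97 | | | | | |
TMCH: Scanner1 | KNN | 0.99 | 0.92 | 0.95 | | | | | |
TMCH: Scanner2 | ANN | 1.00 | 1.00 | 1.00 | | | | | |
TMCH: Scanner2 | SVM | 1.00 | 0.98 | 0.99 | | | | | |
TMCH: Scanner2 | KNN | 0.96 | 0.92 | 0.94 | | | | | |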
However, from Table 7, it should be noted that the performance of proposed algorithm is the best using ANN classifier. Figure 14 represents automated CAD system for breast cancer diagnosis with sample mammograms.
Figure 14

Representation of fully automatic CAD system for breast cancer using (a) sample mammograms from MIAS, DDSM, and TMCH datasets, (b) preprocessed mammograms, (c) clustered image, (d) TP and FP marked on mammogram, (e) TP marked by thresholding, (f) TP marked by using LBP descriptor based on sparse curvelet coefficients.
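
For readers who want to reproduce the kind of figures reported in Tables 6 and 7, the sketch below computes sensitivity, specificity, and AUC from class labels and classifier scores. This section does not state how the AUC was computed, so the rank-based definition used here is an assumption, and the toy labels and scores are invented purely for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for 0/1 labels, where 1 is the abnormal class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration (not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, scores):.2f}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarizes the ranking quality over all thresholds, which is why the tables report all three.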

Table 8 provides a comparative study of methods developed for breast tissue classification. The proposed method provides the best results in terms of AUC and reduces the number of FPs from 0.85 to 0.01 FP/image for MIAS, from 4.81 to 0.03 FP/image for DDSM, and from 2.32 to 0.00 FP/image for TMCH. Earlier reported work uses a fixed patch size-based approach, which limits the scope of an automatic CAD system, whereas the proposed system provides a complete CAD solution, from automatic tumor patch segmentation through FP reduction to the final representation of the mammogram with TPs marked on it. It will drastically reduce the radiologist's workload by locating the tumor directly on the mammogram.
Table 8

Comparison of classification accuracy, AUC, and FP/image values from different approaches in breast cancer diagnosis.

| Author | Database | Method | Classifier | Result | AUC | FP/image |
|---|---|---|---|---|---|---|
| Eltoukhy et al. [33] | MIAS | Biggest curvelet coefficients as a feature vector | Euclidean classifier | 94.07% | | |
| Eltoukhy et al. [42] | | | | 98.59 | | |
| Eltoukhy et al. [8] | | | SVM | 97.3 | | |
| Dhahbi et al. [34] | Mini-MIAS | Curvelet moments | KNN | 91.27 | | |
| | DDSM | | | 86.46 | | |
| Bruno et al. [4] | DDSM | Curvelet + LBP | SVM | 85 | 0.85 | |
| | | | PL | 94 | 0.94 | |
| da Rocha et al. [40] | DDSM | LBP | SVM | 88.31 | 0.88 | |
| Kanadam and Chereddy [3] | MIAS | Sparse ROI | SVM | 97.42 | | |
| Pereira et al. [18] | DDSM | Wavelet and Wiener filter | Multiple thresholding, wavelet, and GA | | | 1.37 |
| Liu and Zeng [29] | DDSM, FFDM | GLCM, CLBP, and geometric features | SVM | | | 1.48 |
| De Sampaio et al. [39] | DDSM | LBP | DBSCAN | 98.26 | | 0.19 |
| Zyout et al. [30] | DDSM | Second order statistics of wavelet coefficients (SOSWC) | SVM | 96.8 | 0.97 | 0.018 |
| | MIAS | | | 95.2 | 96.6 | 0.029 |
| Casti et al. [31] | DDSM | Differential features | Fisher linear discriminant analysis (FLDA) | | | 1.68 |
| | MIAS | | | | | 2.12 |
| | FFDM | | | | | 0.82 |
| Proposed method | MIAS | LBP based on sparse curvelet subband coefficients | ANN | 98.57 | 0.98 | 0.01 |
| | DDSM | | | 98.70 | 0.98 | 0.03 |
| | TMCH: Scanner1 | | | 98.30 | 0.98 | 0.05 |
| | TMCH: Scanner2 | | | 100 | 1 | 0 |

5. Conclusion

A fully automatic CAD system that can accurately locate the tumor on a mammogram and reduce FPs has been proposed. The developed CAD system consists of preprocessing, SOM clustering, ROI extraction, sparse LBP feature computation based on sparse curvelet coefficients, and, finally, FP reduction using an ANN classifier. The proposed algorithm introduces the concept of extracting curvelet coefficients according to the irregular shape of the mass, termed sparse curvelet coefficients, and computing LBP on them. The analysis shows that FPs are reduced significantly, from 0.85 to 0.01 FP/image for MIAS, from 4.81 to 0.03 FP/image for DDSM, and from 2.32 to 0.00 FP/image for TMCH. Compared with the SVM and KNN classifiers, the ANN classifier showed the best results: AUC = 0.98 and accuracy = 98.57% for MIAS in benign-malignant classification, AUC = 0.98 and accuracy = 98.70% for DDSM in normal-malignant classification, AUC = 0.98 and accuracy = 98.30% for TMCH: Scanner1, and AUC = 1 and accuracy = 100% for TMCH: Scanner2 in normal-malignant classification. The performance of LBP features and of LBP features based on sparse curvelet coefficients is nearly the same, which shows that the proposed algorithm is suitable for breast cancer tissue diagnosis. In future, the reduced curvelet coefficients can be used to extract local ternary patterns, local directional patterns, and other local descriptors. The present work deals with mammograms containing a single mass; it can be extended to multiple-mass models with multiple LBP features based on sparse curvelet coefficients.
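
The idea of computing LBP only on coefficients that fall inside an irregular mass region can be illustrated with a minimal sketch. It is not the authors' implementation: the curvelet transform is not performed here (a plain 2D array stands in for one subband of coefficients), the function name `masked_lbp_histogram` is hypothetical, and skipping pixels whose 3x3 neighbourhood leaves the region is an assumption, since this section does not say how region boundaries are handled.

```python
def masked_lbp_histogram(coeffs, mask):
    """Basic 8-neighbour LBP histogram restricted to a region of interest.

    coeffs: 2D list of numbers (stand-in for one subband of coefficients).
    mask:   2D list of 0/1 with the same shape; 1 marks the mass region.
    Returns a list of 256 counts, one per LBP code.
    """
    # Neighbour offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(coeffs), len(coeffs[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if not mask[r][c]:
                continue
            # Assumption: ignore pixels whose neighbourhood leaves the region.
            if not all(mask[r + dr][c + dc] for dr, dc in offsets):
                continue
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as large as the centre.
                if coeffs[r + dr][c + dc] >= coeffs[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Toy 4x4 "subband" and an irregular mask that excludes the top-left corner.
coeffs = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
mask = [[0, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]
features = masked_lbp_histogram(coeffs, mask)
print(sum(features), "pixels contributed to the histogram")
```

The resulting 256-bin histogram is the kind of feature vector that could then be passed to a classifier such as an ANN, SVM, or KNN.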
  21 in total

1.  A statistical based feature extraction method for breast cancer diagnosis in digital mammogram using multiresolution representation.

Authors:  Mohamed Meselhy Eltoukhy; Ibrahima Faye; Brahim Belhaouari Samir
Journal:  Comput Biol Med       Date:  2011-11-23       Impact factor: 4.589

2.  A textural approach for mass false positive reduction in mammography.

Authors:  X Lladó; A Oliver; J Freixenet; R Martí; J Martí
Journal:  Comput Med Imaging Graph       Date:  2009-04-29       Impact factor: 4.790

3.  A comparison of wavelet and curvelet for breast cancer diagnosis in digital mammogram.

Authors:  Mohamed Meselhy Eltoukhy; Ibrahima Faye; Brahim Belhaouari Samir
Journal:  Comput Biol Med       Date:  2010-02-16       Impact factor: 4.589

4.  Breast cancer diagnosis in digitized mammograms using curvelet moments.

Authors:  Sami Dhahbi; Walid Barhoumi; Ezzeddine Zagrouba
Journal:  Comput Biol Med       Date:  2015-06-26       Impact factor: 4.589

5.  A bilateral analysis scheme for false positive reduction in mammogram mass detection.

Authors:  Yanfeng Li; Houjin Chen; Yongyi Yang; Lin Cheng; Lin Cao
Journal:  Comput Biol Med       Date:  2014-12-16       Impact factor: 4.589

6.  Location of mammograms ROI's and reduction of false-positive.

Authors:  Luis Antonio Salazar-Licea; Jesús Carlos Pedraza-Ortega; Alberto Pastrana-Palma; Marco A Aceves-Fernandez
Journal:  Comput Methods Programs Biomed       Date:  2017-02-24       Impact factor: 5.428

Review 7.  Pectoral muscle segmentation: a review.

Authors:  Karthikeyan Ganesan; U Rajendra Acharya; Kuang Chua Chua; Lim Choo Min; K Thomas Abraham
Journal:  Comput Methods Programs Biomed       Date:  2012-12-25       Impact factor: 5.428

8.  Segmentation and detection of breast cancer in mammograms combining wavelet analysis and genetic algorithm.

Authors:  Danilo Cesar Pereira; Rodrigo Pereira Ramos; Marcelo Zanchetta do Nascimento
Journal:  Comput Methods Programs Biomed       Date:  2014-01-21       Impact factor: 5.428

9.  Breast mass classification on mammograms using radial local ternary patterns.

Authors:  Chisako Muramatsu; Takeshi Hara; Tokiko Endo; Hiroshi Fujita
Journal:  Comput Biol Med       Date:  2016-03-16       Impact factor: 4.589

10.  A review of breast cancer awareness among women in India: Cancer literate or awareness deficit?

Authors:  A Gupta; K Shridhar; P K Dhillon
Journal:  Eur J Cancer       Date:  2015-07-29       Impact factor: 9.162
