
A Semi-Supervised Method for Tumor Segmentation in Mammogram Images.

Abstract

BACKGROUND: Breast cancer is one of the most common cancers in women. Mammogram images play an important role in the treatment of this cancer at its various stages. In recent years, machine learning methods have been widely used for tumor segmentation in mammogram images. Pixel-based segmentation methods have been presented using both supervised and unsupervised learning approaches. Supervised learning methods are usually fast and accurate, but they usually require a large amount of labeled data, which is difficult and expensive to provide. Unsupervised learning methods do not require labels for the training data, but because they completely ignore prior knowledge, they may perform poorly. Semi-supervised learning methods use only a small amount of labeled data, which removes the need for the large labeled sample sets of supervised methods, and they usually achieve higher accuracy than unsupervised methods.
METHODS: In this study, we used a semi-supervised method for tumor segmentation in which pixel information is used for the classification. The statistical and gray level run length matrix features of each pixel are used as the features, and Fisher discriminant analysis (FDA) is used for feature reduction. A co-training algorithm based on support vector machine and Bayes classifiers is proposed for tumor segmentation on the MIAS data set.
RESULTS AND CONCLUSION: The results show that the proposed method outperforms both supervised methods.


Keywords: Bayes classifier; co-training algorithm; mammogram images; support vector machine classifier; tumor segmentation

Year: 2020 PMID: 32166073 PMCID: PMC7038743 DOI: 10.4103/jmss.JMSS_62_18

Source DB: PubMed Journal: J Med Signals Sens ISSN: 2228-7477

Introduction

Breast cancer is the most common cancer in women worldwide. According to a 2003 study by the American Cancer Society, about 12% of U.S. women had breast cancer over the course of their lifetime. In European countries, breast cancer accounts for 24% of all types of cancer and 19% of cancer deaths. In 2010, Iran's Ministry of Health announced that more than 7000 women were diagnosed with breast cancer and more than 4000 people died of it every year.[1] Automatic detection of disease from medical images forms a major part of research in the machine learning and medical engineering fields.[2] Mammography is one of the most commonly used tests for breast cancer diagnosis. Several methods have been proposed for tumor segmentation in mammography images.[3,4] Reviews of automatic tumor segmentation methods for breast cancer are available in the literature.[5-7] These methods can be divided into six groups:

1. Contour-based segmentation approaches such as the active contour algorithm[8-10]
2. Region-growing segmentation techniques[11-16]
3. Segmentation using the two-dimensional discrete wavelet transform[17,18]
4. Segmentation based on the watershed algorithm[19-21]
5. Segmentation with the co-occurrence matrix[22-24]
6. Classification-based segmentation, including supervised and unsupervised learning methods.[25-29]

Tao et al. proposed a classifier for tumor segmentation,[25] in which the region of interest (ROI) is divided into sub-regions and machine learning techniques are used to label each sub-region. A graph-cut algorithm and an optimization method were used for the final segmentation. Dynamic programming has also been used for segmentation.[26,27] In one study by Song et al.,[26] a plane-fitting method was first used to extract the ROI, and the optimal contour of the mass was then extracted using a dynamic programming approach. In another study by Song et al.,[27] a similar method was used for tumor segmentation, in which template matching was used along with the dynamic programming approach. The results showed that the template matching approach outperforms plane fitting in this area.

Supervised methods are usually fast and accurate, but they need a large amount of labeled data. Providing labels for the data is very hard and expensive.[1,30,31] Unsupervised methods do not use labels for decision-making, and they may perform poorly because they do not use prior knowledge of the samples.[31-33] Clustering methods have also been used for pixel labeling in mammogram images in the studies by Shi et al. and Kamil and Salih.[28,29] In the study by Shi et al.,[28] the skin-air boundary was estimated using a gradient weight map and the pectoral-breast boundary was detected using a clustering approach; a texture filter was used for the final detection. In the study by Kamil and Salih,[29] K-means and fuzzy C-means were used for tumor segmentation. To improve the performance, the lazy snapping algorithm was used as an additional step.

Semi-supervised learning is motivated by the fact that unlabeled data are easy to provide and can therefore be used to improve the accuracy of classifiers. Semi-supervised methods such as the self-training and co-training algorithms overcome the problem of providing the large number of labeled samples required by supervised methods, because they need only a small set of labeled samples. These methods also have a higher accuracy than unsupervised methods.[1,31]

Some breast tumor segmentation methods use the intensity feature of each pixel to segment the tumor from medical images. Texture features, in contrast, have only been used for distinguishing benign from malignant breast tumors.[34-36] In this study, a semi-supervised method is proposed for tumor segmentation in mammography images. A co-training algorithm is used for the segmentation according to pixel-based features.[1]

The paper is organized as follows. Section “Basics” presents the basic concepts, including the feature extraction and reduction methods and the co-training algorithm. The proposed co-training algorithm is presented in section “The proposed method”. Section “Experimental results” presents the experimental results and the evaluation of the proposed approach in comparison to the supervised methods. Finally, the paper is concluded in section “Conclusions”.

Basics

In this section, the pixel-based features used in the proposed method are described. The dimensionality of the features is reduced with the Fisher discriminant analysis (FDA) method, which is described in detail in the following sections. The co-training method is described in the last part.

Feature extraction

In this study, we have used two kinds of features: statistical features and gray level run length matrix (GLRLM) features. For each pixel in the ROI, a 5 × 5 window is used for feature extraction, as in the study by Azmi et al.[1] The statistical features are the mean, variance, absolute deviation, and standard deviation. The run-length matrix is defined such that each element (i,j) gives the number of runs of pixels with gray level i and run length j along a particular direction. In this study, four directions have been used: 0°, 45°, 90°, and 135°, as shown in Figure 1.
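As an illustration of the four statistical features, the following minimal sketch computes them over the window centred on a pixel. It is not the authors' implementation: the function name `window_stats`, the clipping of the window at image borders, the use of the population variance (division by the number of pixels), and the reading of "absolute deviation" as the mean absolute deviation from the window mean are all assumptions made for this example.

```python
import math


def window_stats(image, row, col, size=5):
    """Return (mean, variance, absolute deviation, standard deviation) of the
    size x size window centred on (row, col).

    Assumptions: the window is clipped at the image borders, the variance is
    the population variance, and the absolute deviation is the mean absolute
    deviation from the window mean.
    """
    half = size // 2
    values = [
        image[r][c]
        for r in range(max(0, row - half), min(len(image), row + half + 1))
        for c in range(max(0, col - half), min(len(image[0]), col + half + 1))
    ]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    abs_dev = sum(abs(v - mean) for v in values) / n
    return mean, variance, abs_dev, math.sqrt(variance)


# Usage: features of the centre pixel of a small synthetic patch.
patch = [[10, 10, 12, 12, 12] for _ in range(5)]
print(window_stats(patch, 2, 2))
```

Computing the window per pixel as above is simple but repeats work for neighbouring pixels; a sliding-sum formulation would be faster on full images.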

Figure 1

Direction of run-length matrix

The GLRLM features are obtained using Eqs. (1) to (4).

Short-run emphasis:
$$\mathrm{SRE} = \frac{1}{n}\sum_{i}\sum_{j}\frac{I(i,j)}{j^{2}} \qquad (1)$$

Long-run emphasis:
$$\mathrm{LRE} = \frac{1}{n}\sum_{i}\sum_{j} j^{2}\, I(i,j) \qquad (2)$$

Gray-level nonuniformity:
$$\mathrm{GLN} = \frac{1}{n}\sum_{i}\Big(\sum_{j} I(i,j)\Big)^{2} \qquad (3)$$

Run-length nonuniformity:
$$\mathrm{RLN} = \frac{1}{n}\sum_{j}\Big(\sum_{i} I(i,j)\Big)^{2} \qquad (4)$$

In the above equations, I(i,j) is the number of runs of pixels with gray level i and run length j, and n is the total number of runs. Eleven features [Appendix A] are obtained from the run-length matrix in each of the four directions (θ = 0°, 45°, 90°, and 135°),[1] so 44 features are extracted from the GLRLM. In addition, four statistical features are computed for each pixel. Hence, each pixel is described by 48 features.
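The run-length matrix and the first four GLRLM features (SRE, LRE, GLN, and RLN) can be sketched as follows. This is only an illustration under stated assumptions: it handles the 0° direction alone, expects gray levels already quantized to the integers 0 to levels − 1, and uses the standard textbook definitions of the four features; the function names `glrlm_horizontal` and `glrlm_features` are hypothetical.

```python
def glrlm_horizontal(image, levels):
    """Run-length matrix for the 0-degree direction.

    matrix[i][j - 1] is the number of runs of gray level i with length j,
    found by scanning every row from left to right.
    """
    max_run = len(image[0])
    matrix = [[0] * max_run for _ in range(levels)]
    for row in image:
        run_value, run_length = row[0], 1
        for value in row[1:]:
            if value == run_value:
                run_length += 1
            else:
                matrix[run_value][run_length - 1] += 1
                run_value, run_length = value, 1
        matrix[run_value][run_length - 1] += 1  # close the last run of the row
    return matrix


def glrlm_features(matrix):
    """SRE, LRE, GLN and RLN from a run-length matrix (standard definitions)."""
    n = sum(sum(row) for row in matrix)  # total number of runs
    sre = sum(c / (j * j) for row in matrix for j, c in enumerate(row, start=1)) / n
    lre = sum(c * j * j for row in matrix for j, c in enumerate(row, start=1)) / n
    gln = sum(sum(row) ** 2 for row in matrix) / n        # sum over run lengths first
    rln = sum(sum(col) ** 2 for col in zip(*matrix)) / n  # sum over gray levels first
    return {"SRE": sre, "LRE": lre, "GLN": gln, "RLN": rln}


# Usage on a 3 x 3 patch with three gray levels.
patch = [[0, 0, 1],
         [1, 1, 1],
         [2, 0, 0]]
print(glrlm_features(glrlm_horizontal(patch, levels=3)))
```

The other three directions would follow the same pattern, scanning along columns or diagonals instead of rows.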

Appendix A

The list of features used for each pixel

Statistical features

1. Mean

2. Variance

3. Absolute deviation

4. Standard deviation

Run-length matrix

1. SRE

2. LRE

3. GLN

4. RLN

5. RP

6. LGRE

7. HGRE

8. SRLGE

9. SRHGE

10. LRLGE

11. LRHGE

SRE – Short-run emphasis; LRE – Long-run emphasis; GLN – Gray-level nonuniformity; RLN – Run-length nonuniformity; RP – Run percentage; LGRE – Low gray-level-run emphasis; HGRE – High gray-level-run emphasis; SRLGE – Short-run low gray-level emphasis; SRHGE – Short-run high gray-level emphasis; LRLGE – Long-run low gray-level emphasis; LRHGE – Long-run high gray-level emphasis

Feature reduction

FDA is a popular method for linear supervised dimension reduction. In this step, the dimension of the extracted features is reduced with FDA by decreasing the within-class scatter and increasing the between-class scatter. FDA is closely related to principal component analysis (PCA), which is also based on linear transformations. PCA has several properties: it minimizes the mean square error in data compression, finds mutually orthogonal directions of maximum variance in the data, and decorrelates the data using orthogonal transformations. In data compression, PCA finds a lower-dimensional linear representation of the data vectors from which the original data can be reconstructed with minimum square error.[37] However, PCA does not take class membership into account. In FDA, by contrast, the transformation maximizes the ratio of between-class variance to within-class variance. The goal is to decrease the variation of the data within each class and to increase it between the classes. Figure 2 depicts an example of the transformation in FDA.

Figure 2

An example for Fisher discriminant analysis: (a) The data before transformation and (b) the same data after transformation

Figure 2a depicts samples of two classes (shown in different colors) and the histograms that result from projecting them onto the line connecting the class means. There is an overlapping area in the projected space. Figure 2b shows the corresponding projection based on FDA, which improves the class separation.[38]
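The effect described for Figure 2 can be reproduced with a small sketch of two-class FDA. It is an illustration only, not the authors' code: it is restricted to two features so that the within-class scatter matrix can be inverted in closed form, the sample points are made up, and the names `fisher_direction` and `project` are hypothetical. The direction is computed as w = Sw⁻¹(m1 − m0), the standard two-class Fisher solution.

```python
def fisher_direction(class0, class1):
    """Two-class Fisher direction w = Sw^-1 (m1 - m0) for 2-D samples."""

    def mean(points):
        return [sum(p[k] for p in points) / len(points) for k in range(2)]

    def scatter(points, m):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for p in points:
            d = [p[0] - m[0], p[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        return s

    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Closed-form 2 x 2 inverse applied to the difference of the class means.
    return [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]


def project(points, w):
    """Project 2-D points onto the direction w (one value per point)."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Made-up data: the classes overlap along the line joining their means
# (the vertical axis here) but separate along the Fisher direction.
class0 = [(1, 2), (2, 3), (3, 3), (4, 5), (5, 5)]
class1 = [(1, 0), (2, 1), (3, 1), (4, 3), (5, 2)]
w = fisher_direction(class0, class1)
print(max(project(class0, w)) < min(project(class1, w)))
```

For the 48-dimensional pixel features of this study, the same formula applies, but the linear system would be solved numerically rather than with a closed-form inverse.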

Semi-supervised learning

Semi-supervised learning is a class of machine learning techniques that uses both labeled and unlabeled data for training. The training set is usually composed of a small amount of labeled data and a large amount of unlabeled data. Because semi-supervised methods need only a small amount of labeled data, they overcome the problem of providing the large number of labeled samples required by supervised methods, while achieving higher accuracy than unsupervised methods. Unlabeled data, combined with a small amount of labeled data, can significantly improve learning accuracy. Acquiring labels usually requires an expert (e.g., to transcribe an audio segment) or a physical experiment (e.g., to determine the three-dimensional structure of a protein or the presence of oil at a particular location). Providing a fully labeled training set may therefore be infeasible because of its cost, which gives semi-supervised learning methods great practical significance. In this study, we use a co-training algorithm for tumor segmentation, which is described in detail below.

The co-training algorithm was introduced by Blum and Mitchell in 1998.[39] In this algorithm, two classifiers are trained on a small set of labeled data using two views of the data. Each classifier then classifies the unlabeled data, selects a limited number of unlabeled samples whose labels are predicted most reliably, and adds these samples to the training set. The classifiers are retrained and the process is repeated.[40] The mechanism of the algorithm is shown in Figure 3. At the beginning, the two classifiers are trained using the limited labeled data and then make decisions for a limited set of unlabeled data: Learner 1 in view 1 and Learner 2 in view 2 label the same fixed set of unlabeled data independently. The newly labeled data are then treated as secondary samples and added to the primary training data of the other classifier, as shown in Figure 3. In other words, the labels that Learner 1 assigns to the unlabeled set are used as a secondary training set for Learner 2, and vice versa. After this stage, the classifiers make decisions for the new, unlabeled test data.

Figure 3

The co-training algorithm
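The exchange of confidently labeled samples between the two learners can be sketched as below. This is a simplified illustration, not the implementation used in this study: the Gaussian Bayes classifier assumes equal class priors, a nearest-centroid classifier is swapped in as a lightweight stand-in for the SVM, and the confidence scores, the parameters `rounds` and `per_round`, and all names are assumptions made for the example.

```python
import math


class GaussianBayes:
    """Gaussian naive Bayes with equal class priors (an assumption)."""

    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            means = [sum(col) / len(rows) for col in zip(*rows)]
            variances = [
                max(sum((v - m) ** 2 for v in col) / len(rows), 1e-6)
                for col, m in zip(zip(*rows), means)
            ]
            self.stats[c] = (means, variances)

    def predict(self, x):
        """Return (label, confidence); confidence is the label's posterior."""
        log_like = {
            c: sum(-0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                   for v, m, s in zip(x, means, variances))
            for c, (means, variances) in self.stats.items()
        }
        top = max(log_like.values())
        weights = {c: math.exp(value - top) for c, value in log_like.items()}
        label = max(weights, key=weights.get)
        return label, weights[label] / sum(weights.values())


class NearestCentroid:
    """Stand-in for the SVM: predicts the class of the closest centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            self.centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(self, x):
        dist = {c: math.dist(x, m) for c, m in self.centroids.items()}
        label = min(dist, key=dist.get)
        total = sum(dist.values())
        # Confidence grows as the winning centroid gets relatively closer.
        confidence = 1.0 - dist[label] / total if total else 1.0
        return label, confidence


def co_train(clf_a, clf_b, view_a, view_b, labels, pool_a, pool_b,
             rounds=3, per_round=2):
    """Each learner pseudo-labels its most confident pool samples and hands
    them to the other learner as secondary training data."""
    train_a = list(zip(view_a, labels))
    train_b = list(zip(view_b, labels))
    remaining = list(range(len(pool_a)))

    def refit():
        clf_a.fit(*zip(*train_a))
        clf_b.fit(*zip(*train_b))

    for _ in range(rounds):
        refit()
        for clf, own, other, target in ((clf_a, pool_a, pool_b, train_b),
                                        (clf_b, pool_b, pool_a, train_a)):
            scored = sorted(((clf.predict(own[i]), i) for i in remaining),
                            key=lambda item: item[0][1], reverse=True)
            for (label, _), i in scored[:per_round]:
                target.append((other[i], label))
                remaining.remove(i)
    refit()
    return clf_a, clf_b


# Usage with two one-dimensional views of the same four labeled samples.
bayes, centroid = co_train(
    GaussianBayes(), NearestCentroid(),
    view_a=[[0.0], [0.2], [1.0], [1.2]], view_b=[[5.0], [5.5], [9.0], [9.5]],
    labels=[0, 0, 1, 1],
    pool_a=[[0.1], [1.1], [0.3], [0.9]], pool_b=[[5.2], [9.2], [5.6], [8.8]],
    rounds=2, per_round=1)
print(bayes.predict([0.05])[0], centroid.predict([9.1])[0])
```

Limiting each round to the `per_round` most confident samples is one common way to keep wrongly labeled samples out of the other learner's training set; the source does not state the selection rule used in this study.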

The Proposed Method

The proposed method is shown in Figure 4, in which the co-training algorithm is used for tumor segmentation in mammogram images. Two expert radiologists extracted the ROIs. A sample ROI is shown in Figure 5.[41-43] In the first stage, two feature sets are extracted for each pixel of the training images, and the features are then reduced by the FDA method, as shown in Figure 4. Then, an image is randomly selected as labeled training data and given to a radiologist for manual segmentation.

Figure 4

Tumor segmentation procedure, according to the co-training algorithm

Figure 5

(a) An example of region of interest for a test image, (b) the output of Bayes method, (c) the output of support vector machine method, and (d) the output of co-training algorithm

In our proposed method, the two classifiers used in the co-training method are a support vector machine (SVM) classifier and a Bayes classifier. A small amount of labeled data is extracted to train the classifiers, and the dimensionality of the features is reduced by FDA. Then, a set of unlabeled data is given to each classifier. The output of each classifier provides the secondary data set that is used to train the other classifier. In the test step, each classifier makes a decision for all pixels in the test image, and the accuracy of each classifier is calculated. The label output by the classifier with the higher accuracy is taken as the final label of each pixel.

Experimental Results

In this study, we have used the MIAS data set, which is available at http://peipa.essex.ac.uk/info/mias.html. The data set contains breast mammography images and their ground truth (GT) segmentations, which have been manually extracted by a radiologist. In the experiments, the GT has been used as the reference for performance evaluation. We used two images for the training process: one labeled image and one unlabeled image. From the labeled image, 500 pixels are chosen (250 samples from the suspicious abnormal regions and 250 samples from the normal regions), so that the same number of samples from each class is used to train the classifiers. Furthermore, 6000 pixels have been selected from the unlabeled image. The output of the classifiers for these pixels provides the new samples, which are added to the training data as shown in Figure 3. We therefore have a total of 500 labeled pixels and 6000 pixels with classifier-assigned (nondeterministic) labels for training the classifiers. In our experiments, 30 images have been used as the test set. To improve the performance, a pixel-based semi-supervised classification method based on texture analysis has been used.[1] Accordingly, the results are reported over all the pixels of the 30 images.

Figure 5a shows a sample ROI extracted from a test image. The outputs of the Bayes classifier and the SVM classifier are shown in Figure 5b and c, respectively. The output of the co-training algorithm, which segments the tumor from the ROI, is shown in Figure 5d. Figure 6 shows the receiver operating characteristic (ROC) curves for the Bayes classifier, the SVM classifier, and the co-training method. The performance of the classifiers has been reported using ROC analysis, which is based on statistical decision theory and has been widely used for the assessment of clinical performance. We compare the performance of supervised learning with the two classifiers against the semi-supervised learning method proposed in this study. The co-training method outperforms the other methods.

Figure 6

Receiver operating characteristic curves comparing the performance of the supervised and semi-supervised learning methods

The following measures have been used for the evaluation, where TP, TN, FP, and FN denote the numbers of true-positive, true-negative, false-positive, and false-negative pixels, with tumor as the positive class.

Accuracy: this criterion measures the agreement between the labels assigned by the proposed method and the true labels:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Positive predictive value (PPV): the percentage of pixels classified as positive (tumor) that are truly tumor:
$$\mathrm{PPV} = \frac{TP}{TP + FP}$$

Negative predictive value (NPV): the percentage of pixels classified as negative (nontumor) that are truly nontumor:
$$\mathrm{NPV} = \frac{TN}{TN + FN}$$

Sensitivity: the percentage of tumor pixels recognized by the test:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

Specificity: the percentage of nontumor pixels recognized by the test:
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$

The ROC has been used to evaluate the accuracy of the system. The area under the curve, called Az, is a measure of the success of the system. As an example, the output of the proposed co-training method is compared with the watershed segmentation[40] and region-growing[4] approaches for five images in Table 1.
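The evaluation measures listed above can be computed from per-pixel labels as in the following minimal sketch. It assumes binary labels with 1 for tumor and 0 for nontumor, and that every denominator is non-zero. The area under the ROC curve is computed here as the empirical probability that a randomly chosen tumor pixel receives a higher score than a randomly chosen nontumor pixel, which equals the area under the empirical ROC curve; whether the reported Az values were obtained this way or from a fitted curve is not stated in the source. The function names are hypothetical.

```python
def evaluation_measures(predicted, truth):
    """Accuracy, PPV, NPV, sensitivity and specificity for binary labels
    (1 = tumor, 0 = nontumor). Assumes every denominator is non-zero."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == 1 and t == 1 for p, t in pairs)
    tn = sum(p == 0 and t == 0 for p, t in pairs)
    fp = sum(p == 1 and t == 0 for p, t in pairs)
    fn = sum(p == 0 and t == 1 for p, t in pairs)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def area_under_roc(scores, truth):
    """Empirical area under the ROC curve: the fraction of (tumor, nontumor)
    pixel pairs in which the tumor pixel has the higher score; ties count half."""
    positives = [s for s, t in zip(scores, truth) if t == 1]
    negatives = [s for s, t in zip(scores, truth) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Usage on six made-up pixels.
truth = [1, 0, 0, 0, 1, 1]
predicted = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.4]
print(evaluation_measures(predicted, truth))
print(area_under_roc(scores, truth))
```

The pairwise comparison is quadratic in the number of pixels; for the tens of thousands of test pixels used here, a rank-based computation would be preferable.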

Table 1

Comparison of the proposed co-training algorithm with watershed and region-growing segmentation on five test images

Accuracy (%)	Co-training method	Watershed segmentations	Region growing
Test image 1	91.68	90.05	89.80
Test image 2	92.67	91.35	91.78
Test image 3	89.52	82.12	79.22
Test image 4	91.00	92.28	90.58
Test image 5	76.19	79.01	79.81

Table 2 reports that when limited labeled data are used in the classifiers, the accuracy is 43.17% for SVM and 87.52% for Bayes. With the same limited labeled data, the accuracy of the co-training algorithm is 94.04%.

Table 2

Comparison of performance of the supervised learning and the semi-supervised learning methods

Learning approach	Learning algorithm	Labeled data	Unlabeled data	Test data	Accuracy (%)
Supervised	SVM	200	0	27,054	43.17
Supervised	Bayes	200	0	27,054	87.52
Semi-supervised (co-training algorithm)	Co-training	200	450	27,054	94.04

SVM – Support vector machine

The outputs of the supervised and semi-supervised learning methods can also be compared over the whole test set. Table 3 shows the average performance over the 30 images for SVM and Bayes as supervised classifiers and for the proposed co-training method as a semi-supervised classifier. The average accuracy and Az of the co-training algorithm are higher than those of the supervised methods.

Table 3

The performance of supervised and semi-supervised methods according to mean and standard deviation

	SVM		Bayes		Co-training	
	Mean (%)	SD	Mean (%)	SD	Mean (%)	SD
Accuracy	79.56	7.75	78.67	7.80	80.54	7.77
PPV	83.73	20.53	89.11	16.69	87.19	17.67
NPV	65.34	29.67	55.89	30.47	63.37	29.71
Sensitivity	83.00	9.87	78.85	10.05	82.14	9.98
Specificity	80.10	11.99	84.35	9.80	83.67	9.89
Az	0.76	0.07	0.77	0.06	0.78	0.07

SVM – Support vector machine; SD – Standard deviation; PPV – Positive predictive value; NPV – Negative predictive value


Conclusions

In this study, a semi-supervised learning method is proposed for tumor segmentation in mammogram images. It was shown that the co-training algorithm achieves a higher accuracy in tumor segmentation than the supervised methods. The advantage of the proposed method is that it does not require a large amount of labeled data for classification, and hence it is computationally tractable. A disadvantage is that, like the other methods, its accuracy is low on low-quality images. The main reason is that the true labels of the secondary training data are unknown; therefore, the output of the classifier may be biased toward one of the classes. Future work includes using more than two classifiers in co-training. Probability rules can also be used for label prediction, which may improve the performance of the co-training method. Additional knowledge about the secondary training data can also be used to prevent the classifiers from becoming biased, which may improve the accuracy of the co-training method and will be examined in our future work.

Financial support and sponsorship

None.

Conflicts of interest

There are no conflicts of interest.

BIOGRAPHIES

Hanie Azary received the B.Sc. degree in computer engineering from Payam Noor University of Tehran in 2011 and the M.Sc. degree in Artificial Intelligence from Iran University of Science and Technology, Tehran, Iran in 2015. Her research interests include Medical Image Processing, Machine Vision, and Pattern Recognition. Email: hanie.azary@yahoo.com

Monireh Abdoos received her Ph.D. in Artificial Intelligence from Iran University of Science and Technology in 2013. She is currently an assistant professor at the Faculty of Computer Science and Engineering in Shahid Beheshti University, Tehran, Iran. Her research interests include Multi-agent Systems, Machine Learning, and Intelligent Transportation Systems. Email: m_abdoos@sbu.ac.ir

22 in total

1. Neural network-based segmentation of dynamic MR mammographic images.

Authors: Robert Lucht; Stefan Delorme; Gunnar Brix
Journal: Magn Reson Imaging Date: 2002-02 Impact factor: 2.546

2. A concentric morphology model for the detection of masses in mammography.

Authors: Nevine H Eltonsy; Georgia D Tourassi; Adel S Elmaghraby
Journal: IEEE Trans Med Imaging Date: 2007-06 Impact factor: 10.048

3. Computer assistance for MR based diagnosis of breast cancer: present and future challenges.

Authors: Sarah Behrens; Hendrik Laue; Matthias Althaus; Tobias Boehler; Bernd Kuemmerlen; Horst K Hahn; Heinz-Otto Peitgen
Journal: Comput Med Imaging Graph Date: 2007-03-21 Impact factor: 4.790

4. Breast mass segmentation in mammography using plane fitting and dynamic programming.

Authors: Enmin Song; Luan Jiang; Renchao Jin; Lin Zhang; Yuan Yuan; Qiang Li
Journal: Acad Radiol Date: 2009-04-10 Impact factor: 3.173

Review 5. A review of automatic mass detection and segmentation in mammographic images.

Authors: Arnau Oliver; Jordi Freixenet; Joan Martí; Elsa Pérez; Josep Pont; Erika R E Denton; Reyer Zwiggelaar
Journal: Med Image Anal Date: 2009-12-29 Impact factor: 8.545

6. Breast mass contour segmentation algorithm in digital mammograms.

Authors: Tolga Berber; Adil Alpkocak; Pinar Balci; Oguz Dicle
Journal: Comput Methods Programs Biomed Date: 2012-12-25 Impact factor: 5.428

7. A comparison of two methods for the segmentation of masses in the digital mammograms.

Authors: R B Dubey; M Hanmandlu; S K Gupta
Journal: Comput Med Imaging Graph Date: 2009-10-13 Impact factor: 4.790

8. Identification of masses in digital mammogram using gray level co-occurrence matrices.

Authors: A Mohd Khuzi; R Besar; Wmd Wan Zaki; Nn Ahmad
Journal: Biomed Imaging Interv J Date: 2009-07-01

9. Introducing kernel based morphology as an enhancement method for mass classification on mammography.

Authors: Azardokht Amirzadi; Reza Azmi
Journal: J Med Signals Sens Date: 2013-04

10. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation.

Authors: Reza Azmi; Boshra Pishgoo; Narges Norozi; Samira Yeganeh
Journal: J Med Signals Sens Date: 2013-04
