
A gamma-gaussian mixture model for detection of mitotic cells in breast cancer histopathology images.

Adnan Mujahid Khan, Hesham Eldaly, Nasir M Rajpoot.

Abstract

In this paper, we propose a statistical approach for mitosis detection in breast cancer histological images. The proposed algorithm models the pixel intensities in mitotic and non-mitotic regions with a Gamma-Gaussian mixture model (GGMM) and employs context aware post-processing (CAPP) to reduce false positives. Experimental results demonstrate the ability of this simple yet effective method to detect mitotic cells (MCs) in standard H & E breast cancer histology images.
CONTEXT: Counting MCs in breast cancer histopathology images is one of three components (the other two being tubule formation and nuclear pleomorphism) required for developing computer assisted grading of breast cancer tissue slides. This is very challenging because the biological variability of MCs makes their detection extremely difficult. In addition, standard H & E stains chromatin rich structures, such as nuclei, apoptotic cells, and MCs, dark blue, so it becomes extremely difficult to detect MCs given that the former two are densely localized in the tissue sections.
AIMS: In this paper, a robust MC detection technique is developed and tested on 35 breast histopathology images belonging to five different tissue slides.
SETTINGS AND DESIGN: Our approach mimics a pathologist's approach to MC detection. The idea is (1) to isolate tumor areas from non-tumor areas (lymphoid/inflammatory/apoptotic cells), (2) to search for MCs in the reduced space by statistically modeling the pixel intensities from mitotic and non-mitotic regions, and finally (3) to evaluate the context of each potential MC in terms of its texture.
MATERIALS AND METHODS: Our experimental dataset consisted of 35 digitized images of breast cancer biopsy slides with paraffin embedded sections stained with H & E and scanned at ×40 using an Aperio ScanScope slide scanner.
STATISTICAL ANALYSIS USED: We propose GGMM for detecting MCs in breast histology images. Image intensities are modeled as random variables sampled from one of two distributions: Gamma and Gaussian. Intensities from MCs are modeled by a Gamma distribution and those from non-mitotic regions are modeled by a Gaussian distribution. The Gamma-Gaussian combination was chosen mainly because the characteristics of these distributions match well with the data they model. The experimental results show that the proposed system achieves a high sensitivity of 0.82 with a positive predictive value (PPV) of 0.29. Employing CAPP on these results produces a 241% increase in PPV at the cost of less than a 15% decrease in sensitivity.
CONCLUSIONS: In this paper, we presented a GGMM for detection of MCs in breast cancer histopathological images. In addition, we introduced CAPP as a tool to increase the PPV with a minimal loss in sensitivity. We evaluated the performance of the proposed detection algorithm in terms of sensitivity and PPV over a set of 35 breast histology images selected from five different tissue slides and showed that a reasonably high value of sensitivity can be retained while increasing the PPV. Our future work will aim at increasing the PPV further by modeling the spatial appearance of regions surrounding mitotic events.

Keywords:  Breast cancer grading; histopathology image analysis; mitotic cell detection; statistical modeling of mitotic cells

Year:  2013        PMID: 23858386      PMCID: PMC3709430          DOI: 10.4103/2153-3539.112696

Source DB:  PubMed          Journal:  J Pathol Inform


INTRODUCTION

Counting of mitotic cells (MCs) in breast histopathology images is one of three components (the other two being tubule formation and nuclear pleomorphism) required for developing computer assisted grading of breast cancer tissue slides.[1] This is very challenging because the biological variability of MCs makes their detection extremely difficult [Figure 1]. In addition, standard H & E stains chromatin rich structures, such as nuclei, apoptotic cells, and MCs, dark blue, so it becomes extremely difficult to detect MCs given that the former two are densely localized in the tissue sections. As a consequence, two categories of relevant work have been reported in the literature. The first uses an additional stain (e.g., PHH3) to stain MCs exclusively and then detects the exclusively stained MCs in the images.[2] The second uses a video sequence to detect MCs over time by incorporating spatio-temporal information.[3] Since the exclusive stain adds cost and videos are not used in standard histopathological practice, a gap exists in the literature.
Figure 1

How hard is it to identify mitotic cells in breast?

In this paper, a robust MC detection technique is developed and tested on 35 breast histopathology images belonging to five different tissue slides. To the best of our knowledge, there is no existing method in the literature for detection of MCs in standard H & E breast histology images. The proposed method mimics a pathologist's approach to MC detection under the microscope. The main idea is to isolate the tumor region from non-tumor areas (lymphoid/inflammatory/apoptotic cells) and to search for MCs in the reduced space by statistically modeling the pixel intensities from mitotic and non-mitotic regions. To further enhance the positive predictive value (PPV), context aware post-processing (CAPP) is introduced. The experimental results show that the proposed system achieves a high sensitivity of 0.82 with a PPV of 0.29. Employing CAPP on these results produces a 241% increase in PPV at the cost of less than a 15% decrease in sensitivity.

THE PROPOSED ALGORITHM

Stain Normalization

Tissue staining is commonly used to highlight distinct structures in histology images. Among the many different stains, H & E is one of the most commonly used. It selectively stains nuclei blue and cytoplasm pink. Although staining enables better visualization of tissue structures, stained images vary considerably in color and intensity because histopathological workflows are not standardized. Stain normalization is used to achieve a consistent color and intensity appearance. We found the algorithm proposed by Magee et al.[4] very effective for normalizing histology images.

Tumor Segmentation

Breast cancer histology images can be divided into two regions: tumor and non-tumor. MCs may exist in both tumor and non-tumor regions; however, only MCs present in tumor regions are considered for grading. Therefore, an intelligent MC detection system must first remove non-tumor areas from the tissue slide in order to minimize the search space. We used a feature based texture segmentation framework, random projections with ensemble clustering,[5] to segment tumor regions. Broadly, the algorithm follows this pipeline: (1) a library of texture features is computed over a range of scales and orientations, (2) a low dimensional embedding (using random projections) is performed to avoid overfitting and the curse of dimensionality, and finally (3) tumor segmentation is performed in the low dimensional space. This produces an accurate and totally unsupervised tumor segmentation. To account for MCs present on the boundary between tumor and non-tumor regions, morphological dilation is applied to the tumor segmentation results. Although this increases the chances of detecting boundary MCs, it also brings some lymphoid/inflammatory cells into the tumor regions, and these appear as false positives (FPs) when detecting MCs in breast histology slides.
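Step (2) of the pipeline above, the low dimensional embedding, can be illustrated with a minimal sketch. It assumes a plain Gaussian random projection; the function name, the seed argument, and the 1/sqrt(k) scaling are illustrative assumptions, and the framework cited as [5] may construct and combine its projections differently.

```python
import math
import random


def random_projection(features, k, seed=0):
    """Project d-dimensional feature vectors onto k random directions.

    Assumption: entries of the projection matrix are drawn from N(0, 1/k),
    a common choice that preserves squared norms in expectation.
    """
    rng = random.Random(seed)
    d = len(features[0])
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * vec[j] for j in range(d)) for row in matrix]
            for vec in features]


if __name__ == "__main__":
    # Hypothetical texture features: 5 pixels, 40 features each.
    rng = random.Random(1)
    pixels = [[rng.random() for _ in range(40)] for _ in range(5)]
    embedded = random_projection(pixels, k=8)
    print(len(embedded), len(embedded[0]))
```

Clustering would then run on the k-dimensional vectors rather than on the full texture feature library.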

Statistical Modeling of MCs

MCs appear as relatively dark, jagged, and irregularly textured structures [Figure 1]. Owing to sectioning artifacts, some appear too dim to notice with the naked eye. In terms of shape, color, and textural characteristics, the lymphoid/inflammatory cells and apoptotic cells that are densely present in tissue slides are very similar to MCs and could therefore easily be confused with them. In this paper, we propose a gamma-gaussian mixture model (GGMM) for detecting MCs in breast histology images. Image intensities (L channel of the L*a*b* color space) are modeled as random variables sampled from one of two distributions: Gamma and Gaussian. Intensities from MCs are modeled by a Gamma distribution and those from non-mitotic regions are modeled by a Gaussian distribution. The Gamma-Gaussian combination was chosen mainly because the characteristics of these distributions match well with the data they model [Figure 2].
Figure 2

Marginal distributions (vertical bars) and fitted models (solid lines) by the two-component gamma-gaussian mixture model

GGMM

Figure 2 shows two marginal distributions (solid lines) and their fitted models (dotted lines). The left and right marginal distributions show the probability distributions of pixels belonging to mitotic and non-mitotic regions, respectively. The GGMM achieved a close fit to the marginal distributions. The GGMM is a parametric technique for estimating a probability density function. In our context, it can be formulated as follows. For pixel intensities x, the proposed mixture model is given by:

$$f(x; \theta) = \rho_1 \, \Gamma(x, \alpha, \beta) + \rho_2 \, G(x, \mu, \sigma) \tag{1}$$

where ρ1 and ρ2 represent the mixing proportions (priors) of intensities belonging to mitotic and non-mitotic regions, and ρ1 + ρ2 = 1. Γ(x, α, β) represents the gamma density function parameterized by α (the shape parameter) and β (the scale parameter). G(x, μ, σ) represents the Gaussian density function parameterized by μ (mean) and σ (standard deviation). θ = [α, β, μ, σ, ρ1, ρ2] represents the vector of all unknown parameters in the model.
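As a minimal sketch of the mixture density described above, the following evaluates a weighted sum of a gamma density (shape α, scale β) and a Gaussian density (mean μ, standard deviation σ). The function names and the parameter values in the usage lines are illustrative assumptions, not values from the paper.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta; zero for x <= 0."""
    if x <= 0:
        return 0.0
    log_p = ((alpha - 1) * math.log(x) - x / beta
             - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_p)


def gaussian_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def ggmm_pdf(x, alpha, beta, mu, sigma, rho1, rho2):
    """Two-component mixture: rho1 * gamma + rho2 * gaussian."""
    return rho1 * gamma_pdf(x, alpha, beta) + rho2 * gaussian_pdf(x, mu, sigma)


if __name__ == "__main__":
    # Hypothetical parameters chosen only to exercise the functions.
    for intensity in (10.0, 40.0, 70.0):
        print(intensity, ggmm_pdf(intensity, 4.0, 5.0, 70.0, 8.0, 0.3, 0.7))
```

Working in log space for the gamma term avoids overflow in the gamma function for large shape values.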

Parameter Estimation

To estimate the unknown parameters (θ), we employ maximum likelihood estimation (MLE). Given image intensities x_i, i = 1, 2, …, n, where n is the number of pixels, the log-likelihood function (l) of the parameter vector θ is given by

$$l(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta) \tag{2}$$

where f(x_i; θ) is the mixture density function in equation (1). The MLE of θ can be represented by

$$\hat{\theta} = \arg\max_{\theta} \, l(\theta)$$

A convenient approach to obtaining a numerical solution to the above maximization problem is provided by the expectation maximization (EM) algorithm.[6] In our context, the EM algorithm can be set up as follows. Let z_ik, k = 1, 2, be indicator variables showing the component membership of each pixel x_i in the mixture model:

$$z_{ik} = \begin{cases} 1 & \text{if } x_i \text{ belongs to component } k \\ 0 & \text{otherwise} \end{cases}$$

Note that these indicator variables are hidden (unobserved). The log-likelihood (2) can be extended as follows:

$$l_c(\theta) = \sum_{i=1}^{n} \left[ z_{i1} \log\left(\rho_1 \, \Gamma(x_i, \alpha, \beta)\right) + z_{i2} \log\left(\rho_2 \, G(x_i, \mu, \sigma)\right) \right]$$

The EM algorithm finds the MLE of θ iteratively, as outlined in Algorithm 1. Let θ(m) be the estimate of θ after m iterations of Algorithm 1. The EM algorithm seeks the MLE of the marginal likelihood by iteratively applying Expectation and Maximization steps.
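The iteration described above can be sketched as follows. This is not a reproduction of Algorithm 1: the gamma shape has no closed-form maximum likelihood update, so this sketch substitutes weighted moment matching in the M-step, which is an approximation. The function names, the starting values, the synthetic data, and the stopping rule (change in log-likelihood below a tolerance) are all assumptions.

```python
import math
import random


def gamma_pdf(x, alpha, beta):
    if x <= 0:
        return 0.0
    return math.exp((alpha - 1) * math.log(x) - x / beta
                    - math.lgamma(alpha) - alpha * math.log(beta))


def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def weighted_mean_var(xs, ws):
    """Weighted mean, weighted variance, and total weight."""
    total = sum(ws)
    mean = sum(w * x for x, w in zip(xs, ws)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(xs, ws)) / total
    return mean, var, total


def fit_ggmm(xs, theta, max_iter=500, tol=0.01):
    """EM-style fit; theta = (alpha, beta, mu, sigma, rho1, rho2)."""
    alpha, beta, mu, sigma, rho1, rho2 = theta
    prev_ll = -math.inf
    for _ in range(max_iter):
        # E-step: responsibility of the gamma component for each intensity.
        resp1, ll = [], 0.0
        for x in xs:
            p1 = rho1 * gamma_pdf(x, alpha, beta)
            p2 = rho2 * gaussian_pdf(x, mu, sigma)
            total = max(p1 + p2, 1e-300)  # guard against log(0)
            resp1.append(p1 / total)
            ll += math.log(total)
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
        # M-step: Gaussian updates are exact; gamma updates use moment
        # matching (mean = alpha * beta, variance = alpha * beta ** 2).
        resp2 = [1.0 - r for r in resp1]
        m1, v1, n1 = weighted_mean_var(xs, resp1)
        mu, v2, n2 = weighted_mean_var(xs, resp2)
        alpha, beta = m1 * m1 / v1, v1 / m1
        sigma = math.sqrt(v2)
        rho1, rho2 = n1 / len(xs), n2 / len(xs)
    return (alpha, beta, mu, sigma, rho1, rho2), ll


if __name__ == "__main__":
    # Synthetic intensities, not data from the paper.
    rng = random.Random(0)
    data = ([rng.gammavariate(4.0, 5.0) for _ in range(300)]
            + [rng.gauss(70.0, 8.0) for _ in range(700)])
    theta, ll = fit_ggmm(data, (2.0, 10.0, 60.0, 15.0, 0.5, 0.5))
    print(theta, ll)
```

Because the gamma update is only approximate, the log-likelihood is not guaranteed to increase at every iteration, unlike exact EM.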

Classification

The posterior probabilities of a pixel x_i belonging to class 1 (Mitotic) or class 2 (Non-Mitotic) are calculated as follows:

$$P(1 \mid x_i) = \frac{\rho_1 \, \Gamma(x_i, \alpha, \beta)}{f(x_i; \theta)}, \qquad P(2 \mid x_i) = \frac{\rho_2 \, G(x_i, \mu, \sigma)}{f(x_i; \theta)}$$

Given the pixel-wise posterior probability maps, Otsu thresholding is then used to classify mitotic and non-mitotic pixels. It was found empirically that the area of an MC was between 60 and 1,000 pixels. Therefore, area thresholding is performed to remove all potentially mitotic regions whose area falls outside this range.
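The two post-classification steps described above, Otsu thresholding of the posterior map and area thresholding of the resulting regions, can be sketched as below. The histogram bin count, the use of 8-connectivity, and the inclusive treatment of the 60 and 1,000 pixel bounds are assumptions; the function names are illustrative.

```python
from collections import deque


def otsu_threshold(values, bins=256):
    """Otsu threshold for values in [0, 1]: maximizes between-class variance."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0, sum0 = 0, 0.0
    best_var, best_bin = -1.0, 0
    for t in range(bins - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if between > best_var:
            best_var, best_bin = between, t
    # Values at or above the returned threshold form the upper class.
    return (best_bin + 1) / bins


def filter_by_area(mask, min_area=60, max_area=1000):
    """Keep 8-connected regions whose pixel count is in [min_area, max_area]."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if min_area <= len(region) <= max_area:
                for y, x in region:
                    out[y][x] = 1
    return out
```

A typical use would threshold the mitotic posterior map with `otsu_threshold`, build a binary mask from it, and pass that mask to `filter_by_area`.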

CAPP

The algorithmic steps described so far achieve 86% sensitivity; however, given the large number of similar looking objects (apoptotic cells, lymphoid/inflammatory cells, etc.), a number of FPs are also obtained. To reduce the FPs without significantly reducing sensitivity, CAPP is performed on the classification results. A small context window [Figure 3] is defined around the bounding box of each potential MC. In each context window, four representative features are computed over a set of textural features. The representative features are used to train a support vector machine (SVM) classifier with a Gaussian kernel. The trained classifier is then used to classify unseen candidate contexts of MCs.
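Defining a context window around a candidate's bounding box can be sketched as a simple expand-and-clip operation. The bounding box convention (top, left, bottom, right, with exclusive bottom and right), the margin parameter, and the function names are assumptions made for illustration.

```python
def context_window(bbox, image_shape, margin):
    """Expand a (top, left, bottom, right) box by margin and clip to the image.

    bottom and right are exclusive, matching Python slice conventions.
    """
    top, left, bottom, right = bbox
    height, width = image_shape
    return (max(top - margin, 0), max(left - margin, 0),
            min(bottom + margin, height), min(right + margin, width))


def crop(image, window):
    """Return the sub-image (list of row slices) covered by the window."""
    top, left, bottom, right = window
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # Hypothetical 12 x 20 image and candidate box near the bottom edge.
    image = [[0] * 20 for _ in range(12)]
    window = context_window((5, 5, 10, 10), (12, 20), margin=4)
    patch = crop(image, window)
    print(window, len(patch), len(patch[0]))
```

Clipping keeps candidates near the image border usable, at the cost of a smaller patch for those candidates.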
Figure 3

Four examples of 50 × 50 context patches, cropped around the bounding box of candidate MCs (detected using the proposed algorithm). First 2 (from left) are false positives, last 2 are MCs

RESULTS

Our experimental dataset consisted of 35 digitized images of breast cancer biopsy slides with paraffin embedded sections stained with H & E and scanned at ×40 using an Aperio ScanScope slide scanner. After stain normalization, background removal, and unsupervised tumor segmentation over all 35 images, seven images were selected to extract mitotic and non-mitotic pixel intensities (L channel of the L*a*b* color space) for model fitting using GGMM. We chose 500 iterations and a tolerance (f = 0.01) for the EM algorithm. Although EM provides estimates of the priors (ρ1 and ρ2), a more accurate estimate of the priors (ρ1 = 0.0014 and ρ2 = 0.9986), based on the ratio of mitotic to non-mitotic data used for model fitting, was used instead. Figure 4 shows the plot of sensitivity against PPV when the area threshold is varied on the candidate MCs.
Figure 4

Plot of sensitivity versus positive predictive value (PPV) when the area threshold is varied on the candidate mitotic cells. High sensitivity and low PPV are obtained when small values of the area threshold are used. Table 1 shows how the introduction of CAPP improves PPV without significantly degrading sensitivity

The set of textural features extracted from a window of size 30 × 30 pixels around the bounding box of each candidate mitosis is as follows: 32 Phase Gradient (PG) features (16 orientations, 2 scales),[7] 1 roughness feature, and 1 entropy feature. From each of these 34 features, 4 representative features were computed: (1) mean, (2) standard deviation, (3) skewness, and (4) kurtosis. This gave a 136-dimensional feature vector for each pixel inside the context window. The resulting 136-dimensional vector was used in training and testing of the SVM.
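The arithmetic behind the 136 dimensions (34 features × 4 statistics) can be illustrated with a short sketch. It assumes each textural feature is available as a flat list of responses over one context window and that the four statistics are pooled over that window; it also assumes population (biased) moments and non-excess kurtosis, none of which the text specifies. The computation of the PG, roughness, and entropy features themselves is not shown.

```python
import math


def summary_stats(values):
    """Mean, standard deviation, skewness, and kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    if std == 0:
        # Constant response: higher moments are undefined, report zeros.
        return [mean, 0.0, 0.0, 0.0]
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return [mean, std, skew, kurt]


def context_descriptor(feature_maps):
    """Concatenate the four statistics of every feature map in a window."""
    descriptor = []
    for fmap in feature_maps:
        descriptor.extend(summary_stats(fmap))
    return descriptor


if __name__ == "__main__":
    # 34 hypothetical feature maps, each flattened from a 30 x 30 window.
    maps = [[float((i * 7 + j) % 11) for j in range(900)] for i in range(34)]
    print(len(context_descriptor(maps)))
```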
Table 1

Quantitative comparison of sensitivity and PPV with and without CAPP for a fixed area threshold of 120. By employing CAPP, PPV is doubled on unseen data without drastically reducing the sensitivity (i.e., by less than 15%)

Since the data consisting of candidate MCs identified before CAPP was applied was unbalanced (mitotic 29.1%, non-mitotic 70.9%), a balanced mix of mitotic and non-mitotic examples was randomly selected as training data. A total of 69.90% of the data was used for training and the remaining 30.10% for testing. Grid search was used to find optimal parameters for the Gaussian kernel of the SVM. Figure 5 demonstrates the efficacy of the proposed MC detection algorithm.
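One way to obtain a balanced mix from unbalanced candidates is random undersampling of the majority class, sketched below. The text does not say whether balancing was applied before or after the train/test split, so this sketch assumes the whole set is balanced first and then split; the function name, the label encoding (1 = mitotic, 0 = non-mitotic), and the seed are illustrative assumptions.

```python
import random


def balanced_split(candidates, train_fraction=0.699, seed=0):
    """Undersample the larger class, shuffle, and split into train and test.

    candidates: list of (features, label) pairs, label 1 or 0.
    """
    rng = random.Random(seed)
    positives = [c for c in candidates if c[1] == 1]
    negatives = [c for c in candidates if c[1] == 0]
    n = min(len(positives), len(negatives))
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    cut = round(train_fraction * len(balanced))
    return balanced[:cut], balanced[cut:]


if __name__ == "__main__":
    # Hypothetical candidates with a roughly 29% / 71% class mix.
    data = [([float(i)], 1) for i in range(29)] + [([float(i)], 0) for i in range(71)]
    train, test = balanced_split(data)
    print(len(train), len(test))
```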
Figure 5

Visual results of mitotic cells (MC) detection in a sample image: (a) Original image with ground truth marked MCs shown in yellow color; (b) Results of Tumor segmentation (as outlined in Section 2.2) where non-tumor areas are shown in a slightly darker contrast with blue boundaries; (c) Results of MC detection (in yellow color) without CAPP (Sensitivity = 0.87, positive predictive value [PPV] = 0.54) and (d) Results of MC detection (in yellow color) with CAPP (Sensitivity = 0.87, PPV = 0.87)

A higher penalty for misclassification in the SVM was set for the mitotic class, since the original data was unbalanced. Table 1 provides details of the quantitative results obtained with five-fold cross-validation. According to these results, PPV was increased by more than 200% at the cost of less than a 15% reduction in sensitivity.
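For reference, the two evaluation measures and the relative change between two values can be computed as in the sketch below. The function names are illustrative; the usage lines apply the relative change to the sample-image PPV values reported in the Figure 5 caption (0.54 without CAPP, 0.87 with CAPP), not to the Table 1 values, which are not reproduced in this text.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of ground truth MCs that were detected."""
    return true_positives / (true_positives + false_negatives)


def ppv(true_positives, false_positives):
    """Fraction of detections that are ground truth MCs."""
    return true_positives / (true_positives + false_positives)


def relative_change(before, after):
    """Relative change from before to after; 1.0 corresponds to +100%."""
    return (after - before) / before


if __name__ == "__main__":
    # PPV on the Figure 5 sample image, without and with CAPP.
    print(f"{relative_change(0.54, 0.87):.1%}")
```

Under this definition, a doubling of PPV is a relative change of 100%, and an unchanged sensitivity (0.87 to 0.87 in Figure 5) is a relative change of 0%.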

CONCLUSION

In this paper, we presented a GGMM for detection of MCs in breast cancer histopathological images. In addition, we introduced CAPP as a tool to increase the PPV with a minimal loss in sensitivity. We evaluated the performance of the proposed detection algorithm in terms of sensitivity and PPV over a set of 35 breast histology images selected from five different tissue slides and showed that a reasonably high value of sensitivity can be retained while increasing the PPV. Our future work will aim at increasing the PPV further by modeling the spatial appearance of regions surrounding mitotic events.

1.  Automated mitosis detection of stem cell populations in phase-contrast microscopy images.

Authors:  Seungil Huh; Dai Fei Elmer Ker; Ryoma Bise; Mei Chen; Takeo Kanade
Journal:  IEEE Trans Med Imaging       Date:  2011-03       Impact factor: 10.048

2.  Multi-resolution graph-based analysis of histopathological whole slide images: application to mitotic cell extraction and visualization.

Authors:  Vincent Roullier; Olivier Lézoray; Vinh-Thong Ta; Abderrahim Elmoataz
Journal:  Comput Med Imaging Graph       Date:  2011-05-19       Impact factor: 4.790

3.  Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up.

Authors:  C W Elston; I O Ellis
Journal:  Histopathology       Date:  1991-11       Impact factor: 5.087

4.  HyMaP: A hybrid magnitude-phase approach to unsupervised segmentation of tumor areas in breast cancer histology images.

Authors:  Adnan M Khan; Hesham El-Daly; Emma Simmons; Nasir M Rajpoot
Journal:  J Pathol Inform       Date:  2013-03-30