R Rashmi, Keerthana Prasad, Chethana Babu K Udupa.
Abstract
Breast cancer in women is the second most common cancer worldwide. Early detection of breast cancer can reduce the risk to the patient's life. Non-invasive techniques such as mammograms and ultrasound imaging are popularly used to detect the tumour. However, histopathological analysis is necessary to determine the malignancy of the tumour, as it analyses the tissue at the cellular level. Manual analysis of these slides is time-consuming, tedious, subjective and susceptible to human error. Also, the interpretation of these images is at times inconsistent between laboratories. Hence, a Computer-Aided Diagnostic system that can act as a decision support system is the need of the hour. Moreover, recent developments in computational power and memory capacity have led to the application of computer tools and medical image processing techniques to process and analyse breast cancer histopathological images (BCHI). This review paper summarizes various traditional and deep learning based methods developed to analyse BCHI. Initially, the characteristics of BCHI are discussed. A detailed discussion of the various potential regions of interest, which is crucial for the development of Computer-Aided Diagnostic systems, is presented. We summarize the recent trends and choices made during the selection of medical image processing techniques. Finally, a detailed discussion of the various challenges involved in the analysis of BCHI is presented, along with the future scope.
Keywords: Breast cancer; Deep learning; H&E Stains; Histopathological images; Image classification; Image segmentation; Machine learning
Year: 2021 PMID: 34860316 PMCID: PMC8642363 DOI: 10.1007/s10916-021-01786-9
Source DB: PubMed Journal: J Med Syst ISSN: 0148-5598 Impact factor: 4.460
Fig. 1 Histopathological types of breast cancer [10]
Fig. 2 Microscopic patterns of benign breast tumor: (a) Fibroadenoma (Intracanalicular pattern), (b) Fibroadenoma (Pericanalicular pattern), (c) Phyllodes tumor, (d) Intraductal papilloma
Fig. 3 Microscopic patterns of Noninvasive (In situ) carcinoma: (a) Intraductal carcinoma, (b) Lobular carcinoma
Fig. 4 Microscopic patterns of Invasive carcinoma: (a) IDC, (b) Invasive lobular carcinoma, (c) Medullary carcinoma, (d) Mucinous carcinoma, (e) Papillary carcinoma, (f) Tubular carcinoma, (g) Adenoid cystic carcinoma, (h) Secretory carcinoma, (i) Inflammatory carcinoma, (j) Carcinoma with metaplasia
Fig. 5 Histopathological image challenges: (a) artefact, (b) tissue folding, (c) thick sectioning, (d) air bubbles, (e) thin sectioning and (f) blurring
Overview of the publicly available BCHI datasets
| Dataset | No. of images/slides | Magnification | Details |
|---|---|---|---|
| BreakHis [ | 7909 | 40X, 100X, 200X, 400X | Benign=2480, Malignant=5429; 700*460 pixels; PNG format |
| IDC [ | 162 | 40X | 198,73=IDC negative, 78=IDC positive patches from 162 slides; 1360*1024 pixels; tiff format |
| BACH [ | 430 | - | 400=Microscopy images (2048*1536 pixels), image-wise label; 30=Whole-slide images (42113*62625 pixels), pixel-wise label; tiff format for microscopy images; .svs format for WSI |
| TUPAC-2016 [ | 821 | 40X | 500=training; 321=testing |
| Camelyon-2016 [ | 400 | 40X, 10X, 1X | WSIs of sentinel lymph node of breast cancer |
| Camelyon-2017 [ | 200 | 40X | WSIs of sentinel lymph node of breast cancer |
| MITOS-ATYPIA-14 [ | - | 20X, 40X | 284 frames at 20X magnification; 1136 frames at 40X magnification; tiff format |
| Bioimaging 2015 [ | - | 200X | 249=training, 20=testing and 16 extended test datasets; 2048*1536 pixels |
| BreCaHAD [ | 162 | - | 1360*1024 pixels; tiff format |
| Breast cancer semantic segmentation [ | 151 | - | WSI images of breast cancer |
| NuCLS [ | 151 | - | WSI images of breast cancer |
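The BreakHis counts listed above (Benign=2480, Malignant=5429) are unbalanced, which matters when a classifier is trained on them. As a minimal sketch of the arithmetic, the snippet below computes the class fractions and a set of inverse-frequency weights. The weighting scheme (`total / (num_classes * class_count)`) is an illustrative assumption, not a method taken from the review.

```python
# Class counts as listed for BreakHis in the dataset table.
counts = {"benign": 2480, "malignant": 5429}

total = sum(counts.values())  # 2480 + 5429 = 7909 images

# Fraction of the dataset that falls in each class.
fractions = {name: n / total for name, n in counts.items()}

# Inverse-frequency weights: total / (num_classes * class_count).
# Assumption for illustration only; the rarer class gets the larger weight.
weights = {name: total / (len(counts) * n) for name, n in counts.items()}

for name in counts:
    print(f"{name}: fraction={fractions[name]:.3f}, weight={weights[name]:.3f}")
```

With these weights, each class contributes the same total weight (half of the dataset size), which is one common way to offset the roughly 1:2 benign-to-malignant ratio.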
Fig. 6 Sample images to demonstrate the colour shade and illumination variations
Fig. 7 An example of (a) normal nuclei, (b) prominent nucleoli, (c) hyperchromatic nuclei, (d) cancerous nuclei, (e) mitotic nuclei, (f) lymphocyte, (g) clustered nuclei and (h) overlapping nuclei
Summary of the approaches used for ROI segmentation in BCHI
| Segmentation method (general category) | Segmentation method (specific technique) | Region of interest | Dataset | Performance |
|---|---|---|---|---|
| Threshold-based method | Adaptive thresholding [ | Cancer nuclei | 24 H&E images | NA |
| Region-based method | Marker-controlled watershed-based [ | Nuclei | 39 images | PP=90%, Sen=83%, DC=0.9 |
| | Watershed-based [ | Nuclei | 26 cells | F1-score=0.93 |
| | Graph-based clustering [ | Epithelial areas in WSIs | 75=benign | NA |
| | Density-based spatial clustering [ | Neoplastic epithelium | 75=DCIS | F1-score=0.88 |
| | K-means clustering [ | Nuclei | 100 H&E images | Mean Jaccard index=0.84, Accuracy=85% |
| | K-means clustering [ | Tubule | 10 H&E WSIs, 29 H&E images | Accuracy=90% |
| Fusion-based method | Gradient driven voting mechanism + Markov Random Field loop backpropagation [ | Nuclei | 8 H&E WSI | Precision=93%, Recall=96%, DC=0.9 |
| | Wavelet decomposition + multi-scale region-growing [ | Nuclei | 32=Normal cell, 22=Cancer cell | Accuracy=91% |
| | Expectation–maximization (EM) driven geodesic active contour + overlap resolution [ | Lymphocytes | 100 images | Sen=86% |
| | Clustering + watershed-based [ | Nuclei | 149 cells | Accuracy=87% |
| | AdaBoost + active contour [ | Nuclei | NA | Accuracy=95% |
| | Adaptive thresholding + Clustering [ | Nuclei | 24 H&E images | NA |
| DL based | DNN=Pang Net, Fully Convolutional Net, Decon Net [ | Nuclei | 2754 annotated nuclei | Accuracy=95%, Recall=90%, IU=81%, Precision=86%, F1-score=80% |
| | Stacked Sparse Autoencoder [ | Nuclei | 3500 nuclei from 500 images | F1-score=84%, Precision-Recall Curve=78% |
| | Encoder and decoder model [ | Tissue labels | 240 biopsy images | Accuracy=93% |
| | Mask R-CNN [ | Nuclei | 33 images of 512X512 | Precision=91%, F1-score=0.86 |
| | Bending loss regularization network [ | Nuclei | 21000 nuclei (4 breasts) | DC=0.81 |
| | DCNN + Encoder and decoder [ | Tissues | 12 breast cancer WSI | FWIoU=95% |
| | CNN [ | IDC | 162 WSI slides | F-score=71%, Accuracy=84% |
| | CNN + Active contour + Adaptive ellipse fitting [ | Nuclei | 204 WSIs | F1-score=80-85%, AveP=74-82% |
| | Residual-inception-channel attention U-net [ | Nuclei | TCGA dataset | F1-score=0.82 |
| | Atrous spatial pyramid pooling U-net [ | Nuclei | NA | NA |
| | Conditional Generative adversarial network [ | Nuclei | NA | F1-score=0.86 |
| | Transfer learning based deep CNN [ | Mitosis cell | NA | F1-score=73%, Precision_recall=76% |
| | DCNN [ | Mitosis cell | 920 mitosis cells | Precision=0.84%, Recall=0.83, F1-score=85.05 |
| Others | Level set information [ | Nuclei | 18=Benign, 36=Malignant | Accuracy=81% |
| | Hybrid level set information [ | Nuclei | 4000 Nuclei | NA |
| | Color-based [ | NA | TCGD dataset | Accuracy=85% |
NA=Not available; PP=Positive Predictive; Sen=Sensitivity; DC=Dice Coefficient; FWIoU=Frequency Weighted Intersection over Union; AveP=Area under Precision-Recall curve
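Several of the overlap measures abbreviated above (Dice coefficient, Jaccard index, FWIoU) are computed directly from a predicted mask and a ground-truth mask. The following is a minimal sketch using their standard definitions on flat Python lists; the function names and the toy masks are hypothetical and are not taken from any of the cited works.

```python
def dice_and_jaccard(pred, truth):
    """Dice coefficient and Jaccard index for two flat binary masks (0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + truth_area)
    jaccard = intersection / union
    return dice, jaccard


def frequency_weighted_iou(pred, truth):
    """FWIoU for flat label maps: per-class IoU weighted by ground-truth frequency."""
    if len(pred) != len(truth):
        raise ValueError("label maps must have the same number of pixels")
    total = len(truth)
    score = 0.0
    for c in set(truth):  # only classes present in the ground truth
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        freq = sum(1 for t in truth if t == c) / total
        score += freq * inter / union
    return score


# Toy 6-pixel masks (hypothetical values): 1 = nucleus, 0 = background.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]

dice, jaccard = dice_and_jaccard(pred_mask, true_mask)
fwiou = frequency_weighted_iou(pred_mask, true_mask)
print(f"Dice={dice:.3f}, Jaccard={jaccard:.3f}, FWIoU={fwiou:.3f}")
```

Because Dice and Jaccard are related by DC = 2J / (1 + J), values reported with one measure can be converted to the other when comparing rows of the table, provided both were computed on the same masks.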
Fig. 8 Illustration of segmentation methods used in literature
Fig. 9 Illustration of various classifiers used in the literature
Summary of the state-of-the-art approaches
| Pre-processing | Segmentation | Feature extraction | Classifier | Performance | Reference |
|---|---|---|---|---|---|
| NP | NP | Curvelet, LBP | SVM, Random forest, Decision tree, Polynomial classifiers | Acc=91% (Polynomial classifier) | [ |
| Color deconvolution | NP | LBP | Random Decision Tree | Acc=84% | [ |
| Macenko, Nonlinear transformation | Thresholding | Color, texture, shape | SVM | F-score=88% | [ |
| Nonlinear mapping | Hybrid active contour | Pixel, object, semantic level | SVM | Acc=92% | [ |
| Macenko | NP | Color, shape, nuclear density | CNN, SVM | Sen=95% | [ |
| Macenko | NP | CNN | FCN | Acc=87% | [ |
| Gaussian Blur Filters | K-means, Watershed | Morphology, Geometric | Rule-based, Decision Tree | Acc=70-86% | [ |
| Macenko | NP | VGG16 | FCN | Acc=94-97% | [ |
| Color deconvolution | NP | VGGNet | Random forest, FCN | Sen=90%, Pre=87%, F1-score=88% | [ |
| Macenko | NP | Inception network | Gradient Boosting Tree | Acc=91-95% [BreakHis] | [ |
| Quantile normalization | Hybrid level set | CNN | SVM | Acc=90% | [ |
| Macenko | NP | GoogleNet, VGGNet, ResNet | FCN | Acc=97% | [ |
| Image rescaling | NP | VGG16, VGG19, Xception, ResNet50 | SVM, Logistic regression | Acc=83-93% | [ |
| Macenko | Laplacian of Gaussian | AlexNet, ResNet-18, ResNet50, ResNet-101, GoogleNet | SVM | Acc=96%, Sen=97% | [ |
| Color enhancement | NP | ResNet-50, DenseNet-121, ML-InceptionV3, ML-VGG16 | E-SVM | Acc=97% | [ |
| NP | NP | ResNet50, DenseNet-161 | FCN | Acc=91% | [ |
NP=Not performed; Acc=Accuracy; Sen=Sensitivity; Pre=Precision
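The classification measures abbreviated above (Acc, Sen, Pre, F1-score) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and do not reproduce any result from the table.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall), precision and F1 from confusion-matrix counts."""

    def safe_div(num, den):
        # Return 0.0 when a measure is undefined (zero denominator).
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    sensitivity = safe_div(tp, tp + fn)  # fraction of malignant cases detected
    precision = safe_div(tp, tp + fp)    # fraction of malignant calls that are correct
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {"Acc": accuracy, "Sen": sensitivity, "Pre": precision, "F1": f1}


# Hypothetical counts for a malignant-vs-benign classifier on 200 images.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}={value:.3f}")
```

Accuracy alone can look favourable on an unbalanced test set, which is one reason the table's rows that report sensitivity or F1-score alongside accuracy are easier to compare.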
Fig. 10 Illustration of the database used in the literature
Fig. 11 Distribution of image samples for different categories of diseases in BreakHis dataset
Fig. 12 Distribution of image samples for benign and malignant cases in BreakHis dataset
Fig. 13 Description of various CNN architectures used for binary and multi-class classification