M Milagro Fernández-Carrobles, Gloria Bueno, Oscar Déniz, Jesús Salido, Marcial García-Rojo, Lucía González-López.
Abstract
Breast cancer is still diagnosed by observing biopsies under the microscope. Automated methods for classifying breast tissue microarrays (TMAs) would reduce diagnostic time. This paper is a step towards that goal and presents a complete study of breast TMA classification based on colour models and texture descriptors. The TMA images were divided into four classes: i) benign stromal tissue with cellularity, ii) adipose tissue, iii) benign and benign anomalous structures, and iv) ductal and lobular carcinomas. Features were extracted on eight colour models using first and second order Haralick statistical descriptors obtained from the intensity image, together with Fourier, Wavelets, Multiresolution Gabor, M-LBP and textons descriptors. Four types of classification experiments were performed using six different classifiers: (1) classification per colour model individually, (2) classification by combination of colour models, (3) classification by combination of colour models and descriptors, and (4) classification by combination of colour models and descriptors after a feature set reduction. The best result is an average accuracy of 99.05% and an average positive predictive value of 98.34%, obtained with a bagging tree classifier, a combination of six colour models, and 1719 non-correlated textural features (correlation threshold of 97%) based on Statistical, M-LBP, Gabor and Spatial textons descriptors.
Year: 2015 PMID: 26513238 PMCID: PMC4626403 DOI: 10.1371/journal.pone.0141556
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Breast TMA.
Fig 2. Benign structures and benign anomalous structures in TMA images stained with HE.
A) Terminal ducts and lobules, B) Sclerosing lesions (radial scar), C) Adenosis lesions, D) Fibroadenomas, E) Tubular adenomas, F) Phyllodes tumors, G) Columnar cell lesions and H) Duct ectasia.
Types of models and descriptors: morphological, textural and by colour.
| Model | Type of descriptor |
|---|---|
| Morphological | Shape, roundness, area, perimeter |
| Geometrical | Voronoi Diagrams |
| Structural | |
| Statistical | Co-occurrence matrix (GLCM)—1 |
| | GL run-length matrices (GLRLMS) |
| | Higher order moments |
| Model-Based | Markov random fields |
| | Fractals |
| Space-Frequential | Fourier |
| | Wavelets, Gabor |
| Transformed Space | Textons |
| | LBP (Local Binary Patterns) |
| | SIFT (Scale-invariant feature transform) |
| | HOG (Histogram of oriented gradients) |
| Colour | RGB, HSV, Lab, CIE-XYZ, Luv |
Classification performance of different state-of-the-art methods in digital pathology.
| Author | Feature Descriptor | Classifier | Num. Classes | Database Properties | Results |
|---|---|---|---|---|---|
| | | | | | |
| Yang [ | Textons | Adaboost | 2 | 300 WSI, 45 TMAs, Trestle MedMicro, 40x, RGB | 89.00% ACC |
| Qi [ | LBP, 2 | AdaBoost with Linear perceptron least-square | 2 | 92 TMAs, 10x, RGB, Multispectral | 88% accuracy |
| Amaral [ | Gaussian filters | NN | 4 | 344 cores, RGB | 75.00% ACC |
| Le [ | Quadrature mirror filter (QMF) | SVM | 4 | 520 cores, RGB | 80.42% ACC |
| Xing [ | Textons | Adaboost | 4 | 547 ROIs, RGB | 88.00% ACC |
| Fernández-Carrobles [ | Textons | AdaBoost, Bagging Trees | 4 | 628 ROIs, 10x, Aperio ScanScope, RGB, CMYK, HSV, Lab, Luv, SCT, Hb, Lb | 98.1% ACC |
| Proposed Method | Fourier, Wavelets, Gabor, M-LBP, Textons | Fisher, SVM, Random Forest, Bagging Trees, AdaBoost | 4 | 628 ROIs, 10x, Aperio ScanScope ALIAS II, RGB, CMYK, HSV, Lab, Luv, SCT, Hb, Lb | 99.05% ACC |
| | | | | | |
| Fuchs [ | LBP | Random Forest | 2 | 133 cores, Nanozoomer C9600 40x, RGB | 0.026 p-value |
| | | | | | |
| Ahonen [ | LBP | SVM | 2 | 1296 ROIs, Mirax Scan, 20x, RGB | 99.5% ACC |
| | | | | | |
| Niwas [ | logGabor | SVM | 2 | 610 ROIs, Aperio ScanScope, 20x, HSI | 98.6% ACC |
| Chekkoury [ | Textons | SVM | 2 | 100 ROIs, 40x, CMY | 87.00% ACC |
| Zhang [ | CLBP, 2 | SVM, Multi-Layer Perceptron | 3 | 361 ROIs, Nikon Eclipse E600, 40x, RGB | 99.25% ACC |
| Bahlmann [ | 1 | SVM | 2 | DMetrix, 40x, RGB | 98.6% ACC |
| | | | | | |
| Farjam [ | Roundness, shape, Haralick, Wavelets | Linear | 5 | 290 ROIs, RGB | 90% ACC |
| Doyle [ | Haralick, Gabor | AdaBoost Cascade | 2 | 22 ROIs (3 scales), 40x, HSV | 88% ACC |
| Doyle [ | Architectural, Morphological, Haralick, Gabor | SVM | 4 | 54 ROIs, 40x, RGB | 89.36% ACC |
| Huang [ | Multiwavelet, Gabor, 2 | Bayesian, K-NN, SVM | 5 | 205 ROIs, RGB | 94.7% ACC |
| Khurd [ | Textons | SVM | 2 | 75 ROIs, 10x, RGB | 93.70% ACC |
| Monaco [ | Area, homogeneity size | Probabilistic pairwise Markov models | 2 | 40 WSI, Aperio, 10x, Lab | 87% sensitivity |
| Xu [ | Diffeomorphic filters | SVM | 4 | 23 WSI, 105 images, Aperio, 20x, RGB, HSV | 82.5% ACC |
| DiFranco [ | 2 | Random Forest, SVM | 2 | 15 WSI, Aperio XT 40x, RGB, Lab | 95% AUC |
| Doyle [ | Haralick, Gabor | Boosted Bayesian | 2 | 100 WSI, Aperio ScanScope, 40x, HSI | 81% AUC |
| | | | | | |
| Mete [ | Clustering | SVM | 2 | 7 WSI, 20x, RGB | 96% ACC |
| | | | | | |
| Lessmann [ | colour transforms, Wavelets | Self Organizing Map | 4 | 1280 ROIs, Zeiss Axioskop 2 Plus, RGB | 79% ACC |
| | | | | | |
| Kong [ | Haralick, | KNN, SVM, Bayesian LDA | 3 | 33 WSI Aperio ScanScope T2 40x, RGB, Lab | 87.88% ACC |
| | | | | | |
| Nateghi [ | Haralick, GLRLMS, moments, CLBP, Wavelets, Gabor | SVM | 2 | 35 WSI, Aperio XT, 40x, RGB | 77.34% F-measure |
| Tashk [ | LBP | SVM | 2 | 5 WSI, Aperio XT, Hamamatsu 40x, RGB | 70% F-measure |
Fig 3. Colour models and combinations.
A) RGB, B) CMYK, C) HSV, D) Lab, E) Luv, F) SCT, G) Lbb, H) Hbb.
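As a rough illustration of what moving a pixel between colour models involves, the sketch below converts one RGB pixel to HSV with Python's standard `colorsys` module and to CMYK with the common naive formula. Both choices are assumptions made for illustration: the record does not state which conversion formulas were used, and the Hb, Lb and SCT models are not covered. The helper names `rgb_to_cmyk` and `pixel_features` are hypothetical.

```python
import colorsys


def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK conversion for channel values in [0, 1] (assumed formula)."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:
        # Pure black: the colour channels are undefined, so set them to zero.
        return (0.0, 0.0, 0.0, 1.0)
    return tuple((1.0 - ch - k) / (1.0 - k) for ch in (r, g, b)) + (k,)


def pixel_features(r, g, b):
    """Concatenate the channels of several colour models for one pixel."""
    return (r, g, b) + rgb_to_cmyk(r, g, b) + colorsys.rgb_to_hsv(r, g, b)


print(pixel_features(1.0, 0.0, 0.0))
```

Combining colour models in this way simply lengthens the per-pixel (or per-image) feature vector, which is one reason a later feature reduction step becomes attractive.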
1st order statistical descriptors.
| Statistical | Formula |
|---|---|
| Mean | |
| Mode | |
| Variance | |
| 1st quartile | |
| 2nd quartile | |
| 3rd quartile | |
| Interquartile range | 3rd quartile - 1st quartile |
| Minimum | |
| Maximum | |
| Range | |
| Entropy | |
| Asymmetry | |
| Kurtosis | |
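The first order descriptors listed above depend only on the distribution of grey levels, not on their spatial arrangement. A minimal sketch of how they might be computed from a flat list of integer grey levels follows. The definitions used here (population variance, inclusive quartiles, histogram entropy in bits, standardised third and fourth moments for asymmetry and kurtosis) are common textbook choices and are assumptions; the record does not preserve the formulas that were actually used. The function name `first_order_descriptors` is hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pvariance, quantiles


def first_order_descriptors(pixels):
    """First order statistics of a flat list of integer grey levels."""
    n = len(pixels)
    mu = mean(pixels)
    var = pvariance(pixels, mu)
    sd = math.sqrt(var)
    counts = Counter(pixels)
    q1, q2, q3 = quantiles(pixels, n=4, method="inclusive")
    # Entropy of the normalised grey-level histogram, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Standardised third and fourth central moments (0.0 for a constant patch).
    skew = sum((p - mu) ** 3 for p in pixels) / (n * sd ** 3) if sd else 0.0
    kurt = sum((p - mu) ** 4 for p in pixels) / (n * sd ** 4) if sd else 0.0
    return {
        "mean": mu,
        "mode": counts.most_common(1)[0][0],
        "variance": var,
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "iqr": q3 - q1,
        "minimum": min(pixels),
        "maximum": max(pixels),
        "range": max(pixels) - min(pixels),
        "entropy": entropy,
        "asymmetry": skew,
        "kurtosis": kurt,
    }


d = first_order_descriptors([0, 1, 1, 2, 2, 2, 3, 3, 4])
print(d["mean"], d["variance"], d["iqr"])
```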
2nd order statistical descriptors.
| Statistical | Formula |
|---|---|
| Energy | |
| Contrast | |
| Correlation | |
| Variance | |
| Sum average | |
| Sum entropy | |
| Sum variance | |
| Homogeneity 1 | |
| Entropy | |
| Difference variance | |
| Difference entropy | |
| Measure of correlation 1 | |
| Measure of correlation 2 | |
| Homogeneity 2 | |
| Cluster Shade | |
| Cluster Prominence | |
| Autocorrelation | |
| Dissimilarity | |
| Maximum probability | |
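Second order descriptors of this kind are computed from a grey-level co-occurrence matrix (GLCM), which counts how often pairs of grey levels occur at a fixed pixel offset. The sketch below builds a normalised, symmetric GLCM for one offset and evaluates a handful of the listed descriptors. The formulas are widely used definitions and are an assumption here, since the record does not preserve the originals; in particular "homogeneity" is taken as the inverse difference moment, and which of the two homogeneity variants that corresponds to is not known. The function names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def haralick_subset(p):
    """A few second order descriptors of a normalised GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "max_probability": max(v for _, _, v in cells),
    }


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
print(haralick_subset(p))
```

In practice the matrix is usually computed for several offsets and directions and the descriptors are averaged or concatenated; that step is omitted here.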
Fig 4. Process to extract the Fourier filtered images.
Fig 5. Filtered wavelet images obtained by adding the three detail images.
Fig 6. Feature extraction process in a TMA with RGB colour model.
Bagging Algorithm [40].
| Training phase |
| 1. Initialize the parameters |
| - $D = \emptyset$, the ensemble |
| - $L$, the number of classifiers to train |
| 2. For $k = 1, \ldots, L$ |
| - Take a bootstrap sample $S_k$ from the training set $Z$ |
| - Build a classifier $D_k$ using $S_k$ as the training set |
| - Add the classifier to the current ensemble, $D = D \cup D_k$ |
| 3. Return $D$. |
| Classification phase |
| 4. Run $D_1, \ldots, D_L$ on the input $x$. |
| 5. The class with the maximum number of votes is chosen as the label for $x$. |
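The bagging steps listed above can be sketched in a few lines. The example below is a minimal illustration, not the classifier used in the study: the base learner is a hypothetical one-split decision stump (`train_stump`) rather than a full decision tree, and the names `bagging_train` and `bagging_classify` and the toy data are assumptions.

```python
import random
from collections import Counter


def train_stump(samples):
    """Hypothetical base learner: best single (feature, threshold) split by training errors."""
    best = None
    for f in range(len(samples[0][0])):
        for thr in sorted({x[f] for x, _ in samples}):
            left = [y for x, y in samples if x[f] <= thr]
            right = [y for x, y in samples if x[f] > thr]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    _, f, thr, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= thr else r_lab


def bagging_train(Z, L, build=train_stump, seed=0):
    rng = random.Random(seed)
    D = []                                  # step 1: empty ensemble
    for _ in range(L):                      # step 2: train L classifiers
        S_k = [rng.choice(Z) for _ in Z]    # bootstrap sample, drawn with replacement
        D.append(build(S_k))                # build a classifier and add it to D
    return D                                # step 3


def bagging_classify(D, x):
    votes = Counter(D_k(x) for D_k in D)    # step 4: run every classifier on x
    return votes.most_common(1)[0][0]       # step 5: label with the most votes


Z = [((0.1,), "a"), ((0.2,), "a"), ((0.3,), "a"),
     ((0.7,), "b"), ((0.8,), "b"), ((0.9,), "b")]
D = bagging_train(Z, L=25)
print(bagging_classify(D, (0.0,)), bagging_classify(D, (1.0,)))
```

Because each classifier sees a different bootstrap sample, individual members can disagree; the vote in the final step is what stabilises the prediction.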
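AdaBoost Algorithm [40].
| Training phase |
| 1. Initialize the parameters |
| - Set the weights $w^1 = [w_1^1, \ldots, w_N^1]$, with $w_j^1 \in [0, 1]$ and $\sum_{j=1}^{N} w_j^1 = 1$ (usually $w_j^1 = 1/N$) |
| - Initialize the ensemble $D = \emptyset$ |
| - $L$, the number of classifiers to train |
| 2. For $k = 1, \ldots, L$ |
| - Take a sample $S_k$ from $Z$ using distribution $w^k$ |
| - Build a classifier $D_k$ using $S_k$ as the training set |
| - Calculate the weighted ensemble error at step $k$ by $\epsilon_k = \sum_{j=1}^{N} w_j^k l_k^j$ ($l_k^j = 1$ if $D_k$ misclassifies $z_j$ and $l_k^j = 0$ otherwise) |
| - If $\epsilon_k = 0$ or $\epsilon_k \geq 0.5$, ignore $D_k$, reinitialize the weights $w_j^k$ to $1/N$ and continue |
| - Else, calculate $\beta_k = \epsilon_k / (1 - \epsilon_k)$, where $\epsilon_k \in (0, 0.5)$ |
| - Update the individual weights $w_j^{k+1} = w_j^k \beta_k^{(1 - l_k^j)} / \sum_{i=1}^{N} w_i^k \beta_k^{(1 - l_k^i)}$, $j = 1, \ldots, N$ |
| 3. Return $D$ and $\beta_1, \ldots, \beta_L$. |
| Classification phase |
| 4. Calculate the support for class $\omega_t$ by $\mu_t(x) = \sum_{D_k(x) = \omega_t} \ln(1/\beta_k)$ |
| 5. The class with the maximum support is chosen as the label for $x$. |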
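A minimal sketch of the AdaBoost loop may help make the weight bookkeeping concrete. It uses the standard AdaBoost.M1 quantities: the weighted error of each classifier, the factor beta = error / (1 - error), and a vote weight of ln(1 / beta). Two things differ from the listing and are assumptions of this sketch: the base learner is a hypothetical weighted decision stump on a single feature, and it is trained directly on the weights instead of on a sample drawn according to them, which keeps the example deterministic. All function names and the toy data are illustrative.

```python
import math


def weighted_stump(Z, w):
    """Hypothetical base learner: threshold rule on a scalar x minimising weighted error."""
    labels = sorted({y for _, y in Z})
    best = None
    for thr in sorted({x for x, _ in Z}):
        rule, err = [], 0.0
        for side in (True, False):              # True: x <= thr, False: x > thr
            mass = {c: 0.0 for c in labels}
            for (x, y), wj in zip(Z, w):
                if (x <= thr) == side:
                    mass[y] += wj
            lab = max(labels, key=mass.get)      # weighted majority label on this side
            rule.append(lab)
            err += sum(m for c, m in mass.items() if c != lab)
        if best is None or err < best[0]:
            best = (err, thr, rule[0], rule[1])
    _, thr, left, right = best
    return lambda x: left if x <= thr else right


def adaboost_train(Z, L, build=weighted_stump):
    N = len(Z)
    w = [1.0 / N] * N                            # step 1: uniform weights
    D, betas = [], []                            #         empty ensemble
    for _ in range(L):                           # step 2
        D_k = build(Z, w)
        miss = [D_k(x) != y for x, y in Z]       # True where z_j is misclassified
        eps = sum(wj for wj, m in zip(w, miss) if m)
        if eps == 0 or eps >= 0.5:               # ignore D_k and reset the weights
            w = [1.0 / N] * N
            continue
        beta = eps / (1.0 - eps)
        # Correctly classified samples are scaled down by beta, then all renormalised.
        w = [wj * (1.0 if m else beta) for wj, m in zip(w, miss)]
        total = sum(w)
        w = [wj / total for wj in w]
        D.append(D_k)
        betas.append(beta)
    return D, betas                              # step 3


def adaboost_classify(D, betas, x):
    support = {}                                 # step 4: weighted support per class
    for D_k, beta in zip(D, betas):
        label = D_k(x)
        support[label] = support.get(label, 0.0) + math.log(1.0 / beta)
    return max(support, key=support.get)         # step 5


Z = [(1, "a"), (2, "a"), (3, "b"), (4, "a"), (5, "b"), (6, "b")]
D, betas = adaboost_train(Z, L=3)
print([adaboost_classify(D, betas, x) for x, _ in Z])
```

On this toy set no single threshold separates the two labels, but the weighted vote of three stumps does, because each round concentrates weight on the points the previous stump got wrong.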
Fig 7. Results obtained using colour models independently.
A) Average error of all classifiers and descriptors, B) ROC curves for classification using AdaBoost and Bagging with intensity and Hb colour model.
Fig 8. Results obtained using colour model combinations.
A) Average error of all classifiers and descriptors, B) ROC curves for classification using AdaBoost and Bagging with intensity and CMYK&Hb&Lb&HSV&Lab colour model combination.
Fig 9. ROC curves for classification using AdaBoost and Bagging.
AdaBoost results with Intensity&M-LBP&Gabor and Hb&Luv&SCT colour model combination. Bagging results with Intensity&M-LBP&Gabor&S-Textons and CMYK&Hb&Lb&HSV&Luv&SCT colour model combination.
Best classification using a Bagging classifier and a combination of CMYK&Hb&Lb&HSV&Lab colour models and Intensity&M-LBP&Gabor&S-Textons descriptors.
| Label | E1 | E2 | E3 | E4 | PPV | NPV | Sensitivity | Specificity | ACC |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 168 | 0 | 1 | 1 | 98.8 | 97.4 | 93.3 | 99 | 97.77 |
| 2 | 0 | 102 | 0 | 1 | 99 | 99.8 | 99 | 99.8 | 99.68 |
| 3 | 5 | 0 | 157 | 1 | 96.3 | 98.9 | 96.9 | 98.7 | 98 |
| 4 | 7 | 1 | 4 | 180 | 93.7 | 99.3 | 98 | 97.3 | 97.6 |
Fig 10. Results using the Bagging classifier with colour model and feature combinations after a correlation analysis with a threshold of 97%.
Where: (1) Intensity&M-LBP, (2) Intensity&S-Textons, (3) Intensity&M-LBP&Gabor, (4) Intensity&M-LBP&S-Textons, (5) Intensity&M-LBP&Gabor&S-Textons and (6) Intensity&M-LBP&Gabor&Wavelets.
The best final classification was obtained with the Bagging classifier after a correlation-based feature reduction (threshold of 97%), combining the CMYK&Hb&Lb&HSV&Luv&SCT colour models with the Intensity&M-LBP&Gabor&S-Textons descriptors.
| Label | Total | E1 | E2 | E3 | E4 | PPV | NPV | Sensitivity | Specificity | ACC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 170 | 166 | 0 | 1 | 3 | 97.65 | 98.47 | 95.9 | 99.12 | 98.25 |
| 2 | 103 | 0 | 103 | 0 | 0 | 100 | 100 | 100 | 100 | 100 |
| 3 | 163 | 1 | 0 | 162 | 0 | 99.38 | 99.57 | 98.78 | 99.78 | 99.52 |
| 4 | 192 | 6 | 0 | 1 | 185 | 96.35 | 99.31 | 98.4 | 98.41 | 98.41 |
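The per-class figures in the table above follow from the confusion counts in a one-vs-rest fashion. The sketch below recomputes them; it assumes that each row holds the samples assigned to that label and each column E1 to E4 the true class, a reading chosen because it reproduces the reported PPV column. The function names are hypothetical.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics (in %) from cm[i][j]: assigned label i, true class j."""
    n = sum(map(sum, cm))
    out = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fp = sum(cm[k]) - tp                      # rest of the row
        fn = sum(row[k] for row in cm) - tp       # rest of the column
        tn = n - tp - fp - fn
        out.append({
            "ppv": 100.0 * tp / (tp + fp),
            "npv": 100.0 * tn / (tn + fn),
            "sensitivity": 100.0 * tp / (tp + fn),
            "specificity": 100.0 * tn / (tn + fp),
            "acc": 100.0 * (tp + tn) / n,
        })
    return out


def overall_error(cm):
    """Fraction of all samples that lie off the diagonal."""
    n = sum(map(sum, cm))
    return 1.0 - sum(cm[k][k] for k in range(len(cm))) / n


cm = [
    [166, 0, 1, 3],
    [0, 103, 0, 0],
    [1, 0, 162, 0],
    [6, 0, 1, 185],
]
metrics = per_class_metrics(cm)
mean_acc = sum(m["acc"] for m in metrics) / len(metrics)
mean_ppv = sum(m["ppv"] for m in metrics) / len(metrics)
print(round(mean_acc, 2), round(mean_ppv, 2), round(overall_error(cm), 3))
```

Under this reading, the averages of the per-class accuracy and PPV agree with the 99.05% and 98.34% quoted in the abstract to within rounding, and the overall error of 12 misclassified samples out of 628 is about 0.019.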
Fig 11. ROC curves for the best result in Experiment 4 (non-correlated features at 97%).
Fig 12. Results obtained from the Bagging classifier using colour models independently.
Fig 13. Results obtained from the AdaBoost classifier using colour models independently.
Fig 14. Results obtained from the Bagging classifier using colour model combination.
Fig 15. Results obtained from the AdaBoost classifier using colour model combination.
Fig 16. Results using the Bagging classifier with a combination of colour models and descriptors.
Where: (1) Intensity&M-LBP, (2) Intensity&S-Textons, (3) Intensity&M-LBP&Gabor, (4) Intensity&M-LBP&S-Textons, (5) Intensity&M-LBP&Gabor&S-Textons and (6) Intensity&M-LBP&Gabor&Wavelets.
Fig 17. Results using the AdaBoost classifier with a combination of colour models and descriptors.
Where: (1) Intensity&M-LBP, (2) Intensity&S-Textons, (3) Intensity&M-LBP&Gabor, (4) Intensity&M-LBP&S-Textons, (5) Intensity&M-LBP&Gabor&S-Textons and (6) Intensity&M-LBP&Gabor&Wavelets.
Classification using a combination of colour models and descriptors and a 97% correlation threshold.
| Colour models | Descriptors | Bagging | AdaBoost |
|---|---|---|---|
| Hb&Luv&SCT | Intensity&M-LBP | 0.041 | 0.051 |
| | Intensity&S-Textons | 0.043 | 0.059 |
| | Intensity&M-LBP&Gabor | 0.024 | 0.033 |
| | Intensity&M-LBP&S-Textons | 0.043 | 0.056 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.028 | 0.041 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.022 | 0.048 |
| CMYK&Hb&Lb&HSV&Lab | Intensity&M-LBP | 0.048 | 0.057 |
| | Intensity&S-Textons | 0.049 | 0.06 |
| | Intensity&M-LBP&Gabor | 0.028 | 0.051 |
| | Intensity&M-LBP&S-Textons | 0.046 | 0.057 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.027 | 0.054 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.028 | 0.047 |
| CMYK&Hb&Lb&HSV&Luv&SCT | Intensity&M-LBP | 0.052 | 0.054 |
| | Intensity&S-Textons | 0.051 | 0.05 |
| | Intensity&M-LBP&Gabor | 0.023 | 0.051 |
| | Intensity&M-LBP&S-Textons | 0.05 | 0.06 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.019 | 0.05 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.033 | 0.059 |
| RGB&Hb&Lb&HSV&Luv&SCT | Intensity&M-LBP | 0.052 | 0.062 |
| | Intensity&S-Textons | 0.05 | 0.063 |
| | Intensity&M-LBP&Gabor | 0.03 | 0.056 |
| | Intensity&M-LBP&S-Textons | 0.046 | 0.063 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.028 | 0.054 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.033 | 0.054 |
| All colour models | Intensity&M-LBP | 0.048 | 0.059 |
| | Intensity&S-Textons | 0.05 | 0.054 |
| | Intensity&M-LBP&Gabor | 0.027 | 0.052 |
| | Intensity&M-LBP&S-Textons | 0.044 | 0.062 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.022 | 0.05 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.033 | 0.052 |
Classification using a combination of colour models and descriptors and a 99% correlation threshold.
| Colour models | Descriptors | Bagging | AdaBoost |
|---|---|---|---|
| Hb&Luv&SCT | Intensity&M-LBP | 0.044 | 0.048 |
| | Intensity&S-Textons | 0.043 | 0.05 |
| | Intensity&M-LBP&Gabor | 0.035 | 0.043 |
| | Intensity&M-LBP&S-Textons | 0.04 | 0.052 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.027 | 0.06 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.025 | 0.044 |
| CMYK&Hb&Lb&HSV&Lab | Intensity&M-LBP | 0.052 | 0.067 |
| | Intensity&S-Textons | 0.057 | 0.054 |
| | Intensity&M-LBP&Gabor | 0.028 | 0.05 |
| | Intensity&M-LBP&S-Textons | 0.059 | 0.063 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.033 | 0.056 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.028 | 0.048 |
| CMYK&Hb&Lb&HSV&Luv&SCT | Intensity&M-LBP | 0.048 | 0.059 |
| | Intensity&S-Textons | 0.05 | 0.065 |
| | Intensity&M-LBP&Gabor | 0.033 | 0.057 |
| | Intensity&M-LBP&S-Textons | 0.049 | 0.062 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.028 | 0.065 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.036 | 0.059 |
| RGB&Hb&Lb&HSV&Luv&SCT | Intensity&M-LBP | 0.051 | 0.067 |
| | Intensity&S-Textons | 0.056 | 0.062 |
| | Intensity&M-LBP&Gabor | 0.041 | 0.05 |
| | Intensity&M-LBP&S-Textons | 0.044 | 0.062 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.028 | 0.054 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.043 | 0.057 |
| All colour models | Intensity&M-LBP | 0.049 | 0.063 |
| | Intensity&S-Textons | 0.054 | 0.056 |
| | Intensity&M-LBP&Gabor | 0.04 | 0.052 |
| | Intensity&M-LBP&S-Textons | 0.043 | 0.067 |
| | Intensity&M-LBP&Gabor&S-Textons | 0.036 | 0.057 |
| | Intensity&M-LBP&Gabor&Wavelets | 0.042 | 0.056 |
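The two tables above compare classifiers after discarding features that are correlated above a threshold (97% or 99%). One simple way to perform such a reduction is a greedy filter that keeps a feature only if its absolute Pearson correlation with every feature already kept stays below the threshold. The sketch below illustrates that idea; the greedy order and the use of Pearson correlation are assumptions, since the record does not describe the exact procedure, and the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0


def drop_correlated(columns, threshold=0.97):
    """Return indices of features kept by a greedy correlation filter."""
    kept = []
    for idx, col in enumerate(columns):
        # Keep the feature only if it is not too correlated with any kept feature.
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(idx)
    return kept


features = [
    [1, 2, 3, 4, 5],        # feature 0
    [2, 4, 6, 8, 10.1],     # nearly a multiple of feature 0
    [5, 3, 4, 1, 2],        # correlation of -0.8 with feature 0
]
print(drop_correlated(features, threshold=0.97))
```

A higher threshold such as 0.99 removes fewer features, so the classifier sees a larger and more redundant feature set.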
Fig 18. Results using the best classifier with the best combination of colour models and descriptors with and without feature selection.
Where: (1) Intensity&M-LBP, (2) Intensity&S-Textons, (3) Intensity&M-LBP&Gabor, (4) Intensity&M-LBP&S-Textons, (5) Intensity&M-LBP&Gabor&S-Textons and (6) Intensity&M-LBP&Gabor&Wavelets. A) Bagging Tree Classifier, B) AdaBoost Classifier.