Virginie Uhlmann, Shantanu Singh, Anne E Carpenter.
Abstract
BACKGROUND: Automated classification using machine learning often relies on features derived from segmenting individual objects, which can be difficult to automate. WND-CHARM is a previously developed classification algorithm in which features are computed on the whole image, thereby avoiding the need for segmentation. The algorithm obtained encouraging results but requires considerable computational expertise to execute. Furthermore, some benchmark sets have been shown to be subject to confounding artifacts that overestimate classification accuracy.
Year: 2016 PMID: 26817459 PMCID: PMC4729047 DOI: 10.1186/s12859-016-0895-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Comparison between WND-CHARM [10] and the CP-CHARM algorithm presented in this paper. The overall construction of the algorithm is retained, but individual operations have been modified.
Composition of the CHARM-inspired CellProfiler feature vector. The groups and levels construction of CHARM was recreated, although the final measurement set is not identical
| Feature group | Feature family (count) | Measurements |
|---|---|---|
| High-contrast features | Edge statistics (4) | Mean; max; variance; number of edge pixels |
| High-contrast features | Gabor features ∗ (2) | Gabor features computed at four angles |
| High-contrast features | Image statistics (15) | Max ∗; mean ∗; percent minimal ∗; percent maximal ∗; variance ∗; total intensity ∗; mean intensity after thresholding; variance on thresholded image; number of pixels above threshold |
| Polynomial decompositions | Chebyshev statistics (32) | 32-bins histogram of the 400 coefficients of the Chebyshev transform of the image |
| Polynomial decompositions | Chebyshev-Fourier statistics (32) | Modulus of the complex coefficients of the Fourier transform of the Chebyshev transform of the image |
| Pixel statistics | Moments ∗ (4) | Mean (1st); variance (2nd); skewness (3rd); kurtosis (4th) |
| Pixel statistics | Multiscale histogram ∗ (24) | 3-bins histogram; 5-bins histogram; 7-bins histogram; 9-bins histogram |
| Textures | Haralick textures ∗ (104) | Statistics based on the co-occurrence matrix of the image |
| Textures | Tamura textures ∗ (6) | Contrast; coarseness; directionality; 3-bins histogram of coarseness |
∗ denotes features extracted on higher image levels, namely on the original image, on its wavelet transform, on its Chebyshev transform, on its Fourier transform, on the wavelet transform of its Fourier transform, and on the Chebyshev transform of its Fourier transform
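As an illustration of the "Pixel statistics" group in the table above, the following is a minimal sketch of how the four moments and the 3-, 5-, 7- and 9-bins histograms could be computed on a single image level. It is not the CP-CHARM or CellProfiler implementation: the function names, the use of population (rather than sample) moments, and the normalization of each histogram to sum to 1 are all assumptions made for this example.

```python
import math


def moments(pixels):
    """Mean, variance, skewness and kurtosis of a flat list of intensities.

    Assumption: population moments (division by n), not sample moments.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    if var == 0:
        # Constant image: higher standardized moments are undefined; report 0.
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(var)
    skew = sum(((p - mean) / std) ** 3 for p in pixels) / n
    kurt = sum(((p - mean) / std) ** 4 for p in pixels) / n
    return mean, var, skew, kurt


def histogram(pixels, bins):
    """Equal-width histogram over [min, max], normalized to sum to 1."""
    lo, hi = min(pixels), max(pixels)
    width = (hi - lo) / bins
    counts = [0] * bins
    for p in pixels:
        # The maximum value falls on the upper edge, so clamp it to the last bin.
        idx = 0 if width == 0 else min(int((p - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(pixels) for c in counts]


def multiscale_histogram(pixels):
    """Concatenate the 3-, 5-, 7- and 9-bins histograms (3+5+7+9 = 24 values)."""
    features = []
    for bins in (3, 5, 7, 9):
        features.extend(histogram(pixels, bins))
    return features


def pixel_statistics(image):
    """Four moments followed by the 24 multiscale-histogram values."""
    pixels = [float(v) for row in image for v in row]
    return list(moments(pixels)) + multiscale_histogram(pixels)


# Toy 3x4 "image" used only to exercise the functions.
image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
features = pixel_statistics(image)
print(len(features))  # 4 moments + 24 histogram values
```

The counts match the table (4 moments, 24 histogram values). Because both families are starred, the same computation would be repeated on each transformed version of the image listed in the footnote.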
Fig. 2 Example images from each of the tested BBBC datasets. BBBC013: (a) positive, (b) negative; BBBC014: (c) positive, (d) negative; BBBC015: (e) positive, (f) negative; BBBC016: (g) positive, (h) negative
Fig. 3 Examples of elements of the four cell compartments classified in the HPA tissue dataset. (a) Cytoplasm, (b) nuclei, (c) connective tissue, and (d) background
Classification results on WND-CHARM’s reference datasets and the IICBU suite
| Suite | Dataset | WND-CHARM¹ median | WND-CHARM¹ std. dev. | CP-CHARM² median | CP-CHARM² std. dev. |
|---|---|---|---|---|---|
| WND-CHARM’s reference suite | AT&T | 0.97 | 0.02 | 0.98 | 5e-3 |
| WND-CHARM’s reference suite | Brodatz | 0.91 | 0.01 | 0.91 | 3e-3 |
| WND-CHARM’s reference suite | CHO ∗ | 0.93 | 0.02 | 0.99 | 3e-3 |
| WND-CHARM’s reference suite | COIL-20 | 1.00 | 1e-3 | 1.00 | 1e-3 |
| WND-CHARM’s reference suite | HeLa ∗ | 0.87 | 0.09 | 0.84 | 4e-3 |
| WND-CHARM’s reference suite | Pollen | 0.95 | 0.02 | 0.95 | 3e-3 |
| WND-CHARM’s reference suite | | 0.83 | 0.07 | 0.84 | 0.01 |
| IICBU suite | Binucleate ∗ | 1.00 | 0.01 | 0.95 | 0.02 |
| IICBU suite | Liver Aging | 0.93 | 0.03 | 0.89 | 4e-3 |
| IICBU suite | Liver Gen. AL | 0.98 | 0.01 | 0.98 | 5e-3 |
| IICBU suite | Liver Gen. CR | 0.99 | 0.01 | 0.99 | 1e-3 |
| IICBU suite | Lymphoma | 0.79 | 0.04 | 0.66 | 0.01 |
| IICBU suite | RNAi ∗ | 0.78 | 0.04 | 0.73 | 0.02 |
| IICBU suite | Terminal Bulb | 0.59 | 0.04 | 0.55 | 6e-3 |
Note: 1.0 corresponds to 100 % correct classification
Datasets marked with a star (∗) are likely to be subject to global image artifacts. See explanations in text
¹ Using 4-fold cross-validation
² Using 10-fold cross-validation
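The footnotes above refer to k-fold cross-validation, and the tables summarize accuracy as a median and a standard deviation. The sketch below shows one way such a summary could be produced. It rests on several assumptions: this section does not state whether the median is taken over folds or over repeated runs, which standard-deviation estimator was used, or which classifier was applied. The `nearest_mean_classifier` and the toy data are hypothetical stand-ins, not the method used in the paper.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, k, train_and_predict, seed=0):
    """Return one accuracy per fold; each fold is held out exactly once."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        predictions = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in held_out],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, held_out))
        accuracies.append(correct / len(held_out))
    return accuracies


def nearest_mean_classifier(train_x, train_y, test_x):
    """Hypothetical stand-in classifier on 1-D features (not from the paper)."""
    means = {}
    for label in set(train_y):
        values = [x for x, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return [min(means, key=lambda lab: abs(x - means[lab])) for x in test_x]


# Toy, well-separated two-class data used only to exercise the code.
samples = [0.1 * i for i in range(20)] + [5 + 0.1 * i for i in range(20)]
labels = ["neg"] * 20 + ["pos"] * 20

accuracies = cross_validate(samples, labels, 10, nearest_mean_classifier)
median_acc = statistics.median(accuracies)
std_acc = statistics.pstdev(accuracies)  # assumption: population std. dev.
print(median_acc, std_acc)
```

Changing the third argument of `cross_validate` from 10 to 4 gives the 4-fold variant. With fewer folds each model is trained on less data, which is one reason results obtained with different fold counts are not strictly comparable.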
CP-CHARM classification results on additional biological datasets
| Dataset | Median | Standard deviation |
|---|---|---|
| BBBC013 | 0.99 | 0.01 |
| BBBC014 | 0.84 | 0.03 |
| BBBC015 | 0.99 | 8e-3 |
| BBBC016 | 0.81 | 0.07 |
| Tissue | 0.91 | 4e-3 |
Note: 1.0 corresponds to 100 % correct classification