Henry Joutsijoki1, Markus Haponen2, Jyrki Rasku1, Katriina Aalto-Setälä3, Martti Juhola1.
Abstract
The purpose of this paper is to examine how well human induced pluripotent stem cell (hiPSC) colony images can be classified using error-correcting output codes (ECOC). Our image dataset includes hiPSC colony images from three classes (bad, semigood, and good), which makes our classification task a multiclass problem. ECOC is a general framework for modeling multiclass classification problems. We focus on four different ECOC coding designs and apply to each of them k-Nearest Neighbor (k-NN) searching, naïve Bayes, classification tree, and discriminant analysis variant classifiers. We use Scale-Invariant Feature Transform (SIFT) based features in classification. The best accuracy (62.4%) is obtained with the ternary complete ECOC coding design and a k-NN classifier (standardized Euclidean distance measure and inverse weighting). The best result is comparable with our earlier research. Quality identification of hiPSC colony images is an essential problem to be solved before hiPSCs can be used in practice on a large scale. The ECOC methods examined are promising techniques for solving this challenging problem.
Year: 2016 PMID: 27847810 PMCID: PMC5101360 DOI: 10.1155/2016/3025057
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
The coding matrix for the one-vs-all (OVA) coding design in the three-class classification problem. In the coding matrix, rows represent the codewords for each class $C_i$, $i = 1, 2, 3$. Columns represent the individual classifiers $f_i$, $i = 1, 2, 3$, and show how the classes are divided into positive and negative classes.
| | $f_1$ | $f_2$ | $f_3$ |
|---|---|---|---|
| $C_1$ | 1 | −1 | −1 |
| $C_2$ | −1 | 1 | −1 |
| $C_3$ | −1 | −1 | 1 |
The coding matrix for the one-vs-one (OVO) coding design in the three-class classification problem.
| | $f_1$ | $f_2$ | $f_3$ |
|---|---|---|---|
| $C_1$ | 1 | 1 | 0 |
| $C_2$ | −1 | 0 | 1 |
| $C_3$ | 0 | −1 | −1 |
The coding matrix for the ordinal (ORD) coding design in the three-class classification problem.
| | $f_1$ | $f_2$ |
|---|---|---|
| $C_1$ | −1 | −1 |
| $C_2$ | 1 | −1 |
| $C_3$ | 1 | 1 |
The coding matrix for the ternary complete (TER) coding design in the three-class classification problem.
| | $f_1$ | $f_2$ | $f_3$ | $f_4$ | $f_5$ | $f_6$ |
|---|---|---|---|---|---|---|
| $C_1$ | −1 | −1 | 0 | 1 | −1 | −1 |
| $C_2$ | 1 | −1 | −1 | −1 | 0 | 1 |
| $C_3$ | 0 | 1 | 1 | 1 | 1 | 1 |
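The four coding matrices can be read as lookup tables: each binary classifier outputs +1 or −1, and the predicted class is the one whose codeword (row) is closest to the vector of outputs. The sketch below is only an illustration of that idea. The decoding rule (a Hamming-style loss in which a 0 entry contributes 0.5) is an assumption, since this text does not state which decoding was used, and the names `CODING` and `decode` are hypothetical.

```python
# Coding matrices copied from the tables above. Rows are classes C1..C3,
# columns are binary classifiers; 0 means that classifier ignores the class.
CODING = {
    "OVA": [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]],
    "OVO": [[1, 1, 0], [-1, 0, 1], [0, -1, -1]],
    "ORD": [[-1, -1], [1, -1], [1, 1]],
    "TER": [[-1, -1, 0, 1, -1, -1],
            [1, -1, -1, -1, 0, 1],
            [0, 1, 1, 1, 1, 1]],
}


def decode(design, outputs):
    """Return the 0-based index of the class with the closest codeword.

    Assumed decoding rule: Hamming-style loss (1 - m * s) / 2 summed over
    the columns, where m is the matrix entry and s the classifier output.
    A zero entry therefore adds 0.5 whatever the classifier says.
    """
    matrix = CODING[design]
    if len(outputs) != len(matrix[0]):
        raise ValueError("one output per binary classifier is required")
    losses = [sum((1 - m * s) / 2 for m, s in zip(row, outputs))
              for row in matrix]
    return losses.index(min(losses))


# Outputs of the three OVO classifiers for one hypothetical image.
print(decode("OVO", [1, 1, -1]))  # 0, i.e. class C1
```

Ties are broken in favour of the lowest class index here, which is a simplification of this sketch.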
Figure 1. Example images of iPSC colonies from the classes bad, semigood, and good. Images in the first row are from the class bad, images in the second row are from the class semigood, and images in the third row are from the class good. Images are scaled to have a width and height of 1.5 in.
The results of the discriminant analysis variants and the classification tree method with different ECOC coding designs. The coding designs are abbreviated as follows: one-vs-all (OVA), one-vs-one (OVO), ordinal (ORD), and ternary complete (TER). Quadratic discriminant analysis could not be evaluated because the covariance matrix was not positive definite. True positive rates are given in parentheses next to the true positive counts, and accuracy (ACC) is given in the last column of the table.
| Method/class | Bad | Good | Semigood | ACC |
|---|---|---|---|---|
| OVA-LDA | 17 (41.5%) | 34 (45.9%) | 16 (27.6%) | 38.7% |
| OVO-LDA | 22 (53.7%) | 27 (36.5%) | 20 (34.5%) | 39.9% |
| ORD-LDA | 17 (41.5%) | 24 (32.4%) | 18 (31.0%) | 34.1% |
| TER-LDA | 16 (39.0%) | 32 (43.2%) | 20 (34.5%) | 39.3% |
| OVA-diagLinear | 19 (46.3%) | | 11 (19.0%) | 51.4% |
| OVO-diagLinear | 16 (39.0%) | 58 (78.4%) | 16 (27.6%) | 52.0% |
| ORD-diagLinear | 19 (46.3%) | 39 (52.7%) | 15 (25.9%) | 42.2% |
| TER-diagLinear | 17 (41.5%) | 58 (78.4%) | 15 (25.9%) | 52.0% |
| OVA-pseudoQuadratic | | 0 (0.0%) | 0 (0.0%) | 23.7% |
| OVO-pseudoQuadratic | 9 (22.0%) | 35 (47.3%) | 28 (48.3%) | 41.6% |
| ORD-pseudoQuadratic | 41 (100.0%) | 0 (0.0%) | 0 (0.0%) | 23.7% |
| TER-pseudoQuadratic | 9 (22.0%) | 31 (41.9%) | | 41.0% |
| OVA-classification tree | 17 (41.5%) | 39 (52.7%) | 15 (25.9%) | 41.0% |
| OVO-classification tree | 19 (46.3%) | 50 (67.6%) | 30 (51.7%) | |
| ORD-classification tree | 13 (31.7%) | 48 (64.9%) | 17 (29.3%) | 45.1% |
| TER-classification tree | 16 (39.0%) | 48 (64.9%) | 23 (39.7%) | 50.3% |
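The percentages in the table follow from the true positive counts. A minimal sketch of the arithmetic is given below, checked against the OVA-LDA row. The class sizes (41 bad, 74 good, 58 semigood) are inferred from the counts and percentages in the table, for example 41 (100.0%) for the class bad; they are not stated explicitly in this text, and the function names are hypothetical.

```python
# Class sizes inferred from the table (e.g. "41 (100.0%)" for class bad,
# 34 / 0.459 ~ 74 for good, 16 / 0.276 ~ 58 for semigood). Assumption.
CLASS_SIZES = {"bad": 41, "good": 74, "semigood": 58}


def true_positive_rates(tp):
    """Per-class true positive rate in percent, rounded to one decimal."""
    return {c: round(100 * tp[c] / CLASS_SIZES[c], 1) for c in CLASS_SIZES}


def accuracy(tp):
    """Overall accuracy in percent: correctly classified images / all images."""
    return round(100 * sum(tp.values()) / sum(CLASS_SIZES.values()), 1)


# True positive counts from the OVA-LDA row of the table.
ova_lda = {"bad": 17, "good": 34, "semigood": 16}
print(true_positive_rates(ova_lda))  # {'bad': 41.5, 'good': 45.9, 'semigood': 27.6}
print(accuracy(ova_lda))             # 38.7
```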
The results of the naïve Bayes variants with different ECOC coding designs. The coding designs are abbreviated as follows: one-vs-all (OVA), one-vs-one (OVO), ordinal (ORD), and ternary complete (TER). True positive rates are given in parentheses next to the true positive counts, and accuracy (ACC) is given in the last column of the table.
| Method/class | Bad | Good | Semigood | ACC |
|---|---|---|---|---|
| OVA-naïve Bayes (normal distribution assumption) | | 62 (83.8%) | 14 (24.1%) | |
| OVO-naïve Bayes (normal distribution assumption) | 16 (39.0%) | 61 (82.4%) | 14 (24.1%) | 52.6% |
| ORD-naïve Bayes (normal distribution assumption) | 16 (39.0%) | 49 (66.2%) | | 48.6% |
| TER-naïve Bayes (normal distribution assumption) | 16 (39.0%) | 61 (82.4%) | 14 (24.1%) | 52.6% |
| OVA-naïve Bayes (kernel smoothing density estimation and triangle kernel) | 13 (31.7%) | 59 (79.7%) | 9 (15.5%) | 46.8% |
| OVO-naïve Bayes (kernel smoothing density estimation and triangle kernel) | 13 (31.7%) | 59 (79.7%) | 12 (20.7%) | 48.6% |
| ORD-naïve Bayes (kernel smoothing density estimation and triangle kernel) | 13 (31.7%) | 58 (78.4%) | 10 (17.2%) | 46.8% |
| TER-naïve Bayes (kernel smoothing density estimation and triangle kernel) | 13 (31.7%) | 59 (79.7%) | 12 (20.7%) | 48.6% |
| OVA-naïve Bayes (kernel smoothing density estimation and Epanechnikov kernel) | 15 (36.6%) | 60 (81.1%) | 10 (17.2%) | 49.1% |
| OVO-naïve Bayes (kernel smoothing density estimation and Epanechnikov kernel) | 13 (31.7%) | 60 (81.1%) | 11 (19.0%) | 48.6% |
| ORD-naïve Bayes (kernel smoothing density estimation and Epanechnikov kernel) | 13 (31.7%) | 59 (79.7%) | 10 (17.2%) | 47.4% |
| TER-naïve Bayes (kernel smoothing density estimation and Epanechnikov kernel) | 15 (36.6%) | 59 (79.7%) | 11 (19.0%) | 49.1% |
| OVA-naïve Bayes (kernel smoothing density estimation and box kernel) | 13 (31.7%) | 64 (86.5%) | 9 (15.5%) | 49.7% |
| OVO-naïve Bayes (kernel smoothing density estimation and box kernel) | 12 (29.3%) | | 11 (19.0%) | 50.9% |
| ORD-naïve Bayes (kernel smoothing density estimation and box kernel) | 13 (31.7%) | 63 (85.1%) | 9 (15.5%) | 49.1% |
| TER-naïve Bayes (kernel smoothing density estimation and box kernel) | 13 (31.7%) | 64 (86.5%) | 10 (17.2%) | 50.3% |
| OVA-naïve Bayes (kernel smoothing density estimation and Gaussian kernel) | 13 (31.7%) | 64 (86.5%) | 9 (15.5%) | 49.7% |
| OVO-naïve Bayes (kernel smoothing density estimation and Gaussian kernel) | 18 (43.9%) | 59 (79.7%) | 14 (24.1%) | 52.6% |
| ORD-naïve Bayes (kernel smoothing density estimation and Gaussian kernel) | 13 (31.7%) | 63 (85.1%) | 9 (15.5%) | 49.1% |
| TER-naïve Bayes (kernel smoothing density estimation and Gaussian kernel) | 13 (31.7%) | 64 (86.5%) | 10 (17.2%) | 50.3% |
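The naïve Bayes variants in the table differ in how the per-feature class-conditional density is estimated. As a rough illustration, the four kernels named in the rows can be written in their common textbook form and plugged into a one-dimensional kernel density estimate. This is a sketch under assumptions: the software used in the study may scale its kernels differently, the bandwidth value is hypothetical, and the names `KERNELS` and `kde` are not from the source.

```python
import math

# Textbook kernel definitions on the standardized distance u. The exact
# scaling used in the study is not given in this text (assumption).
KERNELS = {
    "box": lambda u: 0.5 if abs(u) <= 1 else 0.0,
    "triangle": lambda u: max(0.0, 1 - abs(u)),
    "epanechnikov": lambda u: 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0,
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi),
}


def kde(x, samples, kernel="gaussian", bandwidth=1.0):
    """Kernel density estimate at x from one feature's training samples.

    The bandwidth is a hypothetical smoothing parameter chosen by the caller.
    """
    k = KERNELS[kernel]
    total = sum(k((x - s) / bandwidth) for s in samples)
    return total / (len(samples) * bandwidth)


# Density of one feature at 0.2 for a hypothetical class with three samples.
samples = [0.0, 0.5, 1.0]
for name in KERNELS:
    print(name, kde(0.2, samples, kernel=name, bandwidth=1.0))
```

A naïve Bayes classifier would multiply such per-feature densities with the class prior and pick the class with the largest product.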
The results of the k-Nearest Neighbors searching method variants with the one-vs-all coding design. True positive rates are given in parentheses next to the true positive counts, and accuracy (ACC) is given in the last column of the table.
| Method/class | Bad | Good | Semigood | ACC |
|---|---|---|---|---|
| Chebyshev measure and equal weights | 27 (65.9%) | 45 (60.8%) | 22 (37.9%) | 54.3% |
| Chebyshev measure and inverse weights | 15 (36.6%) | 44 (59.5%) | 21 (36.2%) | 46.2% |
| Chebyshev measure and inverse squared weights | 16 (39.0%) | 55 (74.3%) | 23 (39.7%) | 54.3% |
| Cityblock measure and equal weighting | | 51 (68.9%) | 23 (39.7%) | 59.0% |
| Cityblock measure and inverse weighting | 25 (61.0%) | 52 (70.3%) | | 60.1% |
| Cityblock measure and squared inverse weighting | 24 (58.5%) | 50 (67.6%) | | 58.4% |
| Correlation measure and equal weighting | 19 (46.3%) | 59 (79.7%) | 18 (31.0%) | 55.5% |
| Correlation measure and inverse weighting | 16 (39.0%) | 54 (73.0%) | 19 (32.8%) | 51.4% |
| Correlation measure and squared inverse weighting | 19 (46.3%) | 58 (78.4%) | 21 (36.2%) | 56.6% |
| Cosine measure and equal weighting | 24 (58.5%) | 52 (70.3%) | 19 (32.8%) | 54.9% |
| Cosine measure and inverse weighting | 20 (48.8%) | 59 (79.7%) | 20 (34.5%) | 57.2% |
| Cosine measure and squared inverse weighting | 20 (48.8%) | | 21 (36.2%) | 59.5% |
| Euclidean measure and equal weighting | 25 (61.0%) | 51 (68.9%) | 23 (39.7%) | 57.2% |
| Euclidean measure and inverse weighting | 24 (58.5%) | 49 (66.2%) | 25 (43.1%) | 56.6% |
| Euclidean measure and squared inverse weighting | 21 (51.2%) | 48 (64.9%) | 24 (41.4%) | 53.8% |
| Standardized Euclidean measure and equal weighting | | 54 (73.0%) | 25 (43.1%) | |
| Standardized Euclidean measure and inverse weighting | 25 (61.0%) | 53 (71.6%) | | 60.7% |
| Standardized Euclidean measure and squared inverse weighting | 20 (48.8%) | 46 (62.2%) | 26 (44.8%) | 53.2% |
| Spearman measure and equal weighting | 15 (36.6%) | 50 (67.6%) | 16 (27.6%) | 46.8% |
| Spearman measure and inverse weighting | 16 (39.0%) | 61 (82.4%) | 17 (29.3%) | 54.3% |
| Spearman measure and squared inverse weighting | 18 (43.9%) | 59 (79.7%) | 19 (32.8%) | 55.5% |
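Each row of the k-NN tables combines a distance measure with a weighting scheme for the neighbors' votes. The sketch below illustrates one such combination, the standardized Euclidean measure with equal, inverse, or squared inverse weighting, as a plain multiclass k-NN vote. It is an illustration only: in the study the k-NN classifiers act as binary learners inside an ECOC design, and the value of `k`, the small `eps` guard against division by zero, and all function names are assumptions.

```python
import math


def seuclidean(a, b, std):
    """Euclidean distance after scaling each coordinate by that feature's
    standard deviation."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, std)))


def knn_predict(query, train, labels, std, k=3, weighting="inverse", eps=1e-12):
    """Weighted k-NN vote; weighting is 'equal', 'inverse' or 'squared_inverse'."""
    nearest = sorted(
        (seuclidean(query, x, std), y) for x, y in zip(train, labels)
    )[:k]
    votes = {}
    for dist, label in nearest:
        if weighting == "equal":
            weight = 1.0
        elif weighting == "inverse":
            weight = 1.0 / (dist + eps)
        elif weighting == "squared_inverse":
            weight = 1.0 / (dist * dist + eps)
        else:
            raise ValueError("unknown weighting: " + weighting)
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)


# Toy one-dimensional data: the closest neighbor is 'bad', the next two 'good'.
train = [[0.0], [3.0], [4.0]]
labels = ["bad", "good", "good"]
print(knn_predict([1.0], train, labels, std=[1.0], weighting="equal"))
print(knn_predict([1.0], train, labels, std=[1.0], weighting="inverse"))
```

In this toy case the equal-weight vote follows the majority of the three neighbors, while inverse weighting lets the single closest neighbor outweigh the two more distant ones.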
The results of the k-Nearest Neighbors searching method variants with the one-vs-one coding design. True positive rates are given in parentheses next to the true positive counts, and accuracy (ACC) is given in the last column of the table.
| Method/class | Bad | Good | Semigood | ACC |
|---|---|---|---|---|
| Chebyshev measure and equal weights | 23 (56.1%) | 46 (62.2%) | 21 (36.2%) | 52.0% |
| Chebyshev measure and inverse weights | 23 (56.1%) | 46 (62.2%) | 21 (36.2%) | 52.0% |
| Chebyshev measure and inverse squared weights | 18 (43.9%) | 47 (63.5%) | 21 (36.2%) | 49.7% |
| Cityblock measure and equal weighting | | 54 (73.0%) | 24 (41.4%) | 59.0% |
| Cityblock measure and inverse weighting | | 54 (73.0%) | 23 (39.7%) | 58.4% |
| Cityblock measure and squared inverse weighting | 21 (51.2%) | 52 (70.3%) | 20 (34.5%) | 53.8% |
| Correlation measure and equal weighting | 17 (41.5%) | 54 (73.0%) | 18 (31.0%) | 51.4% |
| Correlation measure and inverse weighting | 15 (36.6%) | 56 (75.7%) | 16 (27.6%) | 50.3% |
| Correlation measure and squared inverse weighting | 20 (48.8%) | 60 (81.1%) | 22 (37.9%) | 59.0% |
| Cosine measure and equal weighting | 17 (41.5%) | 56 (75.7%) | 15 (25.9%) | 50.9% |
| Cosine measure and inverse weighting | 18 (43.9%) | | 18 (31.0%) | 57.8% |
| Cosine measure and squared inverse weighting | 17 (41.5%) | 60 (81.1%) | 21 (36.2%) | 56.6% |
| Euclidean measure and equal weighting | 23 (56.1%) | 52 (70.3%) | 24 (41.4%) | 57.2% |
| Euclidean measure and inverse weighting | 23 (56.1%) | 52 (70.3%) | 24 (41.4%) | 57.2% |
| Euclidean measure and squared inverse weighting | | 51 (68.9%) | 28 (48.3%) | 59.5% |
| Standardized Euclidean measure and equal weighting | 23 (56.1%) | 58 (78.4%) | 23 (39.7%) | 60.1% |
| Standardized Euclidean measure and inverse weighting | 23 (56.1%) | 58 (78.4%) | 23 (39.7%) | 60.1% |
| Standardized Euclidean measure and squared inverse weighting | | 52 (70.3%) | | |
| Spearman measure and equal weighting | 16 (39.0%) | 59 (79.7%) | 17 (29.3%) | 53.2% |
| Spearman measure and inverse weighting | 18 (43.9%) | 61 (82.4%) | 19 (32.8%) | 56.6% |
| Spearman measure and squared inverse weighting | 18 (43.9%) | 62 (83.8%) | 21 (36.2%) | 58.4% |
The results of the k-Nearest Neighbors searching method variants with the ordinal coding design. True positive rates are given in parentheses next to the true positive counts, and accuracy (ACC) is given in the last column of the table.
| Method/class | Bad | Good | Semigood | ACC |
|---|---|---|---|---|
| Chebyshev measure and equal weights | 21 (51.2%) | 56 (75.7%) | 25 (43.1%) | 59.0% |
| Chebyshev measure and inverse weights | 21 (51.2%) | 56 (75.7%) | 25 (43.1%) | 59.0% |
| Chebyshev measure and squared inverse weights | 21 (51.2%) | 55 (74.3%) | 19 (32.8%) | 54.9% |
| Cityblock measure and equal weighting | | 58 (78.4%) | 23 (39.7%) | 60.1% |
| Cityblock measure and inverse weighting | | 58 (78.4%) | 23 (39.7%) | 60.1% |
| Cityblock measure and squared inverse weighting | 23 (56.1%) | 56 (75.7%) | 24 (41.4%) | 59.5% |
| Correlation measure and equal weighting | 16 (39.0%) | 45 (60.8%) | 15 (25.9%) | 43.9% |
| Correlation measure and inverse weighting | 16 (39.0%) | 45 (60.8%) | 15 (25.9%) | 43.9% |
| Correlation measure and squared inverse weighting | 16 (39.0%) | 46 (62.2%) | 16 (27.6%) | 45.1% |
| Cosine measure and equal weighting | 14 (34.1%) | 60 (81.1%) | 15 (25.9%) | 51.4% |
| Cosine measure and inverse weighting | 16 (39.0%) | 63 (85.1%) | 16 (27.6%) | 54.9% |
| Cosine measure and squared inverse weighting | 16 (39.0%) | 55 (74.3%) | 16 (27.6%) | 50.3% |
| Euclidean measure and equal weighting | 16 (39.0%) | 55 (74.3%) | 19 (32.8%) | 52.0% |
| Euclidean measure and inverse weighting | 16 (39.0%) | 54 (73.0%) | 20 (34.5%) | 52.0% |
| Euclidean measure and squared inverse weighting | 18 (43.9%) | 54 (73.0%) | 18 (31.0%) | 52.0% |
| Standardized Euclidean measure and equal weighting | 21 (51.2%) | | 22 (37.9%) | |
| Standardized Euclidean measure and inverse weighting | 21 (51.2%) | 63 (85.1%) | 22 (37.9%) | 61.3% |
| Standardized Euclidean measure and squared inverse weighting | 22 (53.7%) | 55 (74.3%) | | 59.5% |
| Spearman measure and equal weighting | 14 (34.1%) | 57 (77.0%) | 14 (24.1%) | 49.1% |
| Spearman measure and inverse weighting | 13 (31.7%) | 63 (85.1%) | 14 (24.1%) | 52.0% |
| Spearman measure and squared inverse weighting | 13 (31.7%) | 62 (83.8%) | 14 (24.1%) | 51.4% |
The results of the k-Nearest Neighbors searching method variants with the ternary complete coding design. True positive rates are given in parentheses next to the true positive counts, and accuracy (ACC) is given in the last column of the table.
| Method/class | Bad | Good | Semigood | ACC |
|---|---|---|---|---|
| Chebyshev measure and equal weights | 24 (58.5%) | 48 (64.9%) | 26 (44.8%) | 56.6% |
| Chebyshev measure and inverse weights | 22 (53.7%) | 46 (62.2%) | 24 (41.4%) | 53.2% |
| Chebyshev measure and squared inverse weights | 16 (39.0%) | 54 (73.0%) | 22 (37.9%) | 53.2% |
| Cityblock measure and equal weighting | 24 (58.5%) | 54 (73.0%) | 24 (41.4%) | 59.0% |
| Cityblock measure and inverse weighting | 24 (58.5%) | 54 (73.0%) | 25 (43.1%) | 59.5% |
| Cityblock measure and squared inverse weighting | 24 (58.5%) | 53 (71.6%) | 22 (37.9%) | 57.2% |
| Correlation measure and equal weighting | 17 (41.5%) | 56 (75.7%) | 19 (32.8%) | 53.2% |
| Correlation measure and inverse weighting | 17 (41.5%) | 59 (79.7%) | 22 (37.9%) | 56.6% |
| Correlation measure and squared inverse weighting | 19 (46.3%) | 57 (77.0%) | 22 (37.9%) | 56.6% |
| Cosine measure and equal weighting | 17 (41.5%) | 59 (79.7%) | 17 (29.3%) | 53.8% |
| Cosine measure and inverse weighting | 14 (34.1%) | 56 (75.7%) | 17 (29.3%) | 50.3% |
| Cosine measure and squared inverse weighting | 19 (46.3%) | | 21 (36.2%) | 58.4% |
| Euclidean measure and equal weighting | 23 (56.1%) | 52 (70.3%) | 22 (37.9%) | 56.1% |
| Euclidean measure and inverse weighting | 24 (58.5%) | 51 (68.9%) | | 59.5% |
| Euclidean measure and squared inverse weighting | 22 (53.7%) | 50 (67.6%) | 25 (43.1%) | 56.1% |
| Standardized Euclidean measure and equal weighting | | 58 (78.4%) | 24 (41.4%) | 61.8% |
| Standardized Euclidean measure and inverse weighting | | 58 (78.4%) | 25 (43.1%) | 62.4% |
| Standardized Euclidean measure and squared inverse weighting | 24 (58.5%) | 52 (70.3%) | 27 (46.6%) | 59.5% |
| Spearman measure and equal weighting | 14 (34.1%) | 56 (75.7%) | 16 (27.6%) | 49.7% |
| Spearman measure and inverse weighting | 18 (43.9%) | | 16 (27.6%) | 54.9% |
| Spearman measure and squared inverse weighting | 19 (46.3%) | | 20 (34.5%) | 57.8% |