Lee K Jones, Fei Zou, Alexander Kheifets, Konstantin Rybnikov, Damon Berry, Aik Choon Tan.
Abstract
BACKGROUND: Molecular classification of tumors can be achieved by global gene expression profiling. Most machine learning classification algorithms furnish global error rates for the entire population. A few algorithms provide an estimate of the probability of malignancy for each queried patient, but the accuracy of these estimates is unknown. Local minimax learning, on the other hand, provides such probability estimates with best finite sample bounds on expected mean squared error on an individual basis for each queried patient. This allows a significant percentage of the patients to be identified as confidently predictable, a condition that ensures that the machine learning algorithm possesses an error rate below the tolerable level when applied to the confidently predictable patients.
Year: 2011 PMID: 21261972 PMCID: PMC3038886 DOI: 10.1186/1755-8794-4-10
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Kernel method with sigma = 0.5 and 0.7, threshold p = 0.35, and adjustment e = 0.00 and 0.05.
| Data set | Sigma | Adjustment (e) | CP (%) | Error in CP (%) | Total Error (%) |
|---|---|---|---|---|---|
| | 0.5 | 0 | 36 (50.0%) | 0 (0%) | 3 (4.17%) |
| | 0.5 | 0.05 | 25 (34.7%) | 0 (0%) | 3 (4.17%) |
| | 0.7 | 0 | 43 (59.7%) | 0 (0%) | 3 (4.17%) |
| | 0.7 | 0.05 | 36 (50.0%) | 0 (0%) | 3 (4.17%) |
| | 0.5 | 0 | 32 (36.4%) | 4 (12.5%) | 21 (23.9%) |
| | 0.5 | 0.05 | 30 (34.1%) | 3 (10.0%) | 21 (23.9%) |
| | 0.7 | 0 | 34 (38.6%) | 4 (11.8%) | 22 (25.0%) |
| | 0.7 | 0.05 | 25 (28.4%) | 2 (8.00%) | 22 (25.0%) |
| | 0.5 | 0 | 154 (55.0%) | 6 (3.90%) | 39 (13.9%) |
| | 0.5 | 0.05 | 134 (47.9%) | 3 (2.24%) | 39 (13.9%) |
| | 0.7 | 0 | 160 (57.1%) | 6 (3.75%) | 42 (15.0%) |
| | 0.7 | 0.05 | 134 (47.9%) | 5 (3.73%) | 42 (15.0%) |
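The percentage columns follow from the raw counts: CP (%) is the CP count over all patients, Error in CP (%) is the number of errors among CP patients over the CP count, and Total Error (%) is all errors over all patients. A minimal Python sketch of that arithmetic is given here. The function names are illustrative, and the patient total of 280 is not stated in this record; it is inferred from the reported 154 (55.0%).

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage (0.0 if denominator is 0)."""
    return 100.0 * numerator / denominator if denominator else 0.0


def summarize(n_patients, n_cp, errors_in_cp, total_errors):
    """Recompute the three percentage columns from raw counts."""
    return {
        "cp_pct": pct(n_cp, n_patients),            # share of patients flagged CP
        "error_in_cp_pct": pct(errors_in_cp, n_cp),  # error rate within CP patients only
        "total_error_pct": pct(total_errors, n_patients),  # error rate over everyone
    }


# Counts from the row with sigma = 0.5, e = 0 that reports 154 CP patients.
# n_patients = 280 is an inferred value (assumption), not given in the record.
row = summarize(n_patients=280, n_cp=154, errors_in_cp=6, total_errors=39)
for name, value in row.items():
    print(f"{name}: {value:.2f}%")
```

Note that the error rate in CP patients uses the CP count as its denominator, which is why it can fall well below the total error rate computed on the same data set.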
Figure 1. The relationships between sigma, the percentage of confidently predictable (CP) patients, and the percentage of error in CP patients in the three microarray data sets.
Leave-one-out comparisons of local minimax learning with the 3-nearest neighbor predictor and the kernel predictors (sigma = 0.5 and 0.7, p = 0.35).
| Data Set | 3-nearest neighbor: CP (%) | 3-nearest neighbor: Error in CP (%) | 3-nearest neighbor: Total Error (%) | Kernel (sigma = 0.5, e = 0): CP (%) | Kernel (sigma = 0.5, e = 0): Error in CP (%) | Kernel (sigma = 0.5, e = 0): Total Error (%) | Kernel (sigma = 0.7, e = 0): CP (%) | Kernel (sigma = 0.7, e = 0): Error in CP (%) | Kernel (sigma = 0.7, e = 0): Total Error (%) |
|---|---|---|---|---|---|---|---|---|---|
| | 71 (98.6%) | 3 (4.23%) | 3 (4.17%) | 36 (50.0%) | 0 (0%) | 3 (4.17%) | 43 (59.7%) | 0 (0%) | 3 (4.17%) |
| | 54 (61.4%) | 11 (20.4%) | 21 (23.9%) | 32 (36.4%) | 4 (12.5%) | 21 (23.9%) | 34 (38.6%) | 4 (11.8%) | 22 (25%) |
| | 204 (72.9%) | 15 (7.35%) | 38 (13.6%) | 154 (55.0%) | 6 (3.9%) | 39 (13.9%) | 160 (57.1%) | 6 (3.75%) | 42 (15%) |
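To illustrate how counts like those in the comparison above could be produced, here is a rough Python sketch of a leave-one-out evaluation. It is not the authors' local minimax procedure. It assumes a Gaussian kernel-weighted average of the other patients' labels as the probability estimate, and it assumes a patient counts as confidently predictable when that estimate is at most p or at least 1 - p. The record does not say how the threshold p or the adjustment e actually enter, so e is omitted, and all names and the toy data are hypothetical.

```python
import math


def loo_kernel_probabilities(samples, labels, sigma):
    """Leave-one-out kernel-weighted estimate of P(label = 1) per sample.

    Assumption: Gaussian kernel on squared Euclidean distance.
    """
    probs = []
    for i, query in enumerate(samples):
        weighted_sum = 0.0
        weight_total = 0.0
        for j, (other, label) in enumerate(zip(samples, labels)):
            if i == j:
                continue  # leave the queried sample out of its own estimate
            sq_dist = sum((a - b) ** 2 for a, b in zip(query, other))
            weight = math.exp(-sq_dist / (2.0 * sigma ** 2))
            weighted_sum += weight * label
            weight_total += weight
        # Fall back to 0.5 if every other sample has zero weight.
        probs.append(weighted_sum / weight_total if weight_total > 0 else 0.5)
    return probs


def cp_summary(probs, labels, p):
    """Count CP samples, errors among them, and errors overall.

    Assumption: a sample is CP when its estimate is <= p or >= 1 - p.
    """
    n_cp = errors_in_cp = total_errors = 0
    for prob, label in zip(probs, labels):
        wrong = (1 if prob >= 0.5 else 0) != label
        confident = prob <= p or prob >= 1.0 - p
        n_cp += confident
        errors_in_cp += wrong and confident
        total_errors += wrong
    return {"n_cp": n_cp, "errors_in_cp": errors_in_cp, "total_errors": total_errors}


# Hypothetical one-feature toy data: two well-separated groups.
samples = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
labels = [0, 0, 0, 1, 1, 1]
probs = loo_kernel_probabilities(samples, labels, sigma=0.5)
print(cp_summary(probs, labels, p=0.35))
```

Under this assumed rule, a tighter threshold flags fewer patients as CP, which is the trade-off between CP (%) and Error in CP (%) that the table reports.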
Figure 2. Estimated predicted probability and 90% confidence intervals (90% CI) for the four patients in the GCM data set.