Abstract
BACKGROUND: Recently, mass spectrometry data have been mined using a genetic algorithm to produce discriminatory models that distinguish healthy individuals from those with cancer. This algorithm is the basis for claims of 100% sensitivity and specificity in two related publicly available datasets. To date, no detailed attempts have been made to explore the properties of this genetic algorithm within proteomic applications. Here the algorithm's performance on these datasets is evaluated relative to other methods.
Year: 2004 PMID: 15555060 PMCID: PMC539275 DOI: 10.1186/1471-2105-5-180
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Test set classification of DS1 via 7 markers
Figure 2. Test set classification of DS1 via 10 markers
Variation in results of GA applied 10 times
| Generations | Test Set Errors | Clusters | Markers | Primary |
| 7 | 6(5%) | 7 | 10 | 831.1 |
| 14 | 10(8%) | 24 | 16 | 617.7 |
| 9 | 2(2%) | 8 | 19 | 246.7 |
| 6 | 4(3%) | 21 | 7 | 632.6 |
| 13 | 8(6%) | 13 | 14 | 226.9 |
| 5 | 9(7%) | 7 | 20 | 42.6 |
| 9 | 4(3%) | 5 | 11 | 831.1 |
| 9 | 7(6%) | 30 | 10 | 617.7 |
| 9 | 1(1%) | 7 | 11 | 786.5 |
| 7 | 1(1%) | 5 | 17 | 42.6 |
Accuracy percentiles and characteristics of models from 50 cross-validation samples of DS1, m/z > 0
| Algorithm | Test Set Accuracy (25th, 75th percentiles) | Median # Clusters | Median # Markers | Proportion of Perfect Chromosomes |
| GA( | .96, .98 | 12 | 15.5 | 1.0 |
| GA(0, .001) | .95, .99 | 12.5 | 11 | 1.0 |
| GA(0, .002) | .95, .98 | 12.5 | 10 | 1.0 |
| GA(.002, 0) | .96, .98 | 7 | 16 | .98 |
| GA(.002, .001) | .97, .98 | 7 | 12.5 | 1.0 |
| GA(.002, .002) | .96, .99 | 8.5 | 9 | .98 |
| GA(.005, 0) | .96, .98 | 6 | 15 | .92 |
| GA(.005, .001) | .96, .98 | 5 | 11 | .92 |
| GA(.005, .002) | .96, .99 | 6 | 8 | .94 |
| GA(.008, 0) | .96, .98 | 3 | 15 | .72 |
| GA(.008, .001) | .97, .99 | 4 | 9 | .88 |
| GA(.008, .002) | .97, .99 | 4 | 8 | .84 |
| Best GA | .97, .99 | 6 | 12 | .94 |
| Boosting | .99, 1.0 | NA | NA | NA |
| PAM | .93, .97 | NA | NA | NA |
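The accuracy column in these tables reports two percentiles per algorithm, computed over 50 cross-validation samples. A minimal sketch of that kind of summary follows, assuming the two values are the 25th and 75th percentiles. The function name `accuracy_quartiles` and the simulated accuracies are hypothetical and are not taken from the paper.

```python
import random
import statistics

def accuracy_quartiles(accuracies):
    """Return the (25th, 75th) percentiles of a list of test-set accuracies."""
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # supplied values as the full set being summarized.
    q1, _median, q3 = statistics.quantiles(accuracies, n=4, method="inclusive")
    return q1, q3

# Hypothetical accuracies from 50 random train/test splits (not the paper's data).
rng = random.Random(0)
simulated = [min(1.0, max(0.0, rng.gauss(0.97, 0.01))) for _ in range(50)]
q1, q3 = accuracy_quartiles(simulated)
print(f"25th, 75th percentiles: {q1:.2f}, {q3:.2f}")
```

Reporting a pair of percentiles rather than a single mean shows how much accuracy varies from one random split to the next.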
Accuracy percentiles and characteristics of models from 50 cross-validation samples of DS1, m/z > 1500
| Algorithm | Test Set Accuracy (25th, 75th percentiles) | Median # Clusters | Median # Markers | Proportion of Perfect Chromosomes |
| GA( | .83, .87 | 90 | 11 | 1.0 |
| GA(0, .001) | .80, .87 | 93 | 11 | 1.0 |
| GA(0, .002) | .82, .87 | 92.5 | 9 | 1.0 |
| GA(.002, 0) | .87, .92 | 12 | 19 | .26 |
| GA(.002, .001) | .90, .93 | 14 | 13 | .38 |
| GA(.002, .002) | .88, .92 | 13 | 7.5 | .20 |
| GA(.005, 0) | .87, .92 | 5 | 20 | .02 |
| GA(.005, .001) | .87, .90 | 5 | 10 | .02 |
| GA(.005, .002) | .87, .92 | 5 | 6.5 | 0 |
| GA(.008, 0) | .85, .90 | 3.5 | 18.5 | 0 |
| GA(.008, .001) | .85, .89 | 4 | 8.5 | 0 |
| GA(.008, .002) | .85, .90 | 4 | 6 | 0 |
| Best GA | .87, .91 | 7 | 10 | .12 |
| Boosting | .93, .97 | NA | NA | NA |
| PAM | .80, .84 | NA | NA | NA |
Accuracy percentiles and characteristics of models from 50 cross-validation samples of DS2, 700
| Algorithm | Test Set Accuracy (25th, 75th percentiles) | Median # Clusters | Median # Markers | Proportion of Perfect Chromosomes |
| GA( | .70, .80 | 91.5 | 17 | 1.0 |
| GA(0, .001) | .63, .75 | 100 | 11 | 1.0 |
| GA(0, .002) | .61, .74 | 100 | 10 | 1.0 |
| GA(.002, 0) | .89, .93 | 22.5 | 19.5 | .86 |
| GA(.002, .001) | .88, .93 | 22 | 14.5 | .90 |
| GA(.002, .002) | .87, .91 | 22 | 8 | .86 |
| GA(.005, 0) | .88, .93 | 7 | 19 | .20 |
| GA(.005, .001) | .88, .93 | 7 | 11 | .20 |
| GA(.005, .002) | .87, .93 | 7 | 6 | .16 |
| GA(.008, 0) | .86, .91 | 5 | 18 | .02 |
| GA(.008, .001) | .86, .91 | 5 | 10 | .04 |
| GA(.008, .002) | .86, .91 | 5 | 4 | .02 |
| Best GA | .88, .93 | 7 | 12 | .24 |
| Boosting | .92, .95 | NA | NA | NA |
| PAM | .83, .86 | NA | NA | NA |
Percentiles relating to bias of repeated examinations of a test set
| Percentile | 10 | 25 | 50 | 75 | 90 |
| DS1, | | | | | |
| Bootstrap smallest error: | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Original cohort error: | .01 | .01 | .02 | .03 | .05 |
| Estimated bias: | .003 | .012 | .020 | .032 | .051 |
| DS1, | | | | | |
| Bootstrap smallest error: | .02 | .02 | .03 | .05 | .06 |
| Original cohort error: | .07 | .07 | .08 | .11 | .13 |
| Estimated bias: | .02 | .04 | .05 | .07 | .10 |
| DS2, 700 < | | | | | |
| Bootstrap smallest error: | .01 | .02 | .03 | .04 | .05 |
| Original cohort error: | .04 | .05 | .06 | .08 | .10 |
| Estimated bias: | .01 | .03 | .04 | .06 | .07 |
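The table above concerns the optimistic bias that arises when a single test set is examined repeatedly and the smallest error is reported. The simulation below is a simplified illustration of that effect, not the paper's bootstrap procedure. It assumes every candidate model has the same true error rate and makes independent mistakes. The function `min_error_bias` and all parameter values are hypothetical.

```python
import random

def min_error_bias(true_error, n_test, n_models, n_trials, seed=0):
    """Average gap between the true error and the smallest observed test error.

    Each of n_models is assumed to have the same true error rate and to make
    independent mistakes on a test set of n_test samples (a simplification).
    """
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(n_trials):
        observed = []
        for _ in range(n_models):
            # Count simulated misclassifications on the shared test set.
            errors = sum(rng.random() < true_error for _ in range(n_test))
            observed.append(errors / n_test)
        # Reporting only the best model understates the true error.
        total_gap += true_error - min(observed)
    return total_gap / n_trials

bias_one = min_error_bias(0.05, n_test=100, n_models=1, n_trials=500)
bias_many = min_error_bias(0.05, n_test=100, n_models=20, n_trials=500)
print(f"bias with 1 model: {bias_one:.3f}, with 20 models: {bias_many:.3f}")
```

With a single model the gap averages close to zero, while choosing the best of many models on the same test set produces a clearly positive gap. This is the kind of bias the table's "Estimated bias" rows quantify.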