B Haibe-Kains, C Desmedt, C Sotiriou, G Bontempi.
Abstract
MOTIVATION: Survival prediction of breast cancer (BC) patients independently of treatment, also known as prognostication, is a complex task because clinically similar breast tumors, in addition to being molecularly heterogeneous, may exhibit different clinical outcomes. In recent years, the analysis of gene expression profiles by means of sophisticated data mining tools has emerged as a promising technology to bring additional insights into BC biology and to improve the quality of prognostication. The aim of this work is to quantitatively assess the prediction accuracy of state-of-the-art data analysis techniques for BC microarray data through an independent and thorough framework.
Year: 2008 PMID: 18635567 PMCID: PMC2553442 DOI: 10.1093/bioinformatics/btn374
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Characteristics of the risk prediction methods studied in this work
| Model | Genotype | Dim. reduction | Structure | Learning algo. | Phenotype |
|---|---|---|---|---|---|
| 1 | **AURKA** | | | | |
| 2 | BD | | COMBUNIV | WILCOXON | HG |
| 3 | BD | | COMBUNIV | COX | SURV |
| 4 | BD | | MULTIV | LM | TOE |
| 5 | BD | | MULTIV | COX | SURV |
| 6 | GW | RANK (CV) | COMBUNIV | WILCOXON | HG |
| 7 | GW | RANK (CV) | COMBUNIV | COX | SURV |
| 8 | GW | RANK (CV) | MULTIV | RCOX | SURV |
| 9 | GW | PCA (CV) | COMBUNIV | WILCOXON | HG |
| 10 | GW | PCA (CV) | COMBUNIV | COX | SURV |
| 11 | GW | PCA (CV) | MULTIV | RCOX | SURV |
| 12 | **GENE76** | | | | |
| 13 | **GGI** | | | | |
We will use the words in bold to refer to the models that were fully defined in previous publications. Otherwise, the model name is a concatenation of all its characteristics separated by ‘.’.
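The naming convention described above can be illustrated with a minimal sketch. The function name `model_name` and its parameters are hypothetical, and dropping the "(CV)" annotation is an assumption inferred from the model names used in the result tables (for example `GW.RANK.MULTIV.RCOX.SURV`).

```python
from typing import Optional


def model_name(genotype: str, dim_reduction: Optional[str], structure: str,
               learner: str, phenotype: str) -> str:
    """Join the non-empty characteristics of a model with '.'."""
    parts = [genotype, dim_reduction, structure, learner, phenotype]
    # Skip empty cells and strip the "(CV)" annotation (assumed not to be
    # part of the name, based on the names listed in the result tables).
    cleaned = [p.replace("(CV)", "").strip() for p in parts if p]
    return ".".join(cleaned)


# Model 2 has no dimension reduction step; model 8 uses ranking.
print(model_name("BD", None, "COMBUNIV", "WILCOXON", "HG"))
print(model_name("GW", "RANK (CV)", "MULTIV", "RCOX", "SURV"))
```

The two calls reproduce the names of models 2 and 8 as they appear in the result tables.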
Specificity for a sensitivity of 90% for risk score prediction in the training set (VDX) and the three validation sets (TBG, TAM and UPP)
| Model | Specificity | | | |
|---|---|---|---|---|
| | VDX | TBG | TAM | UPP |
| AURKA | 0.253 | 0.348 | 0.394 | 0.293 |
| BD.COMBUNIV.WILCOXON.HG | 0.247 | 0.311 | 0.362 | 0.258 |
| BD.COMBUNIV.COX.SURV | 0.268 | 0.360 | 0.394 | 0.293 |
| BD.MULTIV.LM.TOE | 0.268 | 0.460 | 0.220 | 0.217 |
| BD.MULTIV.COX.SURV | 0.205 | 0.118 | 0.372 | 0.131 |
| GW.RANK.COMBUNIV.WILCOXON.HG | 0.258 | 0.373 | 0.277 | 0.227 |
| GW.RANK.COMBUNIV.COX.SURV | 0.400 | 0.360 | 0.362 | 0.162 |
| GW.RANK.MULTIV.RCOX.SURV | 0.468 | 0.242 | 0.326 | 0.242 |
| GW.PCA.COMBUNIV.WILCOXON.HG | 0.147 | 0.298 | 0.067 | 0.091 |
| GW.PCA.COMBUNIV.COX.SURV | 0.426 | 0.379 | 0.450 | 0.217 |
| GW.PCA.MULTIV.RCOX.SURV | 0.405 | 0.509 | 0.358 | 0.141 |
| GENE76 | 0.626 | 0.391 | 0.309 | 0.088 |
| GGI | 0.258 | 0.522 | 0.422 | 0.308 |
a. As the AURKA and GGI models were not fitted on VDX, this dataset can be considered a validation set.
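As a rough illustration of how a "specificity for a sensitivity of 90%" figure can be obtained from continuous risk scores, the sketch below scans all score cutoffs and keeps the best specificity among those reaching the target sensitivity. It assumes a binary event indicator without censoring, which is a simplification and not necessarily the procedure used in this study; the function name and toy data are hypothetical.

```python
from typing import Sequence


def specificity_at_sensitivity(scores: Sequence[float], events: Sequence[int],
                               target_sensitivity: float = 0.90) -> float:
    """Best specificity among cutoffs whose sensitivity reaches the target.

    A patient is called high risk when score >= cutoff.
    events[i] is 1 if patient i had the event, 0 otherwise (assumed binary).
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    best = 0.0
    for cutoff in sorted(set(scores)):
        # True positives: events correctly called high risk.
        tp = sum(1 for s, e in zip(scores, events) if e == 1 and s >= cutoff)
        # True negatives: non-events correctly called low risk.
        tn = sum(1 for s, e in zip(scores, events) if e == 0 and s < cutoff)
        if tp / n_pos >= target_sensitivity:
            best = max(best, tn / n_neg)
    return best


# Toy data (not from the study): 5 events, 3 non-events.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [0, 0, 1, 0, 1, 1, 1, 1]
print(specificity_at_sensitivity(toy_scores, toy_events))
```

With only five events in the toy data, reaching 90% sensitivity requires detecting all of them, so the cutoff cannot exceed the lowest score among the events.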
Performance for risk score prediction in the training set (VDX) and the three validation sets (TBG, TAM and UPP)
| Model | | | | | IAUC | | | | IBSC | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | VDX | TBG | TAM | UPP | VDX | TBG | TAM | UPP | VDX | TBG | TAM | UPP |
| KM | | | | | | | | | 0.189 | 0.145 | 0.141 | 0.151 |
| AURKA | 0.636 | 0.609 | 0.683 | 0.637 | 0.636 | 0.601 | 0.674 | 0.63 | 0.144 | |||
| BD.COMBUNIV.WILCOXON.HG | 0.606 | 0.618 | 0.687 | 0.629 | 0.602 | 0.619 | 0.185 | 0.143 | 0.131 | 0.146 | ||
| BD.COMBUNIV.COX.SURV | 0.638 | 0.613 | 0.684 | 0.638 | 0.638 | 0.607 | 0.675 | 0.632 | 0.178 | 0.143 | 0.131 | 0.146 |
| BD.MULTIV.LM.TOE | 0.601 | 0.645 | 0.683 | 0.622 | 0.602 | 0.63 | 0.186 | 0.132 | 0.147 | |||
| BD.MULTIV.COX.SURV | 0.649 | 0.603 | 0.657 | 0.598 | 0.649 | 0.596 | 0.642 | 0.6 | 0.15 | 0.132 | 0.149 | |
| GW.RANK.COMBUNIV.WILCOXON.HG | 0.619 | 0.624 | 0.691 | 0.653 | 0.639 | 0.182 | 0.131 | 0.146 | ||||
| GW.RANK.COMBUNIV.COX.SURV | 0.65 | 0.637 | 0.638 | 0.153 | 0.158 | 0.172 | ||||||
| GW.RANK.MULTIV.RCOX.SURV | 0.663 | 0.638 | 0.63 | 0.635 | 0.151 | 0.175 | 0.16 | |||||
| GW.PCA.COMBUNIV.WILCOXON.HG | 0.586 | 0.591 | 0.566 | 0.579 | 0.617 | 0.616 | 0.565 | 0.561 | 0.186 | 0.136 | 0.148 | |
| GW.PCA.COMBUNIV.COX.SURV | 0.695 | 0.594 | 0.672 | 0.589 | 0.147 | 0.153 | 0.177 | |||||
| GW.PCA.MULTIV.RCOX.SURV | 0.69 | 0.591 | 0.667 | 0.598 | 0.155 | 0.171 | 0.176 | |||||
| GENE76 | 0.64 | 0.667 | 0.557 | 0.633 | 0.558 | 0.153 | 0.149 | 0.182 | ||||
| GGI | 0.613 | 0.67 | 0.611 | 0.183 | ||||||||
The accuracy measures in bold are significantly better than those of the AURKA model. In the case of IBSC, the accuracy measures of AURKA are in bold if they are significantly better than those of KM, the benchmark model, whatever the size of the improvement.
a. As the AURKA and GGI models were not fitted on VDX, this dataset can be considered a validation set.
Sensitivity and specificity for risk group prediction in the training set (VDX) and the three validation sets (TBG, TAM and UPP)
| Model | Sensitivity | | | | Specificity | | | |
|---|---|---|---|---|---|---|---|---|
| | VDX | TBG | TAM | UPP | VDX | TBG | TAM | UPP |
| AURKA | 0.802 | 0.892 | 0.900 | 0.806 | 0.389 | 0.379 | 0.365 | 0.354 |
| BD.COMBUNIV.WILCOXON.HG | 0.792 | 0.892 | 0.880 | 0.806 | 0.389 | 0.379 | 0.365 | 0.354 |
| BD.COMBUNIV.COX.SURV | 0.812 | 0.892 | 0.900 | 0.833 | 0.400 | 0.379 | 0.369 | 0.359 |
| BD.MULTIV.LM.TOE | 0.833 | 0.946 | 0.840 | 0.778 | 0.411 | 0.391 | 0.358 | 0.348 |
| BD.MULTIV.COX.SURV | 0.792 | 0.784 | 0.900 | 0.806 | 0.389 | 0.354 | 0.369 | 0.354 |
| GW.RANK.COMBUNIV.WILCOXON.HG | 0.812 | 0.892 | 0.840 | 0.833 | 0.400 | 0.379 | 0.358 | 0.359 |
| GW.RANK.COMBUNIV.COX.SURV | 0.885 | 0.892 | 0.860 | 0.778 | 0.437 | 0.379 | 0.362 | 0.348 |
| GW.RANK.MULTIV.RCOX.SURV | 0.927 | 0.892 | 0.840 | 0.806 | 0.458 | 0.379 | 0.358 | 0.354 |
| GW.PCA.COMBUNIV.WILCOXON.HG | 0.740 | 0.838 | 0.760 | 0.750 | 0.363 | 0.366 | 0.344 | 0.343 |
| GW.PCA.COMBUNIV.COX.SURV | 0.896 | 0.892 | 0.940 | 0.778 | 0.442 | 0.379 | 0.376 | 0.348 |
| GW.PCA.MULTIV.RCOX.SURV | 0.896 | 0.919 | 0.880 | 0.722 | 0.442 | 0.385 | 0.365 | 0.338 |
| GENE76 | 0.958 | 0.919 | 0.840 | 0.722 | 0.474 | 0.385 | 0.358 | 0.335 |
| GGI | 0.844 | 1.000 | 0.900 | 0.861 | 0.416 | 0.404 | 0.369 | 0.359 |
a. As the AURKA and GGI models were not fitted on VDX, this dataset can be considered a validation set.
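For risk group prediction, sensitivity and specificity follow directly from the cross-tabulation of predicted groups against observed outcomes. The sketch below assumes binary group labels (1 = high risk) and a binary event indicator without censoring; both the function name and the toy data are hypothetical and do not reproduce any value in the table.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(high_risk: Sequence[int],
                            events: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary risk groups.

    high_risk[i] is 1 if patient i is predicted high risk, else 0.
    events[i] is 1 if patient i had the event, else 0 (assumed binary).
    """
    pairs = list(zip(high_risk, events))
    tp = sum(1 for g, e in pairs if g == 1 and e == 1)
    fn = sum(1 for g, e in pairs if g == 0 and e == 1)
    tn = sum(1 for g, e in pairs if g == 0 and e == 0)
    fp = sum(1 for g, e in pairs if g == 1 and e == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (not from the study).
toy_groups = [1, 1, 1, 0, 0, 1, 0, 0]
toy_events = [1, 1, 0, 0, 0, 1, 1, 0]
print(sensitivity_specificity(toy_groups, toy_events))
```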
Performance for risk group prediction in the training set (VDX) and the three validation sets (TBG, TAM and UPP)
| Model | | | | | HR | | | | IBSC | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | VDX | TBG | TAM | UPP | VDX | TBG | TAM | UPP | VDX | TBG | TAM | UPP |
| KM | | | | | | | | | 0.189 | 0.145 | 0.141 | 0.151 |
| AURKA | 0.685 | 0.729 | 0.834 | 0.673 | 2.04 | 2.43 | 4.64 | 1.84 | ||||
| BD.COMBUNIV.WILCOXON.HG | 0.675 | 0.728 | 0.804 | 0.673 | 1.86 | 2.39 | 4.06 | 1.89 | 0.184 | 0.14 | 0.134 | 0.147 |
| BD.COMBUNIV.COX.SURV | 0.698 | 0.729 | 0.834 | 2.17 | 2.43 | 4.58 | 0.181 | 0.141 | 0.133 | 0.146 | ||
| BD.MULTIV.LM.TOE | 0.721 | 0.811 | 0.716 | 0.647 | 2.26 | 3.7 | 2.52 | 1.77 | 0.138 | 0.149 | ||
| BD.MULTIV.COX.SURV | 0.685 | 0.611 | 0.828 | 0.66 | 2.21 | 1.59 | 4.89 | 1.84 | 0.182 | 0.146 | 0.132 | 0.148 |
| GW.RANK.COMBUNIV.WILCOXON.HG | 0.694 | 0.785 | 0.775 | 0.733 | 1.99 | 3.61 | 3.42 | 2.42 | 0.182 | 0.136 | ||
| GW.RANK.COMBUNIV.COX.SURV | 0.77 | 0.778 | 0.632 | 2.96 | 3.53 | 1.53 | 0.143 | 0.139 | 0.156 | |||
| GW.RANK.MULTIV.RCOX.SURV | 0.765 | 0.749 | 0.696 | 3.28 | 3 | 2.18 | 0.15 | 0.147 | 0.157 | |||
| GW.PCA.COMBUNIV.WILCOXON.HG | 0.616 | 0.69 | 0.589 | 0.586 | 1.46 | 1.94 | 1.3 | 1.37 | 0.187 | 0.142 | 0.14 | 0.15 |
| GW.PCA.COMBUNIV.COX.SURV | 0.734 | 0.909 | 0.63 | 2.62 | 9.5 | 1.53 | 0.147 | 0.133 | 0.174 | |||
| GW.PCA.MULTIV.RCOX.SURV | 0.749 | 0.818 | 0.564 | 2.6 | 4.64 | 1.15 | 0.142 | 0.136 | 0.177 | |||
| GENE76 | 0.756 | 0.754 | 0.548 | 2.79 | 3.52 | 1.16 | 0.146 | 0.145 | 0.17 | |||
| GGI | 0.706 | 0.824 | 2.12 | 4.03 | 0.181 | 0.134 | ||||||
The accuracy measures in bold are significantly better than those of the AURKA model. In the case of IBSC, the accuracy measures of AURKA are in bold if they are significantly better than those of KM, the benchmark model, whatever the size of the improvement.
a. As the AURKA and GGI models were not fitted on VDX, this dataset can be considered a validation set.