Vlad Popovici, Weijie Chen, Brandon G Gallas, Christos Hatzis, Weiwei Shi, Frank W Samuelson, Yuri Nikolsky, Marina Tsyganova, Alex Ishkin, Tatiana Nikolskaya, Kenneth R Hess, Vicente Valero, Daniel Booser, Mauro Delorenzi, Gabriel N Hortobagyi, Leming Shi, W Fraser Symmans, Lajos Pusztai.
Abstract
INTRODUCTION: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints.
Year: 2010 PMID: 20064235 PMCID: PMC2880423 DOI: 10.1186/bcr2468
Source DB: PubMed Journal: Breast Cancer Res ISSN: 1465-5411 Impact factor: 6.466
Patient characteristics in the training and validation sets
| Characteristic | Training set | Validation set | P value |
|---|---|---|---|
| Median age | 51 years (28-79 years) | 50 years (26-73 years) | |
| Race | | | 0.804 |
| Caucasian | 85 (65%) | 68 (68%) | |
| African American | 13 (10%) | 12 (12%) | |
| Asian | 9 (7%) | 7 (7%) | |
| Hispanic | 21 (16%) | 13 (13%) | |
| Mixed | 2 (2%) | 0 | |
| Cancer histology | | | 0.047 |
| Invasive ductal (IDC) | 119 (92%) | 85 (85%) | |
| Mixed ductal/lobular (IDC/ILC) | 8 (6%) | 8 (8%) | |
| Invasive lobular (ILC) | 1 (0.7%) | 7 (7%) | |
| Others | 2 (1.3%) | 0 | |
| Tumor size | | | 0.643 |
| T0 | 1 (1%) | 2 (2%) | |
| T1 | 12 (9%) | 8 (8%) | |
| T2 | 70 (54%) | 62 (62%) | |
| T3 | 21 (16%) | 13 (13%) | |
| T4 | 26 (20%) | 15 (15%) | |
| Lymph node stage | | | 0.935 |
| N0 | 39 (30%) | 27 (27%) | |
| N1 | 60 (46%) | 47 (47%) | |
| N2 | 14 (11%) | 13 (13%) | |
| N3 | 17 (13%) | 13 (13%) | |
| Nuclear grade (BMN) | | | 0.005 |
| 1 | 2 (2%) | 11 (11%) | |
| 2 | 52 (40%) | 42 (42%) | |
| 3 | 76 (58%) | 47 (47%) | |
| Estrogen receptor | | | 0.813 |
| Estrogen receptor positive | 80 (62%) | 60 (60%) | |
| Estrogen receptor negative | 50 (38%) | 40 (40%) | |
| HER-2 | | | < 0.001 |
| HER-2 positive | 33 (25%) | 7 (7%) | |
| HER-2 negative | 96 (74%) | 93 (93%) | |
| Neoadjuvant therapy | | | 0.005 |
| Weekly T × 12 + FAC × 4 | 112 (86%) | 98 (98%) | |
| 3-Weekly T × 4 + FAC × 4 | 18 (14%) | 2 (2%) | |
| Pathologic complete response (pCR) | 33 (25%) | 15 (15%) | 0.055 |
Estrogen receptor: cases in which more than 10% of tumor cells stained positive for ER with immunohistochemistry (IHC) were considered positive. HER-2: cases that showed either 3+ IHC staining or a gene copy number greater than 2.0 were considered HER-2 positive. T = paclitaxel; FAC = 5-fluorouracil, doxorubicin, and cyclophosphamide. The P values for the association tests were obtained from a χ² test unless any category contained fewer than five cases, in which case Fisher's exact test was used.
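The test-selection rule in the table footnote can be sketched for the two-category rows. This is a minimal illustration, not the authors' code: the function names, the restriction to 2 × 2 tables, the reading of "fewer than five cases" as an observed cell count below 5, and the use of an uncorrected χ² statistic are all assumptions made here.

```python
from math import comb, erfc, sqrt

def chi2_p_2x2(a, b, c, d):
    """Pearson chi-square P value (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(stat / 2.0))

def fisher_p_2x2(a, b, c, d):
    """Two-sided Fisher's exact P value for [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def association_p(a, b, c, d, min_count=5):
    """Assumed rule: Fisher's exact test if any observed cell is below min_count."""
    if min(a, b, c, d) < min_count:
        return "fisher", fisher_p_2x2(a, b, c, d)
    return "chi2", chi2_p_2x2(a, b, c, d)

# Estrogen receptor row: positive 80 vs 60, negative 50 vs 40.
print(association_p(80, 60, 50, 40))
# Neoadjuvant therapy row: weekly 112 vs 98, 3-weekly 18 vs 2.
print(association_p(112, 98, 18, 2))
```

Rows with more than two categories (race, tumor size, and so on) would need the general r × c versions of both tests, which this sketch does not cover.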
Figure 1. Relative complexity of the three prediction problems. The cumulative information values have been scaled such that the maximum value is 1. To make the curves comparable and to take into account the sample size, the ratio between the number of features used in the cumulative information (F) and the sample size is used on the horizontal axis. Larger values of the cumulative information indicate simpler problems.
Figure 2. Boxplots of the estimated area under the curve (AUC), stratified by feature-selection and classification methods. The boxplots show the mean AUC in 10 times fivefold cross-validation (CV). The left column contains the estimated AUC stratified by the feature-selection method, and the right column contains the estimated AUC stratified by the classification method.
Figure 3. Graphic summaries of the estimated and observed areas under the curve (AUCs) for each of the 120 models. For each combination of feature-selection method and classification algorithm, the AUCs ± 2 standard deviations are plotted. Mean AUCs obtained from 10 × 5-CV (cross-validation; black square), LPO bootstrap (black dot), and the conditional (blue circle) and mean (red cross) validation AUCs are shown.
Figure 4. Learning curves for the best predictors for each of the three endpoints. For each endpoint, the learning curve of the best-performing model on the validation set was estimated by fivefold cross-validation for gradually increasing sample sizes. The plot shows both the estimated performance for different sample sizes and the fitted curve. The quadratic discriminant analysis (QDA) classifier required more than 60 samples, so the minimum sample size for it was 80. Note the nonlinear scale of the x-axis.
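As a rough illustration of how a mean AUC over 10 times fivefold cross-validation can be computed, the sketch below uses only the Python standard library. It is not the study's pipeline: the stratified fold assignment, the rank-based AUC, and the toy one-feature scorer `fit_mean_difference` are assumptions introduced for the example. In a real analysis, feature selection and classifier training would both happen inside `fit`, on the training folds only.

```python
import random
from statistics import mean

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Deal the shuffled indices of each class into k folds in turn."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def repeated_cv_auc(x, labels, fit, repeats=10, k=5, seed=0):
    """Mean test-fold AUC over `repeats` independent k-fold partitions."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            # Fit on the training folds only, then score the held-out fold.
            score = fit([x[i] for i in train_idx], [labels[i] for i in train_idx])
            aucs.append(auc([score(x[i]) for i in test_idx],
                            [labels[i] for i in test_idx]))
    return mean(aucs)

def fit_mean_difference(x_train, y_train):
    """Hypothetical toy scorer: orient a single feature by the training class means."""
    m1 = mean(v for v, y in zip(x_train, y_train) if y == 1)
    m0 = mean(v for v, y in zip(x_train, y_train) if y == 0)
    sign = 1.0 if m1 >= m0 else -1.0
    return lambda v: sign * v

# Synthetic data for demonstration only: 20 positives and 20 negatives.
data_rng = random.Random(1)
labels = [1] * 20 + [0] * 20
x = [data_rng.gauss(1.0, 1.0) if y == 1 else data_rng.gauss(0.0, 1.0) for y in labels]
print(repeated_cv_auc(x, labels, fit_mean_difference))
```

Repeating the partition 10 times reduces the dependence of the estimate on any single random split, which is the usual motivation for repeated cross-validation.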
Congruencies across different endpoints and different feature-selection methods
| Endpoint or FS method | Gene-level | Level of canonic-pathway maps |
|---|---|---|
| Same endpoint but different feature selection (FS) | | |
| ER status | 0.541 | 0.573 |
| pCR | 0.544 | 0.572 |
| pCR(ER-) | 0.593 | 0.532 |
| Same FS but different endpoints | | |
| FS1 | 0.300 | 0.290 |
| FS2 | 0.299 | 0.274 |
| FS3 | 0.291 | 0.278 |
| FS4 | 0.295 | 0.291 |
| FS5 | 0.272 | 0.282 |
The table shows that kappa statistics (that is, congruency) are high for different feature-selection methods for the same endpoint but are low for the same feature-ranking method for different endpoints. Both gene-level and pathway-level analyses show similar results.
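This record does not say how the kappa statistic was computed. One common reading, assumed here purely for illustration, is Cohen's kappa between two feature lists, treating each feature in a shared universe as either selected or not selected by each method. The function name and the toy gene identifiers are hypothetical.

```python
def cohen_kappa_selection(selected_a, selected_b, universe):
    """Cohen's kappa for two selected/not-selected labelings of the same feature universe."""
    a, b = set(selected_a), set(selected_b)
    n = len(universe)
    both = sum(1 for g in universe if g in a and g in b)
    neither = sum(1 for g in universe if g not in a and g not in b)
    p_obs = (both + neither) / n                   # observed agreement
    p_a = sum(1 for g in universe if g in a) / n   # fraction selected by method A
    p_b = sum(1 for g in universe if g in b) / n   # fraction selected by method B
    p_exp = p_a * p_b + (1 - p_a) * (1 - p_b)      # agreement expected by chance
    if p_exp == 1.0:
        return 1.0  # both methods label every feature identically by construction
    return (p_obs - p_exp) / (1 - p_exp)

# Toy example: 10 features, each method selects 4, and 2 are shared.
universe = list(range(10))
print(cohen_kappa_selection([0, 1, 2, 3], [2, 3, 4, 5], universe))
```

Under this reading, a kappa near 1 means two lists select almost the same features, and a kappa near 0 means their overlap is about what chance alone would produce.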