Carmen D Schweighofer, Kevin R Coombes, Lynn L Barron, Lixia Diao, Rachel J Newman, Alessandra Ferrajoli, Susan O'Brien, William G Wierda, Rajyalakshmi Luthra, L Jeffrey Medeiros, Michael J Keating, Lynne V Abruzzo.
Abstract
We developed and validated a two-gene signature that predicts prognosis in previously untreated chronic lymphocytic leukemia (CLL) patients. Using a 65-sample training set drawn from a cohort of 131 patients, we identified the best clinical models to predict time-to-treatment (TTT) and overall survival (OS). To identify individual genes or combinations in the training set with expression related to prognosis, we cross-validated univariate and multivariate models to predict TTT. We identified four gene sets (5, 6, 12, or 13 genes) to construct multivariate prognostic models. By optimizing each gene set on the training set, we constructed 11 models to predict the time from diagnosis to treatment. Each model also predicted OS and added value to the best clinical models. To determine which genes contributed the most value when added to clinical variables, we applied the Akaike Information Criterion. Two genes were consistently retained in the models with clinical variables: SKI (v-SKI avian sarcoma viral oncogene homolog) and SLAMF1 (signaling lymphocytic activation molecule family member 1; CD150). We optimized a two-gene model and validated it on an independent test set of 66 samples. This two-gene model predicted prognosis better on the test set than any of the known predictors, including ZAP70 and serum β2-microglobulin.
Year: 2011 PMID: 22194822 PMCID: PMC3237436 DOI: 10.1371/journal.pone.0028277
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Clinical and Laboratory Features.*
| All Patients (n = 131) | Training Set (n = 65) | Test Set (n = 66) | p value |
| 56.7 (26.7, 80.9) | 56.1 (26.7, 80.1) | 58.5 (37.6, 80.9) | 0.12 |
| 81 (61.8) / 50 (38.2) | 42 (64.6) / 23 (35.4) | 39 (59.1) / 27 (40.9) | 0.59 |
| 102 (77.9) / 29 (22.1) | 54 (83.1) / 11 (16.9) | 48 (72.7) / 18 (27.3) | 0.21 |
| 126 (95.5) / 5 (4.5) | 63 (96.9) / 2 (3.1) | 63 (95.5) / 3 (4.5) | 0.99 |
| 118 (90.1) / 13 (9.9) | 63 (96.9) / 2 (3.1) | 55 (83.3) / 11 (16.7) | 0.02 |
| 12.9 (8.2, 17.4) | 12.9 (10.0, 17.0) | 12.9 (8.2, 17.4) | 0.59 |
| 3.0 (0.0, 21.0) | 4.0 (0.0, 21.0) | 2.5 (0.0, 15.0) | 0.19 |
| 168 (39, 476) | 168 (39, 379) | 169 (60, 476) | 0.31 |
| 526 (274, 1818) | 527 (338, 1313) | 517 (274, 1818) | 0.10 |
| 98 (75.4) / 32 (24.6) | 45 (70.3) / 19 (29.7) | 53 (80.3) / 13 (19.7) | 0.22 |
| 69 (58.5) / 49 (41.5) | 30 (52.6) / 27 (47.4) | 39 (63.9) / 22 (36.1) | 0.26 |
| 32 (28.3) / 81 (71.7) | 18 (31.6) / 39 (68.4) | 14 (25.0) / 42 (75.0) | 0.53 |
| 78 (62.4) / 47 (37.6) | 36 (60.0) / 24 (40.0) | 42 (64.6) / 23 (35.4) | 0.71 |
| 98 (79.0) / 26 (21.0) | 50 (83.3) / 10 (16.7) | 48 (75.0) / 16 (25.0) | 0.28 |
| 67 (51.5) / 63 (48.5) | 36 (55.4) / 29 (44.6) | 31 (47.7) / 34 (52.3) | 0.48 |
| 62 (54.9) / 51 (45.1) | 32 (59.3) / 22 (40.7) | 30 (50.8) / 29 (49.2) | 0.45 |
| 82 (87.2) / 12 (12.8) | 41 (93.2) / 3 (6.8) | 41 (82.0) / 9 (18.0) | 0.13 |
| 101.3 (9.6, 271.3) | 109.2 (9.6, 252.4) | 93.1 (28.3, 271.3) | 0.46 |
| 27.5 (0.7, 211.5) | 26.0 (0.7, 151.9) | 29.2 (1.0, 211.5) | 0.77 |
| 28.8 (0.7, 211.5) | 28.0 (0.7, 198.3) | 30.7 (1.0, 211.5) | 0.84 |
| 100 (76.3) | 45 (69.2) | 55 (83.3) | 0.10 |
Abbreviations: WBC, white blood cell; Hgb, hemoglobin; Prolymphs, prolymphocytes; LDH, serum lactate dehydrogenase; β2M, serum β2 microglobulin; serum Ig, serum immunoglobulin levels; surface IgL, surface immunoglobulin light chain isotype; IGHV SM status, immunoglobulin heavy chain variable region gene somatic mutation status.
*Rai stage, splenomegaly, WBC count, Hgb, Prolymphs, Platelets, LDH, β2M, serum Ig were determined at the time the sample was obtained. The CLL score, surface IgL, IGHV SM status, ZAP70, and karyotype were determined on samples obtained before treatment was initiated.
All p values were calculated using the two-sided Fisher's Exact test except for age in years, which was calculated using the two-sided t-test, and time-to-event parameters (log-rank test).
The normal ranges are: WBC, 4–11×10⁹/L; Hgb, 14.0–18.0 g/dL; platelets, 140–440×10⁹/L; LDH, 313–618 U/L; β2M, 0.7–1.8 mg/L; serum Ig, IgM 29–214 g/dL, IgA 74–327 g/dL, IgG 624–1680 g/dL. Serum Ig are considered decreased (hypogammaglobulinemia) if ≥2 immunoglobulin fractions are below the normal range. CD38 is low if <30% of CD19+ cells express CD38, and high if ≥30% of CD19+ cells express CD38. Karyotypes are considered simple if the number of abnormalities is <3, and complex if the number is ≥3.
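The footnotes state that the categorical comparisons between the training and test sets use a two-sided Fisher's exact test. As a minimal sketch of that test, the function below sums the hypergeometric probabilities of every 2×2 table with the same margins that is no more likely than the observed one. The function name and the tie tolerance are illustrative choices, not taken from the paper, and statistical packages may resolve ties slightly differently.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the total probability of all tables with the same row and
    column totals that are at most as likely as the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    p_observed = prob(a)
    # The small relative tolerance (an assumption of this sketch) keeps
    # tables that tie with the observed one despite rounding error.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)
```

For example, `fisher_exact_two_sided(63, 2, 55, 11)` evaluates the row whose training and test counts are 63/2 and 55/11; the result can then be compared with the p value reported in that row.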
Ability of genes to predict time-to-treatment on the training set.
| Gene | Cox log-rank p, time from diagnosis (dichotomous) | Cox log-rank p, time from diagnosis (continuous) | Cox log-rank p, time from sample collection (dichotomous) | Cox log-rank p, time from sample collection (continuous) | Cross-validation % selected, univariate | Cross-validation % selected, multivariate (diagnosis) | Cross-validation % selected, multivariate (sample) |
| SKI |  |  |  |  | 96.7 | 68.3 | 77.3 |
| NT5C2 |  |  |  |  | 91.3 | 34.0 | 43.3 |
| AICDA |  |  |  |  | 83.7 | 13.3 | 14.3 |
| SLAMF1 |  |  |  |  | 82.7 | 45.3 | 26.0 |
| CD14 |  |  |  |  | 76.3 | 12.7 | 58.3 |
| FGL2 | 0.05126 |  |  |  | 75.7 | 18.3 | 23.7 |
| NUDC |  |  |  |  | 58.7 | 16.0 | 8.0 |
| NRIP1 | 0.09115 |  |  |  | 53.3 | 7.0 | 26.0 |
| EGR3 | 0.11961 |  |  |  | 47.0 | 15.0 | 9.7 |
| OAS3 |  |  |  |  | 43.7 | 18.0 | 40.7 |
| MLXIP |  |  |  |  | 41.3 | 23.7 | 7.7 |
| TPST2 |  | 0.11673 |  |  | 39.3 | 8.7 | 6.7 |
| GZMK | 0.10461 | 0.47112 |  |  | 19.7 | 3.7 | 19.0 |
| TRIB2 |  |  |  |  | 18.7 | 4.3 | 3.3 |
| BLNK | 0.21545 |  | 0.17736 |  | 15.3 | 10.7 | 9.7 |
| ATF4 |  |  | 0.06964 | 0.14722 | 9.0 | 8.3 | 3.7 |
| ZAP70 | 0.05281 | 0.72270 |  | 0.20473 | 3.0 | 1.3 | 0.3 |
| CCL5 | 0.31346 | 0.20183 | 0.11853 | 0.10792 | 2.7 | 2.7 | 2.7 |
| FLNB | 0.33042 | 0.16040 | 0.12236 | 0.07366 | 2.0 | 1.0 | 0.0 |
| FGFR1 | 0.20586 | 0.13564 | 0.19548 | 0.22199 | 2.0 | 2.0 | 2.0 |
| ZBTB20 | 0.32892 | 0.26458 | 0.08767 | 0.05625 | 1.0 | 0.7 | 1.0 |
| GFI1 | 0.12424 | 0.27589 | 0.11658 | 0.23475 | 0.7 | 0.0 | 0.0 |
| ATRX | 0.53365 | 0.21520 | 0.23976 | 0.12142 | 0.3 | 0.3 | 0.3 |
| SEPT10 | 0.84619 | 0.57552 | 0.61241 | 0.07739 | 0.3 | 0.0 | 0.3 |
| LPL | 0.30342 | 0.19845 | 0.19664 | 0.08380 | 0.3 | 0.0 | 0.0 |
| WSB2 | 0.42500 | 0.97415 | 0.15628 | 0.38180 | 0.0 | 0.0 | 0.0 |
| TNFRSF8 | 0.60838 | 0.69192 | 0.28498 | 0.52499 | 0.0 | 0.0 | 0.0 |
| RIOK2 | 0.83187 | 0.30138 | 0.96935 | 0.16604 | 0.0 | 0.0 | 0.0 |
| P2RX1 | 0.28422 | 0.20501 | 0.15201 | 0.06186 | 0.0 | 0.0 | 0.0 |
| LDOC1 | 0.48194 | 0.97673 | 0.15223 | 0.62795 | 0.0 | 0.0 | 0.0 |
| LASS6 | 0.72274 | 0.72844 | 0.48064 | 0.52066 | 0.0 | 0.0 | 0.0 |
| CRY1 | 0.27369 | 0.44715 | 0.12731 | 0.23697 | 0.0 | 0.0 | 0.0 |
| COBLL1 | 0.59439 | 0.53532 | 0.77744 | 0.86219 | 0.0 | 0.0 | 0.0 |
| CD86 | 0.94053 | 0.82897 | 0.84059 | 0.18854 | 0.0 | 0.0 | 0.0 |
| BCL7A | 0.99125 | 0.79130 | 0.46758 | 0.57526 | 0.0 | 0.0 | 0.0 |
| BANK1 | 0.98999 | 0.36153 | 0.60229 | 0.81261 | 0.0 | 0.0 | 0.0 |
| ANXA2 | 0.55803 | 0.44614 | 0.79069 | 0.65462 | 0.0 | 0.0 | 0.0 |
Figure 1. Cross-validation of gene predictors on the training set.
(A) Histogram of the number of times each gene was statistically significant among 300 assessments of univariate models. (B) Histogram of the number of times each gene was retained in a multivariate model to predict time-to-treatment either from diagnosis or from sample collection (300 each). (C) Sorted probabilities that a gene was significant in a univariate model. Gaps in the figure identify a six-gene and a twelve-gene subset. (D) Sorted probabilities that a gene was retained in a multivariate model. Gaps in the figure identify a five-gene and a thirteen-gene subset.
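Panel (C) describes picking gene subsets at the gaps in the sorted selection probabilities. A minimal sketch of that idea follows, assuming the first percentage column of the table above holds the univariate frequencies; the values for the 16 highest-ranked genes are copied from it. The function name and the rule of taking the two largest gaps are illustrative assumptions, not the authors' stated procedure.

```python
# Percentage of cross-validation assessments in which each gene was
# selected (first percentage column of the table above, assumed here
# to be the univariate results).
univariate_pct = {
    "SKI": 96.7, "NT5C2": 91.3, "AICDA": 83.7, "SLAMF1": 82.7,
    "CD14": 76.3, "FGL2": 75.7, "NUDC": 58.7, "NRIP1": 53.3,
    "EGR3": 47.0, "OAS3": 43.7, "MLXIP": 41.3, "TPST2": 39.3,
    "GZMK": 19.7, "TRIB2": 18.7, "BLNK": 15.3, "ATF4": 9.0,
}


def subsets_at_largest_gaps(pct_by_gene, n_gaps=2):
    """Cut the ranked gene list at its n_gaps largest drops in frequency."""
    ranked = sorted(pct_by_gene.items(), key=lambda kv: kv[1], reverse=True)
    # Pair each drop between neighbours with the subset size above the drop.
    gaps = [
        (ranked[i][1] - ranked[i + 1][1], i + 1)
        for i in range(len(ranked) - 1)
    ]
    sizes = sorted(size for _, size in sorted(gaps, reverse=True)[:n_gaps])
    return {size: [gene for gene, _ in ranked[:size]] for size in sizes}


subsets = subsets_at_largest_gaps(univariate_pct)
print({size: genes for size, genes in subsets.items()})
```

With these values the two largest drops fall after the sixth and twelfth genes, which agrees with the six-gene and twelve-gene subsets named in the caption.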
Genes retained in multivariate models to predict time from diagnosis to treatment in the training dataset.
| Gene Set | M6 | M6 | M6,SAM | M5 | M6,SAM | M5 | M12 | M6 | M13 | M12 | M13 |
| Optimizer | AIC | BIC | AIC | AIC,BIC | BIC | AIC,BIC | AIC,BIC | ||||
| SKI | Yes (c) | Yes (c) | Yes | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes |
| NT5C2 | Yes | Yes | Yes | Yes | Yes | Yes | – | Yes | – | Yes | Yes |
| SLAMF1 | – | Yes (c) | – | – | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes (c) | Yes (c) |
| CD14 | – | – | Yes | – | Yes | Yes | – | Yes | – | Yes | Yes |
| FGL2 | – | – | – | – | – | – | Yes | Yes | Yes | Yes | Yes |
| OAS3 | – | – | – | Yes | – | Yes | – | – | – | Yes | Yes |
| NUDC | – | – | – | – | – | – | Yes | – | Yes | Yes | Yes |
| MLXIP | – | – | – | – | – | – | Yes | – | Yes | Yes | Yes |
| AICDA | – | – | – | – | – | – | – | Yes | – | Yes | Yes |
| NRIP1 | – | – | – | – | – | – | – | – | – | Yes | Yes |
| EGR3 | – | – | – | – | – | – | – | – | – | Yes | Yes |
| BLNK | – | – | – | – | – | – | – | – | Yes | – | Yes |
| TPST2 | – | – | – | – | – | – | – | – | – | Yes | – |
| GZMK | – | – | – | – | – | – | – | – | – | – | Yes (c) |
Abbreviations: M, model; AIC, Akaike Information Criterion; BIC, Bayesian Information Criterion; SAM, sample collection; CM, clinical model.
*Four different sets of genes were chosen as starting points, containing 5, 6, 12, or 13 genes.
We applied stepwise forward-backward methods to each gene set to optimize either AIC or BIC.
Two gene sets optimized predictions of the time from sample collection to treatment; all others looked at the time from diagnosis to treatment.
Log-rank p value testing, via a Cox proportional hazards model, whether a continuous score derived to predict time-to-treatment also predicts overall survival.
P-value computed from a chi-squared test of whether the continuous score adds value to the existing clinical predictors.
(c) Individual genes that remained significant when added to the existing clinical predictors.
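The footnotes mention forward-backward stepwise optimization of AIC or BIC. The sketch below shows the general shape of such a search: at each step it tries adding or dropping one gene and keeps the move that lowers the criterion most. It is an illustration, not the authors' code. The `criterion` callback is hypothetical, the log-likelihood gains in the usage example are made-up numbers, and in a real analysis the callback would fit a Cox model to the chosen genes. For BIC, what counts as `n_obs` in survival models (samples or events) is itself a modelling choice.

```python
from math import log


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * log(n_obs) - 2 * log_likelihood


def stepwise_select(candidates, criterion, start=()):
    """Forward-backward stepwise search that minimises criterion(gene_set)."""
    current = frozenset(start)
    best = criterion(current)
    while True:
        # All models that differ from the current one by exactly one gene.
        moves = [current | {g} for g in candidates if g not in current]
        moves += [current - {g} for g in current]
        if not moves:
            return current, best
        model = min(moves, key=lambda m: (criterion(m), sorted(m)))
        score = criterion(model)
        if score >= best:          # no single move improves the criterion
            return current, best
        current, best = model, score


# Usage with invented log-likelihood gains (placeholders, not study results).
toy_gain = {"SKI": 8.0, "SLAMF1": 5.0, "NT5C2": 0.5}


def toy_aic(genes):
    return aic(-100.0 + sum(toy_gain[g] for g in genes), len(genes))


selected, score = stepwise_select(toy_gain, toy_aic)
```

In this toy setting a gene is kept only if it raises the log-likelihood by more than one unit, since each extra parameter costs 2 in AIC.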
Figure 2. Kaplan-Meier plots of the time from diagnosis to treatment stratified by the gene prognostic (GP) score (a linear combination of the expression of SKI and SLAMF1) in (A) the training set and (B) the validation set.
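The caption defines the GP score as a linear combination of SKI and SLAMF1 expression. A minimal sketch of computing such a score and splitting patients into two groups is shown below. The weights, the expression values, and the median cutoff are all hypothetical placeholders: this record does not report the fitted coefficients or the threshold used for stratification.

```python
from statistics import median


def gp_score(ski, slamf1, w_ski, w_slamf1):
    """Linear combination of the two expression values."""
    return w_ski * ski + w_slamf1 * slamf1


def stratify(scores, cutoff=None):
    """Label each score 'high' or 'low' relative to a cutoff.

    The median is used when no cutoff is given; this default is an
    assumption of the sketch, not the paper's rule.
    """
    cut = median(scores) if cutoff is None else cutoff
    return ["high" if s > cut else "low" for s in scores]


# Invented expression values (SKI, SLAMF1) and placeholder weights.
samples = [(1.0, 2.0), (3.0, 1.0), (2.0, 2.0)]
scores = [gp_score(ski, slamf1, w_ski=1.0, w_slamf1=-0.5)
          for ski, slamf1 in samples]
groups = stratify(scores)
```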
Figure 3. Kaplan-Meier plots of time from diagnosis to treatment in the validation set, showing the interactions between the SKI-SLAMF1 score and (A) gender, (B) Rai stage, (C) WBC count, (D) serum β2-microglobulin level, (E) CD38, (F) IGHV somatic mutation status, (G) ZAP70, (H) cytogenetic complexity, and (I) clinical score.
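Figures 2 and 3 are Kaplan-Meier plots. As a rough sketch of what such a curve represents, the function below implements the standard product-limit estimator in plain Python. The function name and the input data are illustrative, and patients censored at an event time are treated as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the event-free fraction over time.

    times  -- follow-up time for each patient
    events -- True if the event (here, start of treatment) was observed,
              False if follow-up ended without it (censored)
    Returns a list of (time, event_free_fraction) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Invented follow-up data for five patients.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Running the estimator separately on the low-score and high-score groups would give the two curves drawn in each panel.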