Maud Hw Starmans, Melania Pintilie, Thomas John, Sandy D Der, Frances A Shepherd, Igor Jurisica, Philippe Lambin, Ming-Sound Tsao, Paul C Boutros.
Abstract
BACKGROUND: The advent of personalized medicine requires robust, reproducible biomarkers that indicate which treatment will maximize therapeutic benefit while minimizing side effects and costs. Numerous molecular signatures have been developed over the past decade to fill this need, but their validation and uptake into clinical settings have been poor. Here, we investigate the technical reasons underlying reported failures in biomarker validation for non-small cell lung cancer (NSCLC).
Year: 2012 PMID: 23146350 PMCID: PMC3580418 DOI: 10.1186/gm385
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Figure 1. Validation of three- and six-gene biomarkers. Previously published three-gene [20] and six-gene [13] prognostic biomarkers were validated in the Director's Challenge dataset [21]. (a, b) Each patient was classified into a good (blue curves) or poor (red curves) prognosis group using the three-gene (a) and six-gene (b) biomarker, and the groups were visualized with Kaplan-Meier curves. Hazard ratios and P-values are from stage-adjusted Cox proportional hazards modeling followed by the Wald test.
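As a rough illustration of how a hazard ratio, its confidence interval and a Wald P-value are derived from a fitted Cox coefficient, the sketch below uses only the Python standard library. The function name `wald_summary` and the example coefficient and standard error are hypothetical and are not taken from the study; the stage-adjusted model fitting itself is not shown.

```python
import math
from statistics import NormalDist


def wald_summary(beta, se, level=0.95):
    """Summarize one Cox coefficient (log hazard ratio) with a Wald test.

    beta and se are assumed to come from an already fitted Cox model;
    this function does not fit anything itself.
    """
    std_normal = NormalDist()
    z = beta / se                                   # Wald statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z)))      # two-sided P-value
    z_crit = std_normal.inv_cdf(0.5 + level / 2)    # about 1.96 for 95%
    return {
        "hr": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "z": z,
        "p": p_value,
    }


# Hypothetical values, for illustration only.
example = wald_summary(beta=math.log(2.0), se=0.25)
print(example)
```

The confidence interval is computed on the log scale and then exponentiated, which is why it is asymmetric around the hazard ratio.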
Figure 2. Sub-stage analysis for the three-gene biomarker and power analysis. (a-d) The performance of the three-gene biomarker was evaluated in a sub-stage analysis (stage IA (a), stage IB (b), stage II (c) and stage III (d) patients). Each patient was classified into a good (blue curves) or poor (red curves) prognosis group using the three-gene biomarker, and the groups were displayed with Kaplan-Meier curves. Hazard ratios and P-values are from Cox proportional hazards modeling followed by the Wald test. (e) Subsequently, power calculations (assuming equal-sized groups) were performed at a range of hazard ratios (HRs) for all patients and for patients of specific stages. A threshold line is drawn at a power of 0.8.
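A power calculation of the kind described in panel (e) can be sketched with Schoenfeld's approximation for a two-group comparison with equal-sized groups. The caption does not state which formula the authors used, so the choice of approximation, the function names `cox_power` and `events_needed`, and the event count in the usage loop are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def cox_power(n_events, hazard_ratio, alpha=0.05):
    """Approximate power of a two-sided test at level alpha.

    Uses Schoenfeld's approximation with equal-sized groups, so the
    allocation term p1 * p2 equals 0.25. n_events is the number of
    observed events (deaths), not the number of patients.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    effect = math.sqrt(n_events * 0.25) * abs(math.log(hazard_ratio))
    return NormalDist().cdf(effect - z_alpha)


def events_needed(hazard_ratio, power=0.8, alpha=0.05):
    """Events required to reach the target power (same approximation)."""
    if hazard_ratio == 1:
        raise ValueError("hazard_ratio must differ from 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


# Hypothetical event count; sweep a range of hazard ratios as in panel (e).
for hr in (1.25, 1.5, 2.0, 3.0):
    print(hr, round(cox_power(n_events=60, hazard_ratio=hr), 3))
```

Under this approximation power depends on the number of events rather than the number of patients, which is one reason a sub-stage analysis with few deaths can be underpowered even when the biomarker performs well in the full cohort.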
Figure 3. Pre-processing influences biomarker validation. (a) Results of Cox proportional hazards modeling for the 24 different pre-processing schemes in the Director's Challenge dataset [21] are summarized in forest plots. (b) Classifications under the 24 different schemes are visualized in a heatmap. To confirm that biomarker performance and individual patient classification are highly dependent on dataset pre-processing, all pre-processing schemes were tested in a second dataset [29] for the three-gene biomarker. (c, d) Both biomarker performance (c) and individual patient classifications (d) were again influenced by differences in pre-processing. In the forest plots, boxes and lines are the hazard ratios and 95% confidence intervals, respectively. In the heatmaps, white indicates a patient predicted to have good prognosis and black indicates a patient predicted to have poor prognosis. The colored sidebar displays the different pre-processing schemes as explained in the legends.
Figure 4. Improved biomarker performance by accounting for classification robustness. (a-d) Marker performance improved dramatically when patients with identical classifications across all pre-processing schemes were separated from patients with ambiguous classifications, for both the three-gene biomarker (a versus b) and the six-gene biomarker (c versus d) in the Director's Challenge dataset [21]. In all Kaplan-Meier curves, good prognosis patients are indicated by blue curves and poor prognosis patients by red curves. Hazard ratios and P-values are from stage-adjusted Cox proportional hazards modeling followed by the Wald test.
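The separation of robustly classified patients from ambiguous ones can be sketched as follows. This is a minimal illustration, assuming each patient has one "good" or "poor" call per pre-processing scheme; the function name `split_by_robustness`, the patient identifiers and the calls are hypothetical, and the authors' actual implementation is not given in this record.

```python
def split_by_robustness(calls):
    """Split patients by agreement of their calls across schemes.

    calls maps a patient identifier to a list of classifications, one
    per pre-processing scheme. Returns (robust, ambiguous): robust maps
    each unanimously classified patient to that single label, and
    ambiguous lists patients whose label differs between schemes.
    Patients with no calls at all are treated as ambiguous.
    """
    robust = {}
    ambiguous = []
    for patient, labels in calls.items():
        if len(set(labels)) == 1:
            robust[patient] = labels[0]
        else:
            ambiguous.append(patient)
    return robust, ambiguous


# Hypothetical calls under three schemes (the study used 24).
calls = {
    "P1": ["good", "good", "good"],
    "P2": ["poor", "good", "poor"],
    "P3": ["poor", "poor", "poor"],
}
robust, ambiguous = split_by_robustness(calls)
print(robust)     # {'P1': 'good', 'P3': 'poor'}
print(ambiguous)  # ['P2']
```

Survival analysis would then be run on the robust subset only, with the ambiguous patients reported separately.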