Ming Yi, Ruoqing Zhu, Robert M Stephens.
Abstract
Accurate assessment of the association between continuous variables such as gene expression and survival is a critical aspect of precision medicine. In this report, we review some of the available survival analysis and validation tools by referencing published studies that have used them. We identify pitfalls in the assumptions inherent in those applications that can bias scientific research. To overcome these pitfalls, we have developed a novel method that extends the logrank test to continuous variables and comprehensively evaluates survival association with derived aggregate statistics. This is accomplished by exhaustively considering all the cutpoints across the full expression gradient. Direct side-by-side comparisons, global ROC analysis, and evaluation of the ability to capture relevant biological themes based on current understanding of RAS biology all demonstrated that, compared with the available survival analysis methods on univariate gene expression and with penalized linear model-based methods, the new method shows better consistency between multiple datasets of the same disease, better reproducibility and robustness, and better power to uncover biological relevance within the selected datasets.
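The idea of scanning every cutpoint along the expression gradient can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' GradientScanSurv implementation: the function names (`logrank_chi2`, `scan_cutpoints`), the `min_group` parameter, the toy data, and the "fraction of significant cutpoints" summary are all hypothetical, and any permutation or multiple-testing correction is omitted.

```python
import math

def logrank_chi2(time, event, in_group):
    """Two-group logrank chi-square statistic (1 degree of freedom)."""
    event_times = sorted({t for t, e in zip(time, event) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti in time if ti >= t)                       # at risk, both groups
        n1 = sum(1 for ti, g in zip(time, in_group) if ti >= t and g)  # at risk, group 1
        d = sum(1 for ti, e in zip(time, event) if ti == t and e)  # events at t
        d1 = sum(1 for ti, e, g in zip(time, event, in_group)
                 if ti == t and e and g)                            # events at t, group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return 0.0 if variance == 0 else obs_minus_exp ** 2 / variance

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def scan_cutpoints(expr, time, event, min_group=2):
    """Logrank p-value for every cutpoint that leaves min_group samples per side."""
    results = []
    for cut in sorted(set(expr)):
        high = [x > cut for x in expr]
        n_high = sum(high)
        if n_high < min_group or len(expr) - n_high < min_group:
            continue
        results.append((cut, chi2_sf_1df(logrank_chi2(time, event, high))))
    return results

# Toy data (hypothetical): higher expression goes with shorter survival.
expr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
time = [20, 19, 18, 17, 16, 5, 4, 3, 2, 1]
event = [1] * 10  # 1 = event observed, 0 = censored

results = scan_cutpoints(expr, time, event)
# One possible aggregate statistic (an assumption): share of cutpoints with p <= 0.05.
frac_significant = sum(p <= 0.05 for _, p in results) / len(results)
```

Because every cutpoint is tested, the individual p-values are not independent tests and should not be read at face value; an aggregate over the whole scan is one way to summarize them.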
Year: 2018 PMID: 30517129 PMCID: PMC6281197 DOI: 10.1371/journal.pone.0207590
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Comparison of shared genes identified by selected methods on two independent PAAD datasets.
| TCGAPAAD_Set | AusPanc_Set | Adj.GoodCountPvals | Adj.CORRECTED_P_VALUE | Adj.CoxPvalbyRanks | Adj.COX_P_VALUE | Adj.MedianPvals | Adj.tertPvals |
|---|---|---|---|---|---|---|---|
| trial_1 | trial_1 | BUB1;MCM4 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_1 | trial_3 | BUB1;MCM4 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_1 | trial_4 | BUB1;MCM4 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_2 | trial_1 | BUB1;MCM4 | BUB1;CCNA2;MCM4;EGFR;E2F7;CDK2 | BUB1;CCNA2;MCM4;EGFR;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_2 | trial_3 | BUB1;MCM4 | BUB1;CCNA2;MCM4;EGFR;E2F7;CDK2 | BUB1;CCNA2;MCM4;EGFR;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_2 | trial_4 | BUB1;MCM4 | BUB1;CCNA2;MCM4;EGFR;E2F7;CDK2 | BUB1;CCNA2;MCM4;EGFR;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_3 | trial_1 | BUB1;MCM4 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_3 | trial_3 | BUB1;MCM4 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_3 | trial_4 | BUB1;MCM4 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_4 | trial_1 | BUB1;MCM4 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_4 | trial_3 | BUB1;MCM4 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_4 | trial_4 | BUB1;MCM4 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2 | EGFR;BUB1;CCNA2;MCM4;E2F7;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_5 | trial_1 | BUB1;MCM4 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_5 | trial_3 | BUB1;MCM4 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
| trial_5 | trial_4 | BUB1;MCM4 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2 | EGFR;BUB1;CCNA2;E2F7;MCM4;CDK2;RASA1 | BUB1;CCNA2;MCM4 | CCNA2;MCM4 | |
Genes identified in both the TCGA PAAD data (TCGAPAAD_Set) and the Australian Pancreatic Cancer data (AusPanc_Set) at an adjusted p-value ≤ 0.05 are shown for each of the selected methods. Because the GradientScanSurv method includes a permutation step, it produces different p-values in each trial; all other methods give the same p-values for the same data, and therefore the same shared genes in every trial. Multiple trials of the GradientScanSurv analysis were combined with the results of the other methods to form the trials listed. Only the trials in which the one-tailed Fisher's exact test assessing the significance of overlapping genes was significant for the GradientScanSurv method (Adj.GoodCountPvals; rows in bold in Table K) are listed here. The full table with all trials is Table L. In addition, whenever the NRAS gene was identified by the GradientScanSurv method in the trials (all trials in this table), the overlap of the identified genes between the two datasets was significant for the GradientScanSurv method, which is also highlighted in bold in Table K.
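A one-tailed Fisher's exact test for the overlap between two gene lists can be sketched with the hypergeometric tail probability. This is only an illustration: the function names, the universe size of 200 genes, and the two per-dataset hit lists are hypothetical (the gene symbols are borrowed from the table, but the lists are not the paper's results).

```python
from math import comb

def overlap_pvalue(n_universe, n_a, n_b, n_shared):
    """One-tailed Fisher's exact p-value: P(overlap >= n_shared) by chance.

    n_universe: number of genes tested in both datasets
    n_a, n_b:   number of significant genes in dataset A and dataset B
    n_shared:   number of genes significant in both
    """
    total = comb(n_universe, n_b)
    tail = sum(
        comb(n_a, i) * comb(n_universe - n_a, n_b - i)
        for i in range(n_shared, min(n_a, n_b) + 1)
    )
    return tail / total

def shared_genes(hits_a, hits_b):
    """Genes called significant in both datasets, sorted for stable output."""
    return sorted(set(hits_a) & set(hits_b))

# Hypothetical hit lists for two datasets.
hits_a = ["BUB1", "MCM4", "EGFR"]
hits_b = ["BUB1", "MCM4", "CDK2"]
shared = shared_genes(hits_a, hits_b)
p_overlap = overlap_pvalue(200, len(hits_a), len(hits_b), len(shared))
```

A small `p_overlap` indicates that the two datasets share more significant genes than expected from random draws of the same sizes.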
Comparison of results by GradientScanSurv and Lasso methods on TCGA tumor data (adjusted p ≤ 0.05).
| Types | Common_Genes_Count | GSS_Genes_Count | Lasso_Genes_Count | Common_Genes | GSS_Genes_only | Lasso_Genes_Only |
|---|---|---|---|---|---|---|
| BRCA | 8 | 17 | 28 | CCNA1;ERF;EXOC1;FLT3;ICMT;PLXNB1;RAC2;STK3 | CCND2;IRS2;JUN;PTK2;PTPN11;ROCK2;STK11; | SAV1;RASAL1;EIF4EBP1;PIK3CA;RASAL3;FGFR1;CDKN1A;RHOB;DUSP6;CASP7;FGFR4;RPS6KB1;RPS6KA3;PRKAG1;RPS6KA6;RALGDS;SCRIB;PRKAA2;CASP3;RHEB |
| COAD | 0 | 0 | 0 | | | |
| LUAD | 4 | 6 | 12 | CCNA2; | | E2F7;GRB10;TYMS;RALGDS;YAP1;ALK;PIK3CA;RASSF9 |
| PAAD | 2 | 64 | 2 | EXOC7;MET | | |
| READ | 0 | 1 | 0 | | | |
Comparison of results derived from the GradientScanSurv (GSS) and Lasso methods on TCGA tumor data (adjusted p ≤ 0.05) for the association of survival outcome with the expression of RAS pathway genes in a few selected tumor types that are more likely to involve RAS genes: BRCA, COAD, LUAD, PAAD, and READ. For the GSS gene lists, LUAD, PAAD, and BRCA used results from multiple trials; the genes shown here are those shared by multiple trials: LUAD (4 of 5 trials); PAAD (4 of 5 trials); BRCA (2 of 3 trials). Genes in bold are discussed in the text.
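The three gene columns of the table are a set partition of the two methods' gene lists. A minimal sketch, assuming each method's result is a plain list of gene symbols; the function name is hypothetical, and the inputs are a small subset of the BRCA row used only for illustration:

```python
def partition_gene_lists(gss_genes, lasso_genes):
    """Split two gene lists into common, GSS-only, and Lasso-only genes."""
    gss, lasso = set(gss_genes), set(lasso_genes)
    return {
        "Common_Genes": sorted(gss & lasso),
        "GSS_Genes_only": sorted(gss - lasso),
        "Lasso_Genes_Only": sorted(lasso - gss),
    }

# Illustrative subset of the BRCA row, not the full gene lists.
gss_subset = ["CCNA1", "ERF", "CCND2", "JUN"]
lasso_subset = ["CCNA1", "ERF", "SAV1", "PIK3CA"]
columns = partition_gene_lists(gss_subset, lasso_subset)

# The count columns follow from the partition sizes.
gss_count = len(columns["Common_Genes"]) + len(columns["GSS_Genes_only"])
lasso_count = len(columns["Common_Genes"]) + len(columns["Lasso_Genes_Only"])
```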