Yifan Zhang, William Yang, Dan Li, Jack Y Yang, Renchu Guan, Mary Qu Yang.
Abstract
BACKGROUND: Breast cancer is the most common type of invasive cancer in women. It accounts for approximately 18% of all cancer deaths worldwide. It is well known that somatic mutations play an essential role in cancer development. Hence, we propose that a prognostic prediction model that integrates somatic mutations with gene expression can improve survival prediction for cancer patients and reveal the genetic mutations associated with survival.
Keywords: Breast Cancer; Precision survival prediction; Somatic mutations; Survival analysis; Whole genome-wide expression
Year: 2018 PMID: 30454048 PMCID: PMC6245494 DOI: 10.1186/s12920-018-0419-x
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Survival analysis using different types of features. Kaplan–Meier curves for 330 significant univariate genes (a), 118 significant univariate genes with hazard ratios higher than 1.221 (b), and 88 genes selected by GA (c). d C-index for these three models
Fig. 2 Hazard ratios and the Kaplan–Meier estimator for an independent patient dataset. a The hazard ratios of the 330 significant univariate genes (yellow) and the 88 genes selected by GA (blue). b The Kaplan–Meier curve using 61 of the 88 GA-selected genes for a METABRIC breast cancer patient dataset
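The Kaplan–Meier curves referred to in these captions can be illustrated with a minimal sketch of the product-limit estimator. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study; the paper's own software is not stated here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical toy cohort: four patients, the second one censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients lower the number at risk without producing a step in the curve, which is why the drop at the third time point is larger than the first.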
Fig. 3 Distribution of mutation rates and their correlation with univariate P-values. a The distribution of mutation rates for genes and SNPs in the TCGA breast cancer patients. b The correlation between univariate P-value and mutation rate for the top 35 mutated genes
The top mutated genes in the TCGA 1044 breast cancer dataset
| Gene A | Gene B | A Not B | B Not A | Both | Log Odds Ratio | Adjusted P-value | Tendency |
|---|---|---|---|---|---|---|---|
| | | 338 | 125 | 9 | −2.067 | < 0.001 | Mutual exclusivity |
| | | 336 | 116 | 11 | −1.771 | < 0.001 | Mutual exclusivity |
| | | 336 | 79 | 11 | −1.327 | < 0.001 | Mutual exclusivity |
| | | 265 | 266 | 82 | −0.641 | < 0.001 | Mutual exclusivity |
| | | 284 | 70 | 64 | 0.735 | 0.002 | Co-occurrence |
| | | 264 | 104 | 83 | 0.62 | 0.004 | Co-occurrence |
| | | 296 | 55 | 52 | 0.75 | 0.006 | Co-occurrence |
| | | 154 | 74 | 33 | 0.846 | 0.007 | Co-occurrence |
| | | 303 | 45 | 45 | 0.798 | 0.008 | Co-occurrence |
| | | 146 | 68 | 41 | 1.209 | < 0.001 | Co-occurrence |
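The sign of the log odds ratio determines the tendency column: negative values indicate mutual exclusivity and positive values indicate co-occurrence. The following is a minimal sketch of how such a value can be derived from the pair counts. It assumes a cohort of 1044 patients (taken from the table caption above), a natural logarithm, and a plain 2x2 odds ratio; the function names are hypothetical. The study's exact estimator is not given here, so the output need not reproduce the tabulated values exactly.

```python
import math


def log_odds_ratio(a_not_b, b_not_a, both, total):
    """Natural-log odds ratio for two genes being mutated in the same patient.

    The 'neither' cell is inferred from the cohort size (an assumption here).
    """
    neither = total - a_not_b - b_not_a - both
    if min(a_not_b, b_not_a, both, neither) <= 0:
        raise ValueError("all four cells of the 2x2 table must be positive")
    return math.log((both * neither) / (a_not_b * b_not_a))


def tendency(log_or):
    """Label a gene pair by the sign of its log odds ratio."""
    return "Co-occurrence" if log_or > 0 else "Mutual exclusivity"


# Counts from the first and fifth rows of the table; total=1044 is assumed.
first_row = log_odds_ratio(338, 125, 9, total=1044)
fifth_row = log_odds_ratio(284, 70, 64, total=1044)
labels = (tendency(first_row), tendency(fifth_row))
```

Under these assumptions the first row comes out strongly negative and the fifth row positive, agreeing in sign with the table.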
Fig. 4 Mutations of different SNP types. a The frequencies of different SNP types in the high- and low-risk groups. b The Kaplan–Meier curve using C → G and G → C SNP frequencies combined with the expression of 118 genes as indicators
C-index of the Cox proportional hazards models based on different features
| Feature type | Features | C-index | P-value |
|---|---|---|---|
| Gene expression | 330 significant univariate genes | 0.584 | 0.0011 |
| | 118 significant univariate genes with hazard ratio > 1.221 | 0.636 | < 0.0001 |
| | 88 significant univariate genes selected by GA | 0.656 | < 0.0001 |
| Somatic mutation | 25 functional annotations | 0.603 | 0.0012 |
| | 15 gene ontology terms | 0.567 | 0.0013 |
| | 14 pathways | 0.548 | 0.0037 |
| | 54 functional gene sets combining functional annotations, GO terms and pathways | 0.591 | 0.0016 |
| Gene expression & somatic mutation | All 142 features (88 significant univariate genes and 54 functional gene sets) | 0.658 | < 0.0001 |
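The C-index in this table measures how often a model ranks patients in the correct order of risk: 0.5 corresponds to random ordering and 1.0 to perfect ordering. The sketch below computes Harrell's concordance index in its common form, which is assumed here and may differ in detail from the variant used in the study. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when the model gives patient i the higher risk score; ties count half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: the third patient is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
perfect = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
reversed_order = concordance_index(toy_times, toy_events, [0.1, 0.5, 0.7, 0.9])
uninformative = concordance_index(toy_times, toy_events, [1.0, 1.0, 1.0, 1.0])
```

The double loop is quadratic in the number of patients, which is acceptable for a sketch but slow for large cohorts.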
Fig. 5 The 14 pathways enriched in differentially mutated genes. The 14 pathways were significantly enriched in the top 2000 differentially mutated genes with higher MRDS (blue bar). These pathways were also significant in the Cox univariate regression test (yellow bar)