Min Zhang, Shaojun Zhang, Yanhua Wen, Yihan Wang, Yanjun Wei, Hongbo Liu, Dongwei Zhang, Jianzhong Su, Fang Wang, Yan Zhang.
Abstract
Breast cancer has various molecular subtypes and displays high heterogeneity. Aberrant DNA methylation is involved in tumor origin, development and progression. Moreover, distinct DNA methylation patterns are associated with specific breast cancer subtypes. We explored DNA methylation patterns in association with gene expression to assess their impact on the prognosis of breast cancer, based on Infinium 450K arrays (training set) from The Cancer Genome Atlas (TCGA). The DNA methylation patterns of 12 featured genes that were highly correlated with gene expression were identified through univariate and multivariable Cox proportional hazards models and used to define the methylation risk score (MRS). The DNA methylation pattern of the 12 featured genes distinguished patient survival (p = 0.00103) better than the average methylation levels (p = 0.956) or gene expression (p = 0.909). Furthermore, the MRS provided good prognostic value for breast cancers even when the patients had the same receptor status. We found that ER-, PR- or Her2- samples with high MRS had the worst 5-year survival rate and overall survival time. An independent test set of 28 patients with death as an outcome was used to test the validity of the MRS of the 12 featured genes; this analysis yielded a prognostic value equivalent to that of the training set. The predictive power was further validated using two independent datasets from the GEO database. The DNA methylation pattern is a powerful predictor of breast cancer survival and can predict outcomes within the same breast cancer molecular subtype.
Year: 2015 PMID: 26550991 PMCID: PMC4638352 DOI: 10.1371/journal.pone.0142279
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Multivariate Cox proportional hazard model of risk gene set.
| Gene | Coef | HR | p value | 95% CI |
|---|---|---|---|---|
| | -54.0 | 3.60×10^-24 | 0.002025 | [4.67×10^-39, 2.78×10^-9] |
| | -45.9 | 1.23×10^-20 | 0.002271 | |
| | -42.5 | 3.35×10^-19 | 0.008188 | |
| | -41.2 | 1.24×10^-18 | 0.000633 | |
| | -40.6 | 2.45×10^-18 | 0.000455 | |
| | -34.2 | 1.36×10^-15 | 0.001741 | |
| | -29.2 | 2.06×10^-13 | 0.002975 | |
| | 27.9 | 1.32×10^12 | 0.009686 | |
| | 30.5 | 1.83×10^13 | 0.00128 | |
| | 51.4 | 2.03×10^22 | 0.000563 | |
| | 51.6 | 2.52×10^22 | 0.008259 | |
| | 53.3 | 1.44×10^23 | 0.004414 | |
Abbreviations: Coef: Cox proportional hazards model regression coefficient; HR: hazard ratio; CI: confidence interval; p value: Cox regression model p value.
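The Coef and HR columns are linked by HR = exp(Coef), which holds for any Cox model, so the table values can be cross-checked directly. The sketch below does that for all 12 rows. It also checks the first row's p value against its 95% CI, under the assumption (not stated in this record) that the interval is a Wald interval of the form exp(Coef ± 1.96·SE) and that the p value comes from the matching two-sided Wald test.

```python
import math

# (coef, HR, p value) rows copied from the table above, in table order.
rows = [
    (-54.0, 3.60e-24, 0.002025),
    (-45.9, 1.23e-20, 0.002271),
    (-42.5, 3.35e-19, 0.008188),
    (-41.2, 1.24e-18, 0.000633),
    (-40.6, 2.45e-18, 0.000455),
    (-34.2, 1.36e-15, 0.001741),
    (-29.2, 2.06e-13, 0.002975),
    (27.9, 1.32e12, 0.009686),
    (30.5, 1.83e13, 0.00128),
    (51.4, 2.03e22, 0.000563),
    (51.6, 2.52e22, 0.008259),
    (53.3, 1.44e23, 0.004414),
]

for coef, hr, _p in rows:
    # HR = exp(coef); small gaps reflect rounding of coef to one decimal.
    print(f"coef={coef:6.1f}  exp(coef)={math.exp(coef):.2e}  table HR={hr:.2e}")

# First row only: the table lists a 95% CI for the HR.
# Assumption: Wald interval, so SE can be recovered from its width on the log scale.
ci_low, ci_high = 4.67e-39, 2.78e-9
z975 = 1.959964  # 97.5th percentile of the standard normal
se = (math.log(ci_high) - math.log(ci_low)) / (2 * z975)
z = rows[0][0] / se
p_wald = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
print(f"SE={se:.2f}  z={z:.3f}  Wald p={p_wald:.6f}  table p={rows[0][2]}")
```

The very large and very small HR values follow from the size of the coefficients. They would be expected if the covariates are methylation values confined to a narrow range such as [0, 1], although the record does not state the covariate scale.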
Fig 1. Kaplan-Meier survival analysis of overall survival of 209 breast cancer patients based on the featured genes.
(A) MRS. (B) Average DNA methylation levels. (C) Average gene expression levels.
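This record does not give the MRS formula or the cutoff used to form the high-MRS and low-MRS groups. A common construction for Cox-derived risk scores is a coefficient-weighted sum of the per-gene values, dichotomized at the cohort median. The sketch below illustrates that assumed construction only. The function names `risk_score` and `split_by_median`, the patient IDs, and the methylation values are all hypothetical.

```python
from statistics import median

# Coefficients from the multivariate Cox table above. Gene names are not
# available in this record, so list positions stand in for the 12 genes.
COEFS = [-54.0, -45.9, -42.5, -41.2, -40.6, -34.2, -29.2,
         27.9, 30.5, 51.4, 51.6, 53.3]

def risk_score(values, coefs=COEFS):
    """Assumed form: sum of coef_i * value_i over the featured genes."""
    if len(values) != len(coefs):
        raise ValueError("expected one value per featured gene")
    return sum(c * v for c, v in zip(coefs, values))

def split_by_median(scores):
    """Label a score 'high' if it exceeds the cohort median, else 'low' (assumed cutoff)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Made-up methylation values in [0, 1] for three hypothetical patients.
patients = {
    "P1": [0.9] * 7 + [0.1] * 5,
    "P2": [0.5] * 12,
    "P3": [0.1] * 7 + [0.9] * 5,
}
scores = {pid: risk_score(vals) for pid, vals in patients.items()}
groups = dict(zip(scores, split_by_median(list(scores.values()))))
print(scores)
print(groups)
```

Under this assumed form, higher values for genes with negative coefficients lower the score, and higher values for genes with positive coefficients raise it.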
Fig 2. Kaplan-Meier survival analysis of overall survival by receptor status.
(A) Survival comparison between ER+ and ER- patients. (B) Survival comparison between PR+ and PR- patients. (C) Survival comparison between Her2+ and Her2- patients. (D) Survival comparison by combination of ER status and MRS. (E) Survival comparison by combination of PR status and MRS. (F) Survival comparison by combination of Her2 status and MRS. (G) Survival comparison between high-MRS and low-MRS groups among ER+ patients. (H) Survival comparison between high-MRS and low-MRS groups among PR+ patients. (I) Survival comparison between high-MRS and low-MRS groups among Her2+ patients. (J) Survival comparison between high-MRS and low-MRS groups among ER- patients. (K) Survival comparison between high-MRS and low-MRS groups among PR- patients. (L) Survival comparison between high-MRS and low-MRS groups among Her2- patients.
Fig 3. Kaplan-Meier survival analysis of overall survival in the 28 patients with death as the outcome.
(A) Survival comparison between high-MRS and low-MRS groups. (B) Survival comparison between ER+ and ER- patients. (C) Survival comparison between PR+ and PR- patients. (D) Survival comparison between Her2+ and Her2- patients.
Fig 4. Kaplan-Meier survival analysis of overall survival in two independent datasets from the GEO database.
(A) GSE37754 from 450K arrays. (B) GSE20712 from 27K arrays.
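The figures above all rest on the Kaplan-Meier estimator, which multiplies, at each observed death time, the fraction of at-risk patients who survive that time. A minimal sketch of the product-limit calculation follows. It is an illustration, not the software used in the study, and the follow-up data are made up.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death occurred.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: survive this time with prob 1 - d/n.
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve

# Made-up follow-up data (months), not taken from the study.
times = [5, 8, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>3}  S(t)={s:.4f}")
```

Censored patients reduce the number at risk without producing a step in the curve, which is why the estimate changes only at times 5, 8, 12 and 24 in this example.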