Na Li1,2, Hao Cai3, Kai Song4, You Guo3, Qirui Liang2, Jiahui Zhang2, Rou Chen5, Jing Li2, Xianlong Wang2, Zheng Guo2.
Abstract
About 20-30% of early-stage breast cancer patients suffer relapses after surgery. Many signatures have been reported to identify such high-risk patients, but they lack robustness when applied to data measured on different platforms. Here, we developed a signature that is robust across multiple profiling platforms and, with its aid, identified reproducible omics features characterizing metastasis of estrogen receptor (ER)-positive breast cancer from the Gene Expression Omnibus database. Based on stable within-sample relative expression orderings (REOs), we constructed a signature consisting of five gene pairs, named 5-GPS, whose REOs were significantly correlated with relapse-free survival in the univariate Cox regression model. Using 5-GPS, patients were classified into low-risk and high-risk groups. Patients in the high-risk group had worse survival than those in the low-risk group according to Kaplan-Meier curve analysis with the log-rank test. Applying 5-GPS to the RNA-sequencing data of stage I-IV breast cancer samples archived in The Cancer Genome Atlas (TCGA), we found that the proportion of high-risk patients increases with stage. The proposed REO-based signature shows potential for identifying early-stage ER+ breast cancer patients at high risk of relapse after surgery.
Keywords: ER+ breast cancer; gene expression; micro-metastasis; prognosis signature; relapse risk
Year: 2020 PMID: 33193655 PMCID: PMC7658391 DOI: 10.3389/fgene.2020.566928
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Description of ER+ breast cancer tissue datasets used in this study.
| | | Dataset | Platform | | LN- |
|---|---|---|---|---|---|
| Discovery cohort | Significantly stable gene pairs | GSE19615 | Affymetrix GPL570 | 21 | 19 |
| | | GSE43365 | Affymetrix GPL570 | 6 | 52 |
| | | GSE31448 | Affymetrix GPL570 | 78 | 30 |
| | | EGAS00000000083 | Illumina GPL6947 | 99 | 184 |
| | Prognosis gene pairs | GSE7390 | Affymetrix GPL96 | / | 134 |
| | | GSE6532 | Affymetrix GPL96 | / | 85 |
| Validation cohort | Validation | GSE2034 | Affymetrix GPL96 | / | 209 |
| | | GSE4922 | Affymetrix GPL96 | / | 116 |

| TCGA data | Stage I | Stage II | Stage III | Stage IV |
|---|---|---|---|---|
| | 126 | 350 | 142 | 15 |
FIGURE 1. Flowchart of the processes for developing and validating the prognosis signature.
Gene information for the five gene pairs of 5-GPS.
| Gene ID | Symbol | Name | Gene ID | Symbol | Name |
|---|---|---|---|---|---|
| 10902 | BRD8 | bromodomain containing 8 | 10112 | KIF20A | kinesin family member 20A |
| 51257 | MARCH2 | membrane associated ring-CH-type finger 2 | 3221 | HOXC4 | homeobox C4 |
| 9014 | TAF1B | TATA-box binding protein associated factor, RNA polymerase I subunit B | 9518 | GDF15 | growth differentiation factor 15 |
| 123872 | DNAAF1 | dynein axonemal assembly factor 1 | 55176 | SEC61A2 | SEC61 translocon subunit alpha 2 |
| 64421 | DCLRE1C | DNA cross-link repair 1C | 10024 | TROAP | trophinin associated protein |
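To make the idea of within-sample relative expression orderings concrete, the following is a minimal Python sketch of how a gene-pair signature could score a single sample. It rests on several assumptions that are not stated in the source: the direction of each pair (here "first gene lower than second gene" counts as a high-risk vote), the majority-vote threshold of 3 out of 5, the function names, and the expression values, which are invented for illustration only.

```python
import math
from typing import Dict, List, Tuple

# The five gene pairs as listed in the table above (first gene, second gene).
GENE_PAIRS: List[Tuple[str, str]] = [
    ("BRD8", "KIF20A"),
    ("MARCH2", "HOXC4"),
    ("TAF1B", "GDF15"),
    ("DNAAF1", "SEC61A2"),
    ("DCLRE1C", "TROAP"),
]


def reo_votes(expr: Dict[str, float]) -> int:
    """Count pairs whose within-sample ordering is first < second.

    ASSUMPTION: "first < second" is treated as the high-risk ordering for
    every pair; the source does not state the direction.
    """
    return sum(1 for a, b in GENE_PAIRS if expr[a] < expr[b])


def classify(expr: Dict[str, float], min_votes: int = 3) -> str:
    """ASSUMPTION: a hypothetical majority-vote rule (3 of 5 pairs)."""
    return "high-risk" if reo_votes(expr) >= min_votes else "low-risk"


# Hypothetical expression values for one sample (not study data).
sample = {
    "BRD8": 8.1, "KIF20A": 9.4,
    "MARCH2": 7.2, "HOXC4": 6.5,
    "TAF1B": 5.9, "GDF15": 10.3,
    "DNAAF1": 4.8, "SEC61A2": 6.1,
    "DCLRE1C": 7.7, "TROAP": 7.0,
}

# A strictly increasing transform (here log2) leaves every ordering intact.
rescaled = {gene: math.log2(value) for gene, value in sample.items()}

print(reo_votes(sample), classify(sample))
print(reo_votes(rescaled), classify(rescaled))
```

Because the rule only compares two genes inside the same sample, any strictly increasing rescaling of that sample's measurements leaves the vote count unchanged, which is one intuition for why REO-based signatures can transfer across profiling platforms.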
FIGURE 2. The predictive performance of the 5-GPS signature. Kaplan-Meier curves of RFS for early-stage breast cancer patients who received surgery only in (A) the discovery cohort and (C,E) the validation cohorts. (B,D,F) The ROC curves for 5-GPS. (G) The proportion of high-risk samples across stages I-IV.
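For readers unfamiliar with how a Kaplan-Meier curve is built, here is a small self-contained Python sketch of the product-limit estimator. It is an illustration only: the follow-up times and event flags are invented, the function name is hypothetical, and the log-rank comparison between risk groups used in the study is not reproduced.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if a relapse was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            leaving += 1
            i += 1
        if relapses > 0:
            survival *= 1.0 - relapses / n_at_risk
            curve.append((t, survival))
        # Both relapsed and censored subjects leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up times (e.g., in years) and relapse indicators.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

With these toy inputs the estimated relapse-free fraction steps down at times 2, 3, and 5; censored subjects reduce the number at risk without producing a step.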
Univariate and multivariate Cox regression analysis.
| Variable | HR (95% CI) | P-value | HR (95% CI) | P-value |
|---|---|---|---|---|
| 5-GPS | 4.89 (3.05–7.86) | 4.99E-11 | 5.00 (3.16–7.88) | 3.09E-14 |
| Age (>55 vs. <55) | 0.99 (0.98–1.00) | 0.16 | 1.00 (0.99–1.01) | 0.53 |
| Grade (3 vs. 2 vs. 1) | 0.85 (0.57–1.27) | 0.43 | 1.24 (0.85–1.80) | 0.27 |
| Size (>2 vs. <2 cm) | 1.35 (1.00–1.83) | 0.05 | 1.22 (0.96–1.55) | 0.10 |
| 5-GPS | 1.98 (0.96–4.08) | 0.06 | 2.08 (1.04–4.19) | 0.04 |
| Age (>55 vs. <55) | 0.90 (0.42–1.92) | 0.78 | 0.97 (0.46–2.04) | 0.93 |
| Grade (3 vs. 2 vs. 1) | 1.61 (0.91–2.84) | 0.10 | 1.73 (1.00–3.01) | 0.05 |
| Size (>2 vs. <2 cm) | 2.20 (1.08–4.47) | 0.03 | 2.70 (1.36–5.36) | 0.003 |
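The relationship between a ratio estimate, its 95% interval, and its P-value in a Cox table can be checked with a short calculation. The Python sketch below assumes the entries are hazard ratios with Wald-type intervals that are symmetric on the log scale; the source does not state how the intervals or P-values were computed, so this is an approximate consistency check rather than a reproduction of the analysis. The function name is hypothetical.

```python
import math
from typing import Tuple


def wald_p_from_ci(hr: float, lower: float, upper: float,
                   z_crit: float = 1.959964) -> Tuple[float, float]:
    """Approximate the Wald z statistic and two-sided P-value from HR and 95% CI.

    ASSUMPTION: the interval is exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Two 5-GPS entries taken from the table above.
z1, p1 = wald_p_from_ci(4.89, 3.05, 7.86)
z2, p2 = wald_p_from_ci(2.08, 1.04, 4.19)
print(f"z={z1:.2f}, p={p1:.2e}")
print(f"z={z2:.2f}, p={p2:.3f}")
```

Under this assumption, the entry 4.89 (3.05–7.86) yields a P-value on the order of 10^-11 and the entry 2.08 (1.04–4.19) yields a P-value close to 0.04, both in line with the values reported in the table. Not every row needs to agree this closely, since rounding of the interval bounds and the choice of test statistic both affect the result.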
FIGURE 3. Genomic characteristics of the high- and low-risk groups. (A) The copy region frequencies of the high- and low-risk groups. (B) The mutation frequencies of the high- and low-risk groups.