Jie Sun, Xihai Chen, Zhenzhen Wang, Maoni Guo, Hongbo Shi, Xiaojun Wang, Liang Cheng, Meng Zhou.
Abstract
Long non-coding RNAs (lncRNAs) have been implicated in a variety of biological processes, and dysregulated lncRNAs have shown potential as biomarkers and therapeutic targets for cancer prognosis and treatment. In this study, by repurposing microarray probes, we analyzed the lncRNA expression profiles of 916 breast cancer patients from the Gene Expression Omnibus (GEO). Using the Cox proportional hazards regression model, nine lncRNAs were identified as significantly associated with metastasis-free survival (MFS) in the training dataset of 254 patients. These nine lncRNAs were then combined into a single prognostic signature for predicting metastatic risk, which classified patients in the training dataset into high- and low-risk subgroups with significantly different MFS (median 2.4 years versus 3.0 years, log-rank test p < 0.001). The nine-lncRNA signature was similarly effective for prognosis in a testing dataset and two independent datasets. Further analysis showed that the predictive ability of the signature was independent of clinical variables, including age, ER status, ESR1 status and ERBB2 status. Our results indicate that the lncRNA signature could be a useful prognostic marker for predicting metastatic risk in breast cancer patients and may improve our understanding of the molecular mechanisms underlying breast cancer metastasis.
Year: 2015 PMID: 26549855 PMCID: PMC4637883 DOI: 10.1038/srep16553
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
lncRNAs significantly associated with the MFS of breast cancer patients in the training set (n = 254).
| Gene ID | Gene symbol | Chromosome (GRCh38) | P value | Hazard ratio | Coefficient |
|---|---|---|---|---|---|
| ENSG00000271894.1 | RP11-482H16.1 | Chr2:56,147,630–56,386,171(+) | 1.40E-03 | 1.788 | 0.581 |
| ENSG00000242540.2 | AC010729.1 | Chr2:5,696,220–5,708,095(+) | 1.52E-03 | 1.373 | 0.317 |
| ENSG00000257337.4 | RP11-983P16.4 | Chr12:53,014,596–53,054,438(−) | 1.89E-03 | 0.663 | −0.411 |
| ENSG00000230798.3 | FOXD3-AS1 | Chr1:63,320,884–63,324,441(−) | 8.02E-04 | 0.645 | −0.439 |
| ENSG00000231532.3 | LINC01249 | Chr2:4,628,222–4,656,215(−) | 5.77E-04 | 0.643 | −0.441 |
| ENSG00000225057.2 | AC096574.4 | Chr2:238,231,684–238,255,633(+) | 1.09E-03 | 0.643 | −0.441 |
| ENSG00000228363.2 | AC015971.2 | Chr2:86,562,070–86,618,766(+) | 5.82E-04 | 0.633 | −0.456 |
| ENSG00000214184.3 | AC012487.2 | Chr2:108,507,515–108,534,196(−) | 6.76E-04 | 0.625 | −0.470 |
| ENSG00000267191.1 | RP11-15A1.2 | Chr19:43,902,001–43,926,545(+) | 3.40E-04 | 0.565 | −0.570 |
a. Derived from the univariable Cox proportional hazards regression analysis in the training set.
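In a Cox proportional hazards model, the hazard ratio is the exponential of the regression coefficient, so the last two columns of the table can be checked against each other. The following Python sketch recomputes each hazard ratio from its coefficient. The values are copied from the table; the names `SIGNATURE` and `hazard_ratio` are illustrative only and do not come from the paper.

```python
import math

# (coefficient, reported hazard ratio) copied from the table above.
SIGNATURE = {
    "RP11-482H16.1": (0.581, 1.788),
    "AC010729.1": (0.317, 1.373),
    "RP11-983P16.4": (-0.411, 0.663),
    "FOXD3-AS1": (-0.439, 0.645),
    "LINC01249": (-0.441, 0.643),
    "AC096574.4": (-0.441, 0.643),
    "AC015971.2": (-0.456, 0.633),
    "AC012487.2": (-0.470, 0.625),
    "RP11-15A1.2": (-0.570, 0.565),
}


def hazard_ratio(coefficient):
    """Hazard ratio implied by a Cox regression coefficient: HR = exp(beta)."""
    return math.exp(coefficient)


for symbol, (coef, reported_hr) in SIGNATURE.items():
    recomputed = hazard_ratio(coef)
    # Both columns are rounded to three decimals, so allow a small tolerance.
    status = "ok" if abs(recomputed - reported_hr) < 0.002 else "MISMATCH"
    print(f"{symbol:15s} coef={coef:+.3f} HR={recomputed:.3f} reported={reported_hr:.3f} {status}")
```

A positive coefficient (hazard ratio above 1) marks an lncRNA whose higher expression is associated with shorter MFS, and a negative coefficient (hazard ratio below 1) marks one associated with longer MFS.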
Univariate analysis of the lncRNA signature for MFS.
| Variables | HR | 95% CI of HR | P value |
|---|---|---|---|
| Training dataset | | | |
| lncRNA risk score (low/high) | 2.993 | 1.728–5.184 | 9.15E-05 |
| Age | 1.021 | 0.996–1.045 | 0.095 |
| ESR1 | 0.486 | 0.294–0.803 | 0.005 |
| ERBB2 | 1.509 | 0.605–3.768 | 0.378 |
| ER | 0.427 | 0.257–0.711 | 0.001 |
| Testing dataset | | | |
| lncRNA risk score (low/high) | 2.794 | 1.517–5.148 | 0.001 |
| Age | 0.975 | 0.949–1.002 | 0.066 |
| ESR1 | 0.207 | 0.110–0.392 | 1.27E-06 |
| ERBB2 | 1.598 | 0.633–4.038 | 0.321 |
| ER | 0.271 | 0.150–0.490 | 1.59E-05 |
| Entire GSE25066 dataset | | | |
| lncRNA risk score (low/high) | 2.908 | 1.934–4.372 | 2.90E-07 |
| Age | 0.998 | 0.981–1.016 | 0.860 |
| ESR1 | 0.336 | 0.228–0.496 | 3.79E-08 |
| ERBB2 | 1.522 | 0.794–2.916 | 0.206 |
| ER | 0.344 | 0.234–0.507 | 6.07E-08 |
| GSE4922 dataset | | | |
| lncRNA risk score (low/high) | 1.584 | 1.043–2.404 | 0.031 |
| Age | 0.997 | 0.982–1.013 | 0.722 |
| ER | 0.858 | 0.467–1.578 | 0.623 |
| GSE1456 dataset | | | |
| lncRNA risk score (low/high) | 2.257 | 1.198–4.250 | 0.012 |
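The hazard ratio, its 95% confidence interval and the P value in each row are linked on the log scale. The Python sketch below recovers the standard error of log(HR) from the interval width and derives a P value from it. It assumes the reported P values are two-sided Wald tests, which the table does not state, and the function name `wald_p_from_ci` is hypothetical.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE, z and a two-sided Wald P value from an HR and its 95% CI.

    Assumes the interval is symmetric on the log scale:
    log(HR) +/- z_crit * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# lncRNA risk score row of the training dataset: HR 2.993, 95% CI 1.728-5.184.
se, z, p = wald_p_from_ci(2.993, 1.728, 5.184)
print(f"SE={se:.3f}  z={z:.2f}  p={p:.2e}")
```

Under this assumption the recomputed P value for the training-set risk score lands close to the reported 9.15E-05. Rows with heavily rounded bounds will agree less precisely.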
Figure 1. Establishment and performance evaluation of the nine-lncRNA signature for MFS of breast cancer patients in the training dataset.
(A) The ROC curves for MFS prediction by the nine-lncRNA signature in the training dataset. (B) Kaplan-Meier analysis for MFS of breast cancer patients using the nine-lncRNA signature in the training dataset. (C) The distribution of the metastasis risk score, patients’ metastasis status and lncRNA expression in the training dataset.
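The risk score shown in these panels can be sketched in Python as follows. This section does not give the formula, so the sketch assumes a common construction: a linear combination of expression values weighted by the univariable Cox coefficients, with the cohort median as the cutoff between high- and low-risk groups. Both choices are assumptions here, and the function names are hypothetical.

```python
from statistics import median

# Univariable Cox coefficients copied from the lncRNA table.
COEFFICIENTS = {
    "RP11-482H16.1": 0.581,
    "AC010729.1": 0.317,
    "RP11-983P16.4": -0.411,
    "FOXD3-AS1": -0.439,
    "LINC01249": -0.441,
    "AC096574.4": -0.441,
    "AC015971.2": -0.456,
    "AC012487.2": -0.470,
    "RP11-15A1.2": -0.570,
}


def risk_score(expression):
    """Weighted sum of expression values (assumed form of the risk score).

    `expression` maps each gene symbol to one patient's expression value.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def classify(scores):
    """Split patients at the cohort median (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Placeholder profile (not real data): every lncRNA at expression 1.0.
flat_profile = {gene: 1.0 for gene in COEFFICIENTS}
print(round(risk_score(flat_profile), 3))
print(classify([0.1, 0.5, 0.9, 1.3]))
```

With a weighted sum of this kind, higher expression of the two lncRNAs with positive coefficients raises the score, while higher expression of the other seven lowers it.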
Figure 2. Performance evaluation of the nine-lncRNA signature for MFS of breast cancer patients in the testing dataset and entire GSE25066 dataset.
(A) Kaplan-Meier curves for patients in the testing dataset (n = 254). (B) Kaplan-Meier curves for patients in the entire GSE25066 dataset (n = 508). A two-sided log-rank test was used to test the difference in MFS between the high-risk and low-risk groups. The number of patients at risk is listed below the survival curves. (C) The distribution of the metastasis risk score, patients’ metastasis status and lncRNA expression in the testing dataset. (D) The distribution of the metastasis risk score, patients’ metastasis status and lncRNA expression in the entire GSE25066 dataset.
Figure 3. The nine-lncRNA signature-focused risk score in predicting MFS of two independent datasets.
Differences in MFS were assessed between high-risk and low-risk groups for the GSE4922 dataset (n = 249) (A) and the GSE1456 dataset (n = 159) (B). All p values of the Kaplan-Meier analysis were calculated using a two-sided log-rank test. The number of patients at risk is shown below the survival curves. The nine-lncRNA risk score distribution, patients’ metastasis status and a heatmap of the nine lncRNA expression profiles are shown for the GSE4922 dataset (C) and the GSE1456 dataset (D).
Multivariate analysis of the lncRNA signature for MFS.
| Variables | HR | 95% CI of HR | P value |
|---|---|---|---|
| GSE25066 dataset | | | |
| lncRNA risk score (low/high) | 1.791 | 1.105–2.904 | 0.018 |
| Age | 1.001 | 0.983–1.020 | 0.883 |
| ESR1 | 0.721 | 0.387–1.344 | 0.304 |
| ERBB2 | 1.981 | 1.013–3.874 | 0.046 |
| ER | 0.529 | 0.295–0.951 | 0.033 |
| GSE4922 dataset | | | |
| lncRNA risk score (low/high) | 1.598 | 1.018–2.508 | 0.042 |
| Age | 1.001 | 0.985–1.017 | 0.927 |
| ER | 1.063 | 0.561–2.015 | 0.851 |
Figure 4. Kaplan-Meier analysis for MFS of breast cancer patients using the nine-lncRNA signature in the subgroups stratified by ER status.
(A) Kaplan-Meier curves for breast cancer patients with ER-negative status (n = 239). (B) Kaplan-Meier curves for breast cancer patients with ER-positive status (n = 508). The differences between the two curves were assessed by the two-sided log-rank test. The number of patients at risk is listed below the survival curves.
Figure 5. Functional enrichment map of the protein-coding genes co-expressed with prognostic lncRNAs.
Enrichment analysis of the protein-coding genes positively correlated with the prognostic lncRNAs. Each node represents a GO term, and each edge indicates genes shared between the two connected GO terms. Node size represents the number of genes in the GO term. Color intensity is proportional to enrichment significance. The main functional annotations are marked for each cluster of GO terms.