Qingchao Qiu, Pengcheng Lu, Yuzhu Xiang, Yu Shyr, Xi Chen, Brian David Lehmann, Daniel Joseph Viox, Alfred L George, Yajun Yi.
Abstract
BACKGROUND: Robust transcriptional signatures in cancer can be identified by data similarity-driven meta-analysis of gene expression profiles. An unbiased data integration and interrogation strategy has not previously been available.
Year: 2013 PMID: 23383020 PMCID: PMC3558433 DOI: 10.1371/journal.pone.0054979
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Signature clustering process for identification of BRmet50.
The workflow of the iterative EXALT method includes three major processes.

(1) Extraction of 633 breast cancer signatures. Within each breast cancer dataset (n = 223), all paired sample groups were compared based on all possible clinical and pathologic covariates, such as tumor size, nodal involvement, grade, marker status, lymphovascular invasion, relapse, metastasis, p53 status, and BRCA1 and BRCA2 mutations. Student’s t-test was then performed for all pairwise comparisons, and a total of 633 breast cancer signatures were generated and uploaded into a database (HuCaSigDB).

(2) Signature clusters and classification. An iterative search was carried out using each of the 633 signatures as a query (anchored or seed) signature against HuCaSigDB to identify homologous signatures with significant data similarity as defined by EXALT. Of the 633 query signatures, 121 found at least one similar signature in HuCaSigDB and formed 121 clusters, while the remaining 512 (singletons) failed to generate clusters. Two typical results are depicted schematically and labeled with their anchored signatures: the singleton Sig21 and the cluster Sig24, which includes 11 signature members such as Sig544, Sig128, and Sig140. Knowledge-based analysis of signature phenotypes and sizes was performed among the 121 signature clusters. Eight clusters had obvious metastasis phenotypes. Of these eight, the largest cluster, anchored by the query signature Sig24, was selected for further analysis.

(3) Identification of meta-signature BRmet50. All 6,526 signature genes from the 11 signatures of the cluster Sig24 were assembled to form a synthetic signature (BRmet). The genes within BRmet were ranked by recurrence frequency and concordance of differential expression, represented by a meta-heat map. The top 50 genes (BRmet50), shown in rows, were determined by 100% recurrence frequency and gene expression profile concordance among the 11 signatures shown in columns.
The colors in the meta-heat map represent the direction of differential gene expression within a given transcriptional profile (red for up, green for down, and black for a missing match). Color intensity reflects the confidence levels of differential expression.
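The ranking in step (3) can be illustrated with a small sketch. This is not the EXALT implementation: the function name `rank_meta_signature`, the `+1`/`-1` direction encoding, and the toy signatures are assumptions made here for illustration. Recurrence is taken as the fraction of signatures that contain a gene, and concordance as the fraction of those signatures that agree on its direction.

```python
from collections import defaultdict


def rank_meta_signature(signatures, top_n=50):
    """Rank genes by recurrence, then by concordance of direction.

    signatures: dict mapping signature ID -> {gene: +1 (up) or -1 (down)}.
    Returns a list of (gene, recurrence, concordance) tuples.
    """
    n_sigs = len(signatures)
    ups = defaultdict(int)
    downs = defaultdict(int)
    for gene_dirs in signatures.values():
        for gene, direction in gene_dirs.items():
            if direction > 0:
                ups[gene] += 1
            elif direction < 0:
                downs[gene] += 1

    rows = []
    for gene in set(ups) | set(downs):
        present = ups[gene] + downs[gene]
        recurrence = present / n_sigs                       # share of signatures containing the gene
        concordance = max(ups[gene], downs[gene]) / present  # share agreeing on direction
        rows.append((gene, recurrence, concordance))

    # Highest recurrence first, then highest concordance; gene name breaks ties.
    rows.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return rows[:top_n]


# Hypothetical toy cluster of three signatures.
toy_cluster = {
    "SigA": {"G1": 1, "G2": -1, "G3": 1},
    "SigB": {"G1": 1, "G2": -1, "G3": -1},
    "SigC": {"G1": 1, "G2": -1},
}
top_genes = rank_meta_signature(toy_cluster, top_n=2)
```

In this toy cluster, G1 and G2 appear in every signature with a consistent direction, which corresponds to the 100% recurrence and concordance criterion described for BRmet50; G3 is both less recurrent and discordant.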
Members of breast cancer metastatic signature (BRmet50).
| Signature ID | Signature Name | BRid |
| --- | --- | --- |
| Sig544 | without metastasis vs with metastasis | BR544 |
| Sig2411 | without metastasis vs with metastasis | BR2411 |
| Sig1405 | ER-positive vs ER-negative | BR1405 |
| Sig1128 | grade 1 vs grade 3 | BR1128 |
| Sig1042 | grade 1 vs grade 3 | BR1042 |
| Sig1224r | ER-positive vs ER-negative | BR1224 |
| Sig1552r | normal breast-like vs basal-like | BR1552 |
| Sig1095 | grade 1 vs grade 3 | BR1095 |
| Sig1414 | grade 1 vs grade 3 | BR1414 |
| Sig1141 | grade 1 vs grade 3 | BR1141 |
| Sig907r | normal breast-like vs basal-like | BR907 |
BRid denotes the breast cancer dataset ID sharing the same signature ID number as the respective published study.
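Each signature in the table is a comparison between two sample groups. A minimal sketch of how such a signature might be extracted with Student's t-test follows. The cutoff on the t statistic, the function names, and the toy expression values are assumptions, not details from the study, and a positive direction here simply means higher expression in the first group.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student's) two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        return 0.0  # no variation in either group: treat as no evidence
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def extract_signature(expr, group_a, group_b, t_cutoff=2.0):
    """Return {gene: (direction, t)} for genes with |t| >= t_cutoff.

    expr: {gene: {sample: expression value}}
    t_cutoff is an illustrative threshold (assumption).
    """
    signature = {}
    for gene, values in expr.items():
        t = student_t([values[s] for s in group_a],
                      [values[s] for s in group_b])
        if abs(t) >= t_cutoff:
            signature[gene] = (1 if t > 0 else -1, t)
    return signature


# Hypothetical expression values for four samples.
toy_expr = {
    "G1": {"s1": 5.0, "s2": 6.0, "s3": 1.0, "s4": 2.0},
    "G2": {"s1": 3.0, "s2": 4.0, "s3": 3.5, "s4": 3.6},
}
toy_signature = extract_signature(toy_expr, ["s1", "s2"], ["s3", "s4"])
```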
Summary of survival analysis p-values in breast cancer.
| Test Data Sets | Endpoints | BRmet50 | BRmet50 Ctr | BRsig70 | BRsig76 |
| --- | --- | --- | --- | --- | --- |
| BR544 | DMFS | <0.001 | <0.001 | 0.007 | 0.024 |
| BR2411 | RFS | <0.001 | <0.001 | <0.001 | <0.001 |
| BR1405 | RFS | 0.002 | 0.002 | 0.019 | 0.006 |
| BR1128 | DSS | <0.001 | <0.001 | 0.015 | 0.018 |
| BR1042 | RFS | 0.002 | 0.033 | 0.144 | 0.698 |
| BR1552 | RFS | <0.001 | <0.001 | 0.082 | <0.001 |
| BR1095 | DFS | <0.001 | <0.001 | 0.001 | 0.005 |
| BR1414 | RFS | <0.001 | <0.001 | <0.001 | <0.001 |
| BR1141 | RFS | <0.001 | <0.001 | 0.026 | 0.156 |
| BR18347175 | DMFS | <0.001 | NA | <0.001 | <0.001 |
| METABRIC discovery | DSS | <0.001 | NA | <0.001 | <0.001 |
| METABRIC validation | DSS | <0.001 | NA | <0.001 | <0.001 |
| GSE2607 | RFS | 0.004 | NA | 0.005 | <0.001 |
| GSE7390 | RFS | 0.028 | NA | 0.516 | 0.063 |
| GSE11121 | DMFS | 0.027 | NA | 0.012 | 0.183 |
| GSE17705 | DMFS | 0.045 | NA | 0.043 | 0.574 |
| GSE20624 | RFS | 0.001 | NA | 0.037 | 0.037 |
| GSE20685 | OS | <0.001 | NA | 0.002 | <0.001 |
| GSE21653 | DFS | 0.014 | NA | 0.121 | 0.396 |
| GSE25055 | DMFS | <0.001 | NA | <0.001 | <0.001 |
| GSE25065 | DMFS | <0.001 | NA | <0.001 | <0.001 |
Endpoints: Clinical endpoints are distant metastasis-free survival (DMFS), relapse-free survival (RFS), disease-free survival (DFS), disease-specific survival (DSS), and overall survival (OS).
BRmet50 Ctr: control signatures are isoform signatures of BRmet50 assembled by the leave-one-out method in which the corresponding breast cancer dataset is excluded intentionally.
NA: not available.
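The leave-one-out control described above can be sketched as follows. The selection rule used here (genes present in every remaining signature with the same direction) is an assumption that mirrors the 100% recurrence and concordance criterion of Figure 1; the function names and toy data are hypothetical.

```python
def concordant_genes(signatures):
    """Genes present in every signature with the same direction."""
    sig_list = list(signatures.values())
    if not sig_list:
        return {}
    shared = set(sig_list[0])
    for sig in sig_list[1:]:
        shared &= set(sig)
    first = sig_list[0]
    return {g: first[g] for g in shared
            if all(sig[g] == first[g] for sig in sig_list)}


def leave_one_out_controls(signatures, dataset_of):
    """Build one control signature per dataset, excluding that dataset.

    signatures: {signature ID: {gene: +1 or -1}}
    dataset_of: {signature ID: dataset ID}
    """
    controls = {}
    for held_out in set(dataset_of.values()):
        kept = {sid: sig for sid, sig in signatures.items()
                if dataset_of[sid] != held_out}
        controls[held_out] = concordant_genes(kept)
    return controls


# Hypothetical cluster: three signatures, each from its own dataset.
loo_signatures = {
    "SigA": {"G1": 1, "G2": -1, "G3": 1},
    "SigB": {"G1": 1, "G2": -1, "G3": -1},
    "SigC": {"G1": 1, "G3": 1},
}
loo_datasets = {"SigA": "BRa", "SigB": "BRb", "SigC": "BRc"}
loo_controls = leave_one_out_controls(loo_signatures, loo_datasets)
```

Because the dataset being tested never contributes to its own control signature, a significant p-value in the BRmet50 Ctr column is less likely to reflect circular training on that dataset.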
Summary of survival analysis p-values and c-indexes in breast cancer.
| Datasets | BRmet50 | BRsig70 | BRsig76 | ONCO | TAMR13 | PAM50 | Genius | PIK3 | GGI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| p-values | | | | | | | | | |
| METABRIC D | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | 0.001 | <0.001 |
| METABRIC V | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | 0.002 | <0.001 |
| GSE2607 | 0.004 | 0.005 | <0.001 | 0.01 | 0.814 | 0.078 | 0.814 | 0.814 | 0.814 |
| GSE7390 | 0.028 | 0.516 | 0.063 | 0.368 | 0.238 | 0.223 | 0.013 | 0.911 | 0.015 |
| GSE11121 | 0.027 | 0.012 | 0.183 | 0.002 | <0.001 | 0.004 | 0.003 | 0.012 | 0.122 |
| GSE17705 | 0.045 | 0.043 | 0.574 | 0.064 | 0.002 | 0.015 | 0.858 | 0.677 | 0.137 |
| GSE20624 | 0.001 | 0.037 | 0.037 | 0.003 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 |
| GSE20685 | <0.001 | 0.002 | <0.001 | <0.001 | 0.023 | 0.006 | 0.016 | 0.018 | 0.003 |
| GSE21653 | 0.014 | 0.121 | 0.396 | 0.001 | 0.007 | 0.123 | 0.027 | 0.11 | 0.06 |
| GSE25055 | <0.001 | <0.001 | <0.001 | <0.001 | 0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| GSE25065 | <0.001 | <0.001 | <0.001 | <0.001 | 0.89 | <0.001 | 0.597 | <0.001 | 0.082 |
| c-indexes | | | | | | | | | |
| METABRIC D | 0.6182 | 0.6125 | 0.5969 | 0.6379 | 0.5961 | 0.6159 | 0.6015 | 0.5726 | 0.6279 |
| METABRIC V | 0.6004 | 0.5905 | 0.5860 | 0.6142 | 0.5724 | 0.5838 | 0.5679 | 0.5638 | 0.6069 |
| GSE2607 | | | | | 0.5039 | 0.6342 | 0.5039 | 0.5039 | 0.5039 |
| GSE7390 | | 0.5469 | 0.5795 | 0.5524 | 0.5604 | 0.5578 | | 0.4745 | |
| GSE11121 | | | 0.5723 | | | | | | 0.586 |
| GSE17705 | | | 0.5232 | 0.5724 | | | 0.5112 | 0.5185 | 0.5543 |
| GSE20624 | 0.5082 | 0.5063 | 0.5063 | 0.5989 | 0.5063 | 0.5063 | 0.5063 | 0.5063 | 0.5063 |
| GSE20685 | 0.6064 | 0.5978 | 0.6213 | 0.6006 | 0.5762 | 0.5885 | 0.5798 | 0.5839 | 0.5989 |
| GSE21653 | | 0.542 | 0.5209 | | | 0.5604 | | 0.568 | 0.5528 |
| GSE25055 | 0.6384 | 0.6524 | 0.6301 | 0.6440 | 0.6022 | 0.6604 | 0.6126 | 0.6830 | 0.6472 |
| GSE25065 | | | | | 0.5158 | | 0.5358 | | 0.5766 |
Note: METABRIC D and METABRIC V are the discovery and validation datasets from the METABRIC study [91]; the other datasets, identified by GSE ID, are available from the NCBI GEO database.
Eight published signatures are included in the study: BRsig70 [3], BRsig76 [2], ONCO (Oncotype DX) [4], [5], TAMR13 [6], PAM50 [9], Genius [7], PIK3 (PIK3CAGS278) [10], and GGI [8].
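For readers unfamiliar with the c-index, the following is a minimal sketch of Harrell's concordance index on right-censored data. It is not the code used in the study; the function name and toy values are assumptions. A pair of patients is comparable when the one with the shorter follow-up time had an observed event, and tied risk scores count as half-concordant.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data.

    times: follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. When the risk score is a two-group assignment, many pairs tie and contribute 0.5 each, which pulls the index toward 0.5.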
Figure 2. Kaplan-Meier analyses for relapse-free survival.
Data from 108 tumors from the dataset BR1042 were stratified into two groups by BRsig70 and BRsig76 (bottom panels), the control signature (BRmet[-1042]) from the leave-one-out method, or BRmet50 (upper panels) gene expression profiles. In each survival plot, two types of relapse-free survival were compared: a poor prognosis group (black dashed line) and a good prognosis group (red solid line). The relapse-free time in days is displayed on the x-axis, and the y-axis shows the probability of relapse-free survival. The p-values indicate the statistical significance of survival time differences between the two groups.
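The survival curves and p-values described in this caption can be illustrated with a plain-Python sketch of the Kaplan-Meier estimator and the two-group log-rank test. The function names and toy follow-up data are assumptions; the p-value uses the chi-square distribution with one degree of freedom, computed through `math.erfc`.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    curve = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        failures = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1 - failures / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square, p-value), 1 df."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    observed_a = expected_a = var = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n = sum(1 for x in times if x >= t)       # at risk overall
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if var == 0:
        raise ValueError("variance is zero; groups cannot be compared")
    chi2 = (observed_a - expected_a) ** 2 / var
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical relapse-free times in days (1 = relapse, 0 = censored).
poor_times, poor_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
good_times, good_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
km_poor = kaplan_meier(poor_times, poor_events)
chi2_stat, p_value = logrank_test(poor_times, poor_events,
                                  good_times, good_events)
```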
Hazard ratio risks for cancer relapse and log-rank tests in BR1141.
| Clinicopathologic parameters | BRmet50 HR (95% CI) | BRmet50 HR P | BRsig70 HR (95% CI) | BRsig70 HR P | BRsig76 HR (95% CI) | BRsig76 HR P |
| --- | --- | --- | --- | --- | --- | --- |
| Tumor size | | | | | | |
| | 2.6 (1.3–5.5) | 0.009 | 1.5 (0.6–3.7) | 0.386 | 1.0 (0.5–2.1) | 0.942 |
| | 1.7 (1.0–2.8) | 0.044 | 1.8 (0.9–3.8) | 0.113 | 0.7 (0.4–1.2) | 0.209 |
| Lymph-node status | | | | | | |
| | 2.3 (1.4–3.9) | 0.001 | 1.6 (0.8–3.0) | 0.193 | 0.8 (0.5–1.4) | 0.511 |
| | 2.0 (1.0–4.1) | 0.053 | 2.8 (0.8–9.3) | 0.089 | 0.6 (0.3–1.4) | 0.245 |
| Tamoxifen treatment | | | | | | |
| | 2.6 (1.4–5.0) | 0.004 | 2.1 (1.0–4.6) | 0.063 | 1.1 (0.5–2.0) | 0.869 |
| | 2.2 (1.2–3.9) | 0.007 | 1.7 (0.7–4.0) | 0.230 | 0.6 (0.3–1.0) | 0.041 |
| Histological grade | | | | | | |
| | 2.3 (0.6–8.4) | 0.196 | 2.4 (0.8–7.2) | 0.121 | 1.3 (0.4–3.8) | 0.682 |
| | 2.5 (1.5–4.3) | 0.001 | 1.6 (0.8–3.4) | 0.219 | 0.7 (0.4–1.2) | 0.194 |
| | 1.4 (0.6–3.4) | 0.442 | 0.2 (0–1.8) | 0.172 | 0.5 (0.2–1.1) | 0.086 |
| ER status | | | | | | |
| | 1.4 (0.5–4.0) | 0.495 | 2.2 (0.3–16.3) | 0.456 | 0.9 (0.3–2.3) | 0.782 |
| | 2.5 (1.6–4.0) | <0.001 | 1.8 (1.0–3.3) | 0.050 | 0.7 (0.4–1.1) | 0.103 |
The 269 patients with breast cancer included in the BR1141 dataset were stratified according to tumor size, lymph-node status, tamoxifen treatment, histological grade, and ER status. A univariate Cox proportional-hazards model was used to evaluate the association of individual signatures (i.e., the BRmet50, BRsig70, or BRsig76) with the clinical outcome in each category.
T1 denotes a tumor with size less than or equal to 2.0 cm, and T2 denotes a tumor with size larger than 2.0 cm. HR (95% CI): hazard ratio value (95% confidence interval). HR P: hazard ratio p-value.
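As an illustration of how a hazard ratio, its 95% confidence interval, and its p-value arise from a univariate Cox proportional-hazards model, here is a minimal sketch for a single covariate. It fits the coefficient by Newton-Raphson on the partial likelihood with Breslow handling of ties and reports a Wald p-value. It is not the software used in the study; the function name, the binary group coding, and the toy data are assumptions.

```python
import math


def cox_univariate(times, events, x, max_iter=25, tol=1e-9):
    """Fit a one-covariate Cox model. Returns (HR, (CI low, CI high), p)."""
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        grad = 0.0
        info = 0.0
        for i, (t_i, e_i) in enumerate(zip(times, events)):
            if not e_i:
                continue
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # patient j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            grad += x[i] - s1 / s0               # score contribution
            info += s2 / s0 - (s1 / s0) ** 2     # information contribution
        if info == 0:
            raise ValueError("covariate does not vary within risk sets")
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    se = 1 / math.sqrt(info)
    hr = math.exp(beta)
    ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
    p = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided Wald test
    return hr, ci, p


# Hypothetical cohort: group 1 = poor-prognosis call, 0 = good-prognosis call.
cox_times = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
cox_events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
cox_group = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
hr, ci, p = cox_univariate(cox_times, cox_events, cox_group)
```

A hazard ratio above 1 means the group coded 1 relapses at a higher rate. The sketch does not guard against perfectly separated groups, where the estimate diverges, so a production analysis would normally rely on an established survival package.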
Multivariate analysis of disease risk among patients with breast cancer.
| Datasets | BRmet50 HR (95% CI) | BRmet50 HR P | BRsig70 HR (95% CI) | BRsig70 HR P | BRsig76 HR (95% CI) | BRsig76 HR P |
| --- | --- | --- | --- | --- | --- | --- |
| | 3.1 (1.4–7.0) | <0.01 | 1.7 (0.7–3.9) | 0.23 | 0.8 (0.4–1.7) | 0.54 |
| | 1.8 (1.1–2.9) | 0.02 | 1.5 (0.9–2.5) | 0.16 | 1.3 (0.8–2.2) | 0.26 |
| | 2.0 (1.0–3.9) | 0.03 | 1.4 (0.8–2.7) | 0.27 | 1.2 (0.6–2.3) | 0.49 |
| | 2.3 (1.4–3.6) | <0.01 | 1.6 (0.9–2.9) | 0.13 | 0.6 (0.4–1.0) | 0.05 |
| | 2.5 (1.4–5.0) | <0.01 | 1.1 (0.6–2.0) | 0.76 | 2.0 (1.1–3.3) | 0.03 |
The HR and p-values for each signature were adjusted by age, grade, tumor size, LN, ER, and NPI.
Age and tumor diameter were modeled as continuous variables; the hazard ratio is for each 1-cm increase in diameter or each 1-year increase in age. HR: hazard ratio with 95% confidence interval; HR P: hazard ratio p-value.
Univariate and multivariate analysis in lung, prostate, and colon cancer.
| | BRmet50 HR (95% CI) | BRmet50 HR P | BRsig70 HR (95% CI) | BRsig70 HR P | BRsig76 HR (95% CI) | BRsig76 HR P |
| --- | --- | --- | --- | --- | --- | --- |
| Lung cancer | | | | | | |
| | 1.7 (1.3–2.4) | | 1.2 (0.8–1.9) | 0.37 | 1.2 (0.8–1.9) | 0.37 |
| | 1.8 (1.2–2.5) | | 1.2 (0.7–1.9) | 0.51 | 1.2 (0.7–1.9) | 0.51 |
| Prostate cancer | | | | | | |
| | 0.4 (0.3–0.6) | | 1.2 (0.8–1.8) | 0.34 | 0.7 (0.5–1.0) | 0.07 |
| | 0.6 (0.3–1.0) | | 1.3 (0.8–2.1) | 0.27 | 0.9 (0.5–1.4) | 0.52 |
| Colon cancer | | | | | | |
| | 1.3 (0.4–4.7) | | 0.5 (0.1–1.9) | 0.32 | 0.5 (0.2–1.9) | 0.32 |
| | 1.4 (0.4–5.0) | | 0.5 (0.1–1.8) | 0.31 | 0.5 (0.1–1.8) | 0.30 |
Adjusted factors in lung cancer: age, gender, chemotherapy treatment, radiation treatment, smoking habits, and tumor stage.
Adjusted factors in prostate cancer: age, tumor stage, ploidy, and PSA relapse.
Adjusted factor in colon cancer: age.