Hui Xu, Xin Guo, Qiang Sun, Mengmeng Zhang, Lishuang Qi, Yang Li, Libin Chen, Yunyan Gu, Zheng Guo, Wenyuan Zhao.
Abstract
Cancer tissue sampling affects the identification of cancer characteristics. We aimed to clarify the source of differentially expressed genes (DEGs) in macro-dissected cancer tissue and to develop a prognostic signature that is robust to the effects of tissue sampling. For estrogen receptor-positive (ER+) breast cancer patients, we identified DEGs in macro-dissected cancer tissues, malignant epithelial cells and stromal cells, defined as Macro-Dissected-DEGs, Epithelial-DEGs and Stromal-DEGs, respectively. Comparing Epithelial-DEGs to Stromal-DEGs (false discovery rate (FDR) < 10%), 86% of the overlapping genes were dysregulated in the same direction (defined as Consistent-DEGs), and the other 14% were dysregulated in opposite directions (defined as Inconsistent-DEGs). The consistency score of dysregulation directions between Macro-Dissected-DEGs and Consistent-DEGs was 91% (P-value < 2.2 × 10⁻¹⁶, binomial test), whereas the score was only 52% between Macro-Dissected-DEGs and Inconsistent-DEGs (P-value = 0.9, binomial test). Among the gene ontology (GO) terms significantly enriched in Macro-Dissected-DEGs (FDR < 10%), 18 immune-related terms were enriched in Inconsistent-DEGs. DEGs associated with proliferation could reflect common changes of malignant epithelial and stromal cells, whereas DEGs associated with immune functions are sensitive to the percentage of malignant epithelial cells in macro-dissected tissues. A prognostic signature that was insensitive to the cellular composition of macro-dissected tissues was developed and validated for ER+ breast cancer patients.
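The consistency score and binomial test mentioned in the abstract can be illustrated with a minimal Python sketch. The function names, the toy gene labels and the choice of a one-sided test against a chance level of 0.5 are assumptions made for illustration; the abstract does not state which alternative hypothesis was used.

```python
from math import comb

def consistency_score(directions_a, directions_b):
    """Return (n_consistent, n_shared) for genes present in both dicts.

    Each dict maps a gene symbol to +1 (up-regulated) or -1 (down-regulated).
    """
    shared = directions_a.keys() & directions_b.keys()
    n_consistent = sum(1 for g in shared if directions_a[g] == directions_b[g])
    return n_consistent, len(shared)

def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p); one-sided test (an assumption)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Toy, made-up directions (not data from the study).
toy_macro = {"G1": 1, "G2": -1, "G3": 1, "G4": 1}
toy_epithelial = {"G1": 1, "G2": -1, "G3": -1, "G5": 1}

k, n = consistency_score(toy_macro, toy_epithelial)
score = k / n                       # fraction of shared genes with the same direction
p_value = binomial_upper_tail(k, n)  # chance of at least k agreements at random
```

With real DEG lists, `score` corresponds to the reported percentage and `p_value` to the probability of seeing at least that many agreements if directions matched only by chance.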
Year: 2015 PMID: 26490514 PMCID: PMC4614546 DOI: 10.1038/srep15474
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of the ten datasets analyzed in this study.
| Dataset | Cancer samples | Normal samples | Endpoint | GEO accession ID | Platform |
|---|---|---|---|---|---|
| Lcm-Data1 | 30 | 22 | — | GSE14548 | U133_X3P |
| Lcm-Data2 | 30 | 10 | — | GSE10797 | HG-U133A_2 |
| M-data1 | 28 | 34 | — | GSE10780 | HG-U133_Plus_2 |
| M-data2 | 19 | 27 | — | GSE10810 | HG-U133_Plus_2 |
| M-data3 | 67 | 17 | — | GSE42568 | HG-U133_Plus_2 |
| TCGA-Data | 376 | 55 | — | — | AgilentG4502A_07 |
| Sur-Data1 | 134 | — | RFS | GSE7390 | HG-U133A |
| Sur-Data2 | 85 | — | RFS | GSE6532 | HG-U133A |
| Sur-Data3 | 209 | — | RFS | GSE2034 | HG-U133A |
| Sur-Data4 | 119 | — | DFS | GSE4922 | HG-U133A |
Note: Lcm-Data indicates the laser capture microdissection datasets; M-data indicates macro-dissected breast cancer datasets; Sur-Data indicates breast cancer survival datasets; RFS and DFS indicate relapse free survival and disease-free survival, respectively. These datasets were produced by different platforms, including the U133_X3P, HG-U133A_2, HG-U133_Plus_2, AgilentG4502A_07 and HG-U133A platforms, which detected 19703, 12790, 20283, 15621 and 12752 genes, respectively.
The reproducibility of Consistent-Gene-Pairs.
| Dataset | Tis-type | Dis | Nor | Pair-num | Pair-0.05 | Direc-con |
|---|---|---|---|---|---|---|
| Lcm-Data2 | Lcm_epi | 15 | 5 | 31271 | 6372 | 6353 (99.70%) |
| Lcm-Data2 | Lcm_str | 15 | 5 | 31271 | 5762 | 5748 (99.76%) |
| M-Data1 | Tis | 28 | 34 | 50958 | 36518 | 36496 (99.94%) |
| M-Data2 | Tis | 19 | 27 | 50958 | 29389 | 29272 (99.60%) |
| M-Data3 | Tis | 67 | 17 | 50958 | 17204 | 16547 (96.18%) |
| TCGA-low | Tis | 225 | 55 | 40025 | 30806 | 30438 (98.81%) |
| TCGA-high | Tis | 151 | 55 | 40025 | 33114 | 32790 (99.02%) |
Note: Lcm and Tis indicate laser capture microdissection datasets and macro-dissected datasets, respectively; Dis and Nor indicate cancer samples and normal controls, respectively; Pair-num indicates the number of gene pairs detected in the datasets; Pair-0.05 indicates the number of gene pairs with a tendency for reversion (P-value < 0.05); Direc-con indicates the number and proportion of gene pairs that are reversed consistently in Pair-0.05 and Com-p; TCGA-low indicates low-tumor purity samples in The Cancer Genome Atlas; TCGA-high indicates high-tumor purity samples in The Cancer Genome Atlas.
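One way to picture how a "tendency for reversion" of a gene pair might be tested is sketched below. The note does not name the statistical test, so the use of a two-sided Fisher's exact test on ordering counts is an assumption, and the function names and toy samples are hypothetical.

```python
from math import comb

def count_orderings(samples, gene_a, gene_b):
    """Count samples where gene_a is expressed above gene_b, and the rest."""
    greater = sum(1 for s in samples if s[gene_a] > s[gene_b])
    return greater, len(samples) - greater

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as extreme (no more probable) than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Toy, made-up expression values (not data from the study).
normal = [{"A": 5.0, "B": 3.0}] * 5   # A > B in every normal sample
cancer = [{"A": 2.0, "B": 4.0}] * 5   # ordering reversed in every cancer sample

n_gt, n_le = count_orderings(normal, "A", "B")
c_gt, c_le = count_orderings(cancer, "A", "B")
reversal_p = fisher_exact_two_sided(n_gt, n_le, c_gt, c_le)
```

A pair would count toward Pair-0.05 under this sketch when `reversal_p` falls below 0.05; the direction of reversal is given by which group has the larger share of `A > B` samples.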
The prognostic gene pairs.
| Gene A | Gene B | COX.β | COX.p |
|---|---|---|---|
| CAMLG | KIAA0101 | 5.0287 | 7.23E–07 |
| CRIP1 | ING1 | 3.4221 | 1.17E–05 |
| CRYAB | MAP4K5 | 3.5943 | 6.59E–06 |
| CSRP1 | RAI14 | 3.8159 | 1.65E–06 |
| FBL | PSMD2 | 0.9705 | 1.61E–05 |
| FBL | HN1 | 1.2282 | 2.20E–05 |
| LMCD1 | FGFR4 | 2.7742 | 9.62E–06 |
| HOXA4 | MAP4K5 | 2.5518 | 2.47E–06 |
| SERPINB5 | NCAPG | 1.1088 | 1.59E–06 |
| PIGR | KIF4A | 0.9484 | 1.29E–05 |
| PIK3R1 | PRC1 | 0.8993 | 3.20E–05 |
| SOX10 | KIF4A | 1.0677 | 7.40E–07 |
| LMBRD1 | KIAA0101 | 2.2254 | 2.32E–05 |
| SAV1 | KIF4A | 1.1495 | 2.48E–05 |
| OGFRL1 | KIF4A | 1.1011 | 3.72E–07 |
| EVL | GPR125 | 3.5943 | 6.59E–06 |
| OGFRL1 | HJURP | 0.8993 | 3.18E–05 |
Note: The univariate Cox proportional hazards model was used to estimate the risk coefficient (COX.β) of the relative ordering (R_A > R_B, the ordering of Gene A relative to Gene B) for each gene pair and its correlation (COX.p) with overall survival in patients; the C-index represents the prognostic performance of the relative ordering (R_A > R_B) for each gene pair.
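The relative-ordering feature described in the note can be sketched as a binary indicator per gene pair. The gene pairs below are taken from the table, but the expression values, the function name and the idea of summing the indicators into a vote count are illustrative assumptions, not the authors' stated procedure.

```python
# Three gene pairs (Gene A, Gene B) copied from the table above.
PAIRS = [("CAMLG", "KIAA0101"), ("CRIP1", "ING1"), ("FBL", "PSMD2")]

def relative_ordering(expression, pairs):
    """Return 1 for each pair where Gene A is expressed above Gene B, else 0."""
    return [1 if expression[a] > expression[b] else 0 for a, b in pairs]

# Toy, made-up expression values for one sample (not data from the study).
toy_sample = {
    "CAMLG": 8.1, "KIAA0101": 6.4,
    "CRIP1": 5.0, "ING1": 7.2,
    "FBL": 9.3, "PSMD2": 9.0,
}

features = relative_ordering(toy_sample, PAIRS)
# All listed COX.β values are positive, so each 1 points toward higher hazard;
# summing them into a vote count is a hypothetical way to combine the pairs.
risk_votes = sum(features)
```

Because each feature depends only on the within-sample ordering of two genes, it is unchanged by any monotonic rescaling of that sample's expression values, which is the property that makes such pairs attractive across platforms.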
Figure 1. Kaplan–Meier curves illustrating relapse-free survival among patients with ER+ breast cancer based on Prognostic gene pairs in the training set.
Univariate and multivariate Cox regression analysis of the association with RFS.
| Characteristic | Subcategory | Univariate HR (95% CI) | Univariate P-value | Multivariate HR (95% CI) | Multivariate P-value |
|---|---|---|---|---|---|
| Training cohort | | | | | |
| Prognostic gene pairs | High vs. low | 3.90 (2.54–5.98) | 4.12E–10 | 3.60 (2.24–5.76) | 9.61E–08 |
| Age | >49 vs. ≤49 | 0.87 (0.57–1.32) | 0.52 | 0.95 (0.60–1.50) | 0.86 |
| Grade | I vs. II, III | 1.81 (1.01–3.23) | 0.04 | 1.19 (0.65–2.18) | 0.57 |
| Size | >2 cm vs. ≤2 cm | 1.97 (1.29–3.00) | 1.51E–03 | 1.74 (1.10–2.74) | 0.02 |
| Validation cohort | | | | | |
| Prognostic gene pairs | High vs. low | 2.45 (1.11–5.43) | 0.03 | 2.19 (0.95–5.06) | 0.06 |
| Age | >66 vs. ≤66 | 0.70 (0.35–1.38) | 0.31 | 0.74 (0.37–1.49) | 0.41 |
| Grade | I vs. II, III | 1.41 (0.70–2.86) | 0.33 | 1.12 (0.53–2.36) | 0.76 |
| Size | >18 mm vs. ≤18 mm | 1.78 (0.91–3.50) | 0.09 | 1.68 (0.85–3.34) | 0.14 |
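As a reading aid for the table, a hazard ratio and its 95% confidence interval can be converted back to a Cox coefficient, standard error and approximate Wald P-value. The sketch below assumes a symmetric interval on the log scale built with a critical value of 1.96; the function name is hypothetical, and because the published values are rounded, the result should only approximate the reported P-value.

```python
from math import erfc, log, sqrt

def wald_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Recover (beta, se, z, p) from a hazard ratio and its confidence interval."""
    beta = log(hr)                                    # HR = exp(beta)
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)  # CI width on the log scale
    z = beta / se                                     # Wald statistic
    p = erfc(abs(z) / sqrt(2))                        # two-sided normal P-value
    return beta, se, z, p

# Univariate result for the prognostic gene pairs in the training cohort.
beta, se, z, p = wald_from_hr(3.90, 2.54, 5.98)
```

For this row the recovered P-value lands in the same order of magnitude as the tabulated 4.12E–10, which is a quick consistency check between the interval and the reported significance.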
Figure 2. Kaplan–Meier curves illustrating relapse-free survival among patients with ER+ breast cancer based on Prognostic gene pairs in the test sets.
(A) Test set 1 consisted of Sur-Data3; (B) Test set 2 consisted of Sur-Data4.