Qingzhou Guan1, Xuekun Song2, Zhenzhen Zhang1, Yizhi Zhang3, Yating Chen3, Jing Li3.
Abstract
Breast cancer cell lines are frequently used to elucidate the molecular mechanisms of the disease. However, a large proportion of cell lines are affected by problems such as mislabeling and cross-contamination. Therefore, it is of great clinical significance to select optimal breast cancer cell line models. Using tamoxifen survival-related genes from breast cancer tissues as the gold standard, we selected the cell line model that best represents the characteristics of clinical tissue samples. Moreover, using relative expression orderings of gene pairs, we developed a gene pair signature that could predict tamoxifen therapy outcomes. Based on 235 survival-related genes consistently identified in datasets GSE17705 and GSE6532, we found that only the differentially expressed genes (DEGs) from the cell line dataset GSE26459 were significantly reproducible in tissue samples (binomial test, p = 2.13E-07). Finally, using the DEGs consistent between cell line dataset GSE26459 and tissue samples, we used transcriptional qualitative features to develop a two-gene-pair signature (TOP2A, SLC7A5; NMU, PDSS1) for predicting clinical tamoxifen resistance in the training data (logrank p = 1.98E-07); this signature was verified using an independent dataset (logrank p = 0.009909). Our results indicate that the cell line model from dataset GSE26459 provides a good representation of the characteristics of clinical tissue samples; thus, it will be a good choice when selecting drug-resistant and drug-sensitive breast cancer cell lines in the future. Moreover, our signature could predict tamoxifen treatment outcomes in breast cancer patients.
Keywords: breast cancer; cell line; resistant; sensitive; tamoxifen
Year: 2020 PMID: 33344500 PMCID: PMC7746845 DOI: 10.3389/fmolb.2020.564005
Source DB: PubMed Journal: Front Mol Biosci ISSN: 2296-889X
Data used in this study.
| Tissue |
| GEO Acc | Platform | ER+ Sample | Endpoint |
| GSE17705 | Affymetrix GPL96 | 298 | RFS |
| GSE6532 | Affymetrix GPL96 | 176 | RFS |
| GSE12093 | Affymetrix GPL96 | 136 | RFS |
| GSE4922 | Affymetrix GPL96 | 66 | RFS |
| GSE2990 | Affymetrix GPL96 | 54 | RFS |
| GSE42568 | Affymetrix GPL570 | 67 | RFS |
| GSE9195 | Affymetrix GPL570 | 77 | RFS |
| Cell line |
| GEO Acc | Platform | Sensitive line | Resistant line | Samples | Method |
| GSE27473 | Affymetrix GPL570 | MCF7 | MCF7 silenced ER | 3:3 | RNA silencing |
| GSE12708 | Affymetrix GPL96 | SUM44 | SUM44/LCCTam | 3:3 | Drug pressure |
| GSE26459 | Affymetrix GPL570 | B7 | G11OH-T | 3:3 | MCF7 subclones |
| GSE8562 | Affymetrix GPL96 | MCF7 | MCF7/XBP1 | 3:3 | XBP1 overexpression |
| GSE14986 | Affymetrix GPL570 | MCF7 | T8, T17, T29, T52 | 4:3 | Drug pressure |
| GSE21618 | Affymetrix GPL570 | WT | tamR | 20:11 | Drug pressure |
| GSE67916 | Affymetrix GPL570 | MCF7 | MCF-7/TAMR | 10:8 | Drug pressure |
| #GSE118713 | Illumina GPL16791 | MCF7 | MCF-7/TAMR | 3:3 | Drug pressure |
| #GSE125738 | HiSeq GPL20795 | T47D | T47D-TR | 3:3 | Drug pressure |
FIGURE 1 Flowchart of the analysis procedure.
Consistency evaluation of DEGs from different cell line datasets.
| GEO Acc | Cell line* | Def_gene | Com_gene | Con_gene | Ratio | P-value |
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 10795 | 6147 | 0.5694 | <1.00E-16 |
| GSE14986 | T8/17/29/52: MCF7 | 13391 | ||||
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 12580 | 7427 | 0.5904 | <1.00E-16 |
| GSE21618 | TamR: WT | 15481 | ||||
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 9675 | 5424 | 0.5606 | <1.00E-16 |
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | ||||
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 8074 | 4450 | 0.5512 | <1.00E-16 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE14986 | T8/17/29/52: MCF7 | 13391 | 10494 | 7391 | 0.7043 | <1.00E-16 |
| GSE21618 | TamR: WT | 15481 | ||||
| GSE14986 | T8/17/29/52: MCF7 | 13391 | 8125 | 5396 | 0.6641 | <1.00E-16 |
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | ||||
| GSE14986 | T8/17/29/52: MCF7 | 13391 | 6534 | 4139 | 0.6335 | <1.00E-16 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE14986 | T8/17/29/52: MCF7 | 13391 | 6505 | 4042 | 0.6214 | <1.00E-16 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE21618 | TamR: WT | 15481 | 9331 | 5386 | 0.5772 | <1.00E-16 |
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 5525 | 3192 | 0.5777 | <1.00E-16 |
| GSE27473 | si-ER MCF7: MCF7 | 15937 | ||||
| GSE21618 | TamR: WT | 15481 | 7729 | 4189 | 0.5420 | 8.22E-14 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | 5808 | 3161 | 0.5442 | 8.16E-12 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE21618 | TamR: WT | 15481 | 7597 | 4061 | 0.5346 | 9.04E-10 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | 5824 | 3212 | 0.5515 | 2.00E-15 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 3767 | 2044 | 0.5426 | 9.10E-08 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 7991 | 4163 | 0.5210 | 9.32E-05 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 1163 | 521 | 0.4480 | 1.00E+00 |
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 52 | 21 | 0.4038 | 9.37E-01 |
| GSE8562 | MCF7/XBP1: MCF7 | 97 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 4623 | 2084 | 0.4508 | 1.00E+00 |
| GSE14986 | T8/17/29/52: MCF7 | 13391 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 5262 | 2643 | 0.5023 | 3.76E-01 |
| GSE21618 | TamR: WT | 15481 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 4090 | 1946 | 0.4758 | 9.99E-01 |
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | ||||
| GSE26459 | G11OH-T: B7 | 6375 | 3750 | 1321 | 0.3523 | 1.00E+00 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 2264 | 1056 | 0.4664 | 9.99E-01 |
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | ||||
| GSE27473 | si-ER MCF7: MCF7 | 15937 | 89 | 33 | 0.3708 | 9.95E-01 |
| GSE8562 | MCF7/XBP1: MCF7 | 97 | ||||
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | 23 | 12 | 0.5217 | 5.00E-01 |
| GSE8562 | MCF7/XBP1: MCF7 | 97 | ||||
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | 1885 | 702 | 0.3724 | 1.00E+00 |
| GSE14986 | T8/17/29/52: MCF7 | 13391 | ||||
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | 2134 | 920 | 0.4311 | 1.00E+00 |
| GSE21618 | TamR: WT | 15481 | ||||
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | 1676 | 862 | 0.5143 | 1.25E-01 |
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | ||||
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | 1588 | 625 | 0.3936 | 1.00E+00 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE12708 | SUM44/LCCTam: SUM44 | 2538 | 1630 | 840 | 0.5153 | 1.12E-01 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE8562 | MCF7/XBP1: MCF7 | 97 | 80 | 42 | 0.5250 | 3.69E-01 |
| GSE14986 | T8/17/29/52: MCF7 | 13391 | ||||
| GSE8562 | MCF7/XBP1: MCF7 | 97 | 84 | 46 | 0.5476 | 2.23E-01 |
| GSE21618 | TamR: WT | 15481 | ||||
| GSE8562 | MCF7/XBP1: MCF7 | 97 | 57 | 30 | 0.5263 | 3.96E-01 |
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | ||||
| GSE8562 | MCF7/XBP1: MCF7 | 97 | 63 | 25 | 0.3968 | 9.62E-01 |
| GSE118713 | MCF-7/TAMR:MCF-7 | 10023 | ||||
| GSE8562 | MCF7/XBP1: MCF7 | 97 | 63 | 25 | 0.3968 | 9.62E-01 |
| GSE125738 | T47D-TR:T47D | 10685 | ||||
| GSE67916 | MCF-7/TAMR:MCF-7 | 12227 | 5751 | 2910 | 0.5060 | 1.85E-01 |
| GSE125738 | T47D-TR:T47D | 10685 |
Consistency evaluation between tissues and cell lines.
| GEO Acc | Def_gene | Com_gene | Con_gene | Ratio | P-value |
| GSE26459 | 6375 | 114 | 84 | 0.7368 | 2.13E-07 |
| GSE27473 | 15937 | 211 | 93 | 0.4408 | 9.63E-01 |
| GSE12708 | 2538 | 46 | 15 | 0.3261 | 9.94E-01 |
| GSE8562 | 97 | 5 | 3 | 0.6000 | 5.00E-01 |
| GSE14986 | 13391 | 178 | 55 | 0.3090 | 1.00E+00 |
| GSE21618 | 15481 | 207 | 82 | 0.3961 | 9.99E-01 |
| GSE67916 | 12227 | 162 | 61 | 0.3765 | 9.99E-01 |
| GSE118713 | 10023 | 159 | 63 | 0.3962 | 9.97E-01 |
| GSE125738 | 10685 | 159 | 32 | 0.2013 | 1.00E+00 |
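The ratios and p-values in this table can be approximately reproduced with a one-sided binomial test, as in the minimal sketch below. It assumes that Ratio is Con_gene divided by Com_gene and that the null probability of a gene agreeing in direction by chance is 0.5; neither assumption is stated in this record, although both are consistent with the tabulated values. The function names are illustrative only.

```python
from math import comb


def consistency_ratio(con_gene: int, com_gene: int) -> float:
    """Fraction of shared genes that change in the same direction."""
    return con_gene / com_gene


def consistency_pvalue(con_gene: int, com_gene: int, p0: float = 0.5) -> float:
    """One-sided binomial tail P(X >= con_gene), X ~ Binomial(com_gene, p0).

    p0 = 0.5 is an assumption: under the null hypothesis, a shared gene
    agrees in direction purely by chance.
    """
    return sum(
        comb(com_gene, k) * p0**k * (1 - p0) ** (com_gene - k)
        for k in range(con_gene, com_gene + 1)
    )


if __name__ == "__main__":
    # (GEO Acc, Com_gene, Con_gene) copied from two rows of the table.
    rows = [("GSE26459", 114, 84), ("GSE8562", 5, 3)]
    for acc, com_gene, con_gene in rows:
        ratio = consistency_ratio(con_gene, com_gene)
        pval = consistency_pvalue(con_gene, com_gene)
        print(f"{acc}: ratio={ratio:.4f}, p={pval:.2E}")
```

An exact tail sum is used instead of a normal approximation because some comparisons share only a handful of genes (for example, 5 for GSE8562), where the approximation would be poor.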
KEGG pathway enrichment of tissue and cell line.
| Tissue | | | Cell line | | |
| Pathway num | Pathway name (a) | FDR | Pathway num | Pathway name (b) | FDR |
| hsa04110 | Cell cycle | 0.0270 | hsa03013 | RNA transport | 4.62E-08 |
| hsa04115 | p53 signaling pathway | 0.0226 | hsa03010 | Ribosome | 1.14E-05 |
| hsa04114 | Oocyte meiosis | 0.0726 | hsa00970 | Aminoacyl-tRNA biosynthesis | 1.82E-05 |
| hsa04914 | Progesterone-mediated oocyte maturation | 0.1176 | hsa03008 | Ribosome biogenesis in eukaryotes | 1.64E-04 |
| hsa03440 | Homologous recombination | 0.3907 | hsa03040 | Spliceosome | 7.40E-04 |
| hsa04672 | Intestinal immune network for IgA production | 0.8288 | hsa03410 | Base excision repair | 1.98E-03 |
| hsa04060 | Cytokine-cytokine receptor interaction | 0.9977 | hsa00620 | Pyruvate metabolism | 9.57E-03 |
| | | | hsa01230 | Biosynthesis of amino acids | 0.0119 |
| | | | hsa01100 | Metabolic pathways | 0.0194 |
| | | | hsa01212 | Fatty acid metabolism | 0.0194 |
| | | | hsa01200 | Carbon metabolism | 0.0214 |
| | | | hsa00510 | N-Glycan biosynthesis | 0.0244 |
| | | | hsa00531 | Glycosaminoglycan degradation | 0.0244 |
| | | | hsa04360 | Axon guidance | 0.0244 |
| | | | hsa04612 | Antigen processing and presentation | 0.0244 |
| | | | hsa04917 | Prolactin signaling pathway | 0.0257 |
| | | | hsa00511 | Other glycan degradation | 0.0272 |
| | | | hsa04144 | Endocytosis | 0.0272 |
| | | | hsa03018 | RNA degradation | 0.0300 |
| | | | hsa04142 | Lysosome | 0.0322 |
| | | | hsa04330 | Notch signaling pathway | 0.0513 |
| | | | hsa01040 | Biosynthesis of unsaturated fatty acids | 0.0573 |
| | | | hsa04722 | Neurotrophin signaling pathway | 0.0754 |
| | | | hsa04910 | Insulin signaling pathway | 0.0872 |
| | | | hsa01210 | 2-Oxocarboxylic acid metabolism | 0.0945 |
| | | | hsa04141 | Protein processing in endoplasmic reticulum | 0.1101 |
| | | | hsa00280 | Valine, leucine and isoleucine degradation | 0.1121 |
| | | | hsa04120 | Ubiquitin mediated proteolysis | 0.1121 |
| | | | hsa00270 | Cysteine and methionine metabolism | 0.1319 |
| | | | hsa00020 | Citrate cycle (TCA cycle) | 0.1527 |
| | | | hsa03050 | Proteasome | 0.1848 |
FIGURE 2 Performance of our signature in independent datasets. (A) RFS curves in the combined data from datasets GSE4922 and GSE2990. (B) RFS curves in the dataset GSE42568. (C) RFS curves in the dataset GSE9195. (D) RFS curves in the combined data from datasets GSE42568 and GSE9195.
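The abstract describes a signature built from the relative expression orderings of two gene pairs (TOP2A, SLC7A5; NMU, PDSS1). The sketch below shows how such a within-sample ordering rule could be applied to one patient's expression profile. This record does not state which ordering indicates poor outcome or how the two pairs are combined, so the vote direction (first gene above second), the `min_votes` threshold, the risk labels, and the example expression values are all hypothetical.

```python
from typing import Mapping, Sequence, Tuple

# Gene pairs named in the abstract.
SIGNATURE_PAIRS: Tuple[Tuple[str, str], ...] = (
    ("TOP2A", "SLC7A5"),
    ("NMU", "PDSS1"),
)


def count_votes(
    expression: Mapping[str, float],
    pairs: Sequence[Tuple[str, str]] = SIGNATURE_PAIRS,
) -> int:
    """Count pairs whose first gene is expressed above the second.

    Only the within-sample ordering is used, so no cross-sample
    normalization is needed. The direction is an assumption.
    """
    return sum(1 for first, second in pairs if expression[first] > expression[second])


def classify(
    expression: Mapping[str, float],
    pairs: Sequence[Tuple[str, str]] = SIGNATURE_PAIRS,
    min_votes: int = 1,
) -> str:
    """Label a sample from its pair votes; min_votes is a hypothetical cutoff."""
    return "high-risk" if count_votes(expression, pairs) >= min_votes else "low-risk"


if __name__ == "__main__":
    # Made-up expression values for illustration only.
    sample = {"TOP2A": 9.1, "SLC7A5": 7.4, "NMU": 3.0, "PDSS1": 5.2}
    print(count_votes(sample), classify(sample))
```

Because the rule depends only on which gene of each pair is higher within a single sample, it can be applied to one profile at a time.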