Shujun Huang1,2, Leigh Murphy1,3, Wayne Xu4,5,6.
Abstract
BACKGROUND: Breast cancer is a heterogeneous disease, and personalized medicine offers hope for improving clinical outcomes. Multi-gene signatures for breast cancer stratification have been studied extensively over the past decades, and more than 30 different signatures have been reported. A major concern is the minimal overlap of genes among the reported signatures. We investigated the breast cancer signature genes to test our hypothesis that the genes of different signatures may share common functions, and to use these previously reported signature genes to build better prognostic models.
Keywords: Breast cancer; Common function; Common gene; Multi-gene signature
Year: 2018 PMID: 29699511 PMCID: PMC5921990 DOI: 10.1186/s12885-018-4388-4
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Fig. 1 Project schema. Three components: common genes and common functions analyses, common gene classifier test, and YMR model test using common genes
Published signatures included in the analysis
| Full name | Short name (a) | gene # | Extract # | Cov % (b) | Used for (c) | Platform (d) | Purpose (e): prognosis | Purpose (e): prediction | Purpose (e): subtyping |
|---|---|---|---|---|---|---|---|---|---|
| 70-gene signature [ | Mamma | 70 | 66 | 100 | ER+ | Agilent Hu25K | yes | yes | no |
| 21-gene signature [ | RS | 16 | 16 | 100 | ER+ | RT-PCR | yes | yes | no |
| 97-gene genomic grade index [ | GGI97 | 97 | 94 | 100 | ER+ | Affymetrix U133A | yes | yes | no |
| 8-gene genomic grade index [ | GGI8 | 4 | 4 | 100 | ER+ | qRT-PCR | yes | yes | no |
| EndoPredict assay [ | Endo | 8 | 8 | 100 | ER+ | qRT-PCR | yes | yes | no |
| Breast cancer index [ | BCI | 7 | 7 | 100 | ER+ | qRT-PCR | yes | yes | no |
| HOXB13:IL17BR ratio [ | HI | 2 | 2 | 100 | ER+ | Agilent Arcturus 22 k | yes | yes | no |
| IHC4 Score [ | IHC4 | 4 | 4 | 100 | ER+ | IHC | yes | yes | no |
| 14-gene metastasis score [ | MS14 | 14 | 14 | 100 | ER+ | RT-PCR | yes | no | no |
| 8-gene score [ | SMS | 8 | 8 | 100 | ER+ | qRT-PCR | yes | no | no |
| PAM50 assay [ | PAM50 | 50 | 50 | 100 | BC | Agilent human 1Av2 | yes | yes | yes |
| 76-gene signature [ | Wang | 76 | 71 | 100 | BC | Affymetrix U133A | yes | yes | no |
| 64-gene expression signature [ | Pawitan | 64 | 48 | 75 | BC | Affymetrix U133 | yes | yes | no |
| 32-gene p53 status signature [ | p53 | 32 | 21 | 100 | BC | Affymetrix U133 A or B | yes | yes | yes |
| Cell cycle pathway signature [ | CCPS | 26 | 26 | 100 | BC | Affymetrix U95 Av2 or U133 | yes | no | no |
| 127-gene classifier [ | Robust | 127 | 127 | 100 | BC | Affymetrix microarray | yes | no | no |
| 26-gene stroma-derived prognostic predictor [ | SDPP | 26 | 26 | 100 | BC | Agilent 44 k | yes | no | no |
| 54-gene lung metastasis signature [ | LM | 54 | 54 | 100 | BC | Affymetrix U133A | yes | no | no |
| 186-invasiveness gene signature [ | IGS | 186 | 186 | 100 | BC | Affymetrix U133 A or B | yes | no | no |
| 92-gene predictor [ | Chang | 92 | 86 | 100 | BC | Affymetrix U95 Av2 | no | yes | no |
| 85-gene signature [ | Iwao | 85 | 73 | 100 | BC | ATAC-PCR | no | yes | no |
| 512-gene signature [ | Olaf | 512 | 355 | 69 | BC | Human Oligo set 2.1 | no | yes | no |
| 7-gene immune response module [ | IR7 | 7 | 7 | 100 | ER- | Agilent and Affymetrix microarray | yes | no | yes |
| T-cell metagene [ | Tcell | 50 | 50 | 100 | ER- | Affymetrix U133A | yes | no | no |
| Multigene HRneg/Tneg signature [ | Multigene | 14 | 14 | 100 | TNBC | Affymetrix U133A | yes | no | no |
| 26-gene signature [ | Novel1 | 26 | 20 | 100 | TNBC | Affymetrix U133 | yes | yes | no |
| 264-gene signature [ | Novel2 | 264 | 225 | 100 | TNBC | Affymetrix U133 | yes | no | no |
| B-cell:IL8 ratio [ | Bcell | 22 | 20 | 100 | TNBC | Affymetrix U133A or U133 Plus 2.0 | yes | no | no |
| MAGE-A [ | MAGEA | 2 | 2 | 100 | TNBC | Affymetrix U133 | yes | no | no |
| 368-gene medullary breast cancer like signature [ | MBC | 368 | 368 | 100 | TNBC | Affymetrix or Agilent array | yes | no | yes |
| 158-gene HER2-derived prognostic predictor [ | HDPP | 158 | 158 | 100 | HER2+ | SWEGENE H_v2.1.1 55 K | yes | no | yes |
| GCNs of MET and HGF [ | GCN | 2 | 2 | 100 | HER2+ | Fluorescence in situ hybridisation (FISH) | no | yes | no |
| 28-gene expression profile [ | Vegran | 28 | 27 | 100 | HER2+ | Affymetrix microarray | no | yes | no |
(a) The short name used in this paper for each signature's full name
(b) For some signatures with 100% coverage (all signature genes were found in the data set), the number of extracted genes (Extract # column) is lower than the reported number of genes (gene # column) because some genes appear more than once within a signature under different probe names
(c) The subtype for which each signature was developed: ER+, ER-positive breast cancer; ER-, ER-negative breast cancer; BC (uc-BC), unclassified breast cancer with mixed subtypes; TNBC, TNBC or basal breast cancer; HER2+, HER2-positive breast cancer
(d) The experimental platform used to develop the signature
(e) The clinical purpose of the signature: prognosis, the signature can be used for prognosis; prediction, the signature can be used to predict the response to treatment or drug; subtyping, the signature can be used for further subtyping of breast cancers
Fig. 2 The common genes or function terms from multiple breast cancer signatures. a The top 30 common genes shared by different signatures. The official gene symbols were used for the overlap analysis. b The top 30 common function terms among the 33 signatures. The GO biological process terms and KEGG pathways were analysed for each signature gene list using DAVID software. The signatures were grouped by the tumour subtypes they were derived from or applied to
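The overlap count behind panel a can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `common_genes`, the toy gene lists, and the upper-casing of symbols are assumptions made for the example, and the toy symbols are placeholders, not real signature contents.

```python
from collections import Counter


def common_genes(signatures, min_signatures=2):
    """Map each gene to the number of signatures containing it,
    keeping only genes found in at least `min_signatures` lists."""
    counts = Counter()
    for genes in signatures.values():
        # A set ensures a gene listed twice in one signature (for example
        # under two probe names) is counted once for that signature.
        # Upper-casing the symbols is an assumption, not taken from the paper.
        counts.update({g.upper() for g in genes})
    return {g: n for g, n in counts.items() if n >= min_signatures}


# Hypothetical toy lists; NOT the published signature gene lists.
toy = {
    "SigA": ["GENE1", "GENE2", "GENE3", "GENE3"],
    "SigB": ["GENE2", "GENE3", "GENE4"],
    "SigC": ["GENE3", "GENE5"],
}

shared = common_genes(toy, min_signatures=2)
# Rank by number of signatures (descending), then alphabetically.
top = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))
```

Raising `min_signatures` to 3 would correspond to the stricter "shared by at least three signatures" gene set used in Fig. 4, while 2 corresponds to the set used in Fig. 5.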
Fig. 3 Heat map of the number of common genes or function terms between signatures. a The common gene matrix (33 signatures by 33 signatures) is shown by colour code: red, the two signatures share at least three common genes; grey, the two signatures share one or two common genes; green, the two signatures share no common genes. b The common function term matrix (29 signatures by 29 signatures) is shown by colour code. Four of the 33 signatures had no significantly enriched function terms. Red, the two signatures share at least three common terms; grey, the two signatures share one or two common terms; green, the two signatures share no common terms
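The colour coding in this heat map amounts to binning pairwise overlap counts. The sketch below assumes each signature is available as a plain list of gene symbols (or function terms); the names `overlap_colour` and `pairwise_overlap` and the toy data are hypothetical, and only the three thresholds come from the caption.

```python
from itertools import combinations


def overlap_colour(n_shared):
    """Bin an overlap count using the thresholds stated in the caption."""
    if n_shared >= 3:
        return "red"
    if n_shared >= 1:
        return "grey"
    return "green"


def pairwise_overlap(signatures):
    """Return {(sig_a, sig_b): (n_shared, colour)} for every unordered pair."""
    sets = {name: set(items) for name, items in signatures.items()}
    matrix = {}
    for a, b in combinations(sorted(sets), 2):
        n_shared = len(sets[a] & sets[b])
        matrix[(a, b)] = (n_shared, overlap_colour(n_shared))
    return matrix


# Hypothetical toy lists; NOT the published signature gene lists.
toy_signatures = {
    "SigA": ["G1", "G2", "G3", "G4"],
    "SigB": ["G2", "G3", "G4", "G5"],
    "SigC": ["G4", "G9"],
    "SigD": ["G7"],
}

overlap = pairwise_overlap(toy_signatures)
```

The same function applies to panel b if each list holds enriched GO or KEGG terms instead of gene symbols.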
Fig. 4 Evaluation of the value of the common genes in building a breast cancer subtyping classifier. a In total, 1141 METABRIC samples, including 144 normal breast tissue samples, were clustered by the 62 common genes that are shared by at least three signatures, using two-dimensional clustering with Euclidean distance and complete linkage. The clusters were selected at level 3 or 4 branches. Nine clusters were selected (C1 to C9), including the normal sample cluster C1. The PAM50 subtypes are indicated under the nine cluster colour bars. b Kaplan-Meier survival analysis for the nine subgroups. The disease-specific survival (DSS) time was used as the outcome endpoint
Fig. 5 Evaluation of the Yin Yang gene expression ratio (YMR) model using the common genes. a Defining the Yin and Yang genes from the 220 common genes using data from the 1141 METABRIC samples. The rows are the 220 genes that were shared by at least two signatures and the columns are the 1141 samples. The genes that showed consistently higher expression in normal breast tissue samples (in blue) and relatively consistently lower expression in various tumour subtypes were selected as Yang genes (red). The genes that showed consistently lower expression in normal samples and higher expression in tumour samples were selected as Yin genes (green). The PAM50 subtypes are indicated by colour bars. b The YMR-all signature model developed from the selected Yin and Yang genes was tested by Kaplan-Meier survival analysis using two data sets
Fig. 6 Signature comparison in ER+/node-negative patients. The comparison was conducted with the Bioconductor package genefu using patient samples from five Bioconductor data sets (see Materials and Methods). A total of 541 ER+/node-negative patients who did not receive adjuvant treatment were stratified by the median score of each signature, and significance was assessed by the log-rank test of the Kaplan-Meier analysis using the recurrence-free survival (RFS) rate
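The median stratification step described in this caption can be illustrated with a short sketch. It covers only the split into two risk groups; the log-rank test itself is not implemented here. The function name `stratify_by_median`, the patient identifiers, the scores, and the rule that a score equal to the median falls in the "low" group are all assumptions for illustration, since the caption does not specify tie handling.

```python
from statistics import median


def stratify_by_median(scores):
    """Split patients into 'high' and 'low' groups around the median score.

    Assumption: a score exactly equal to the median is assigned to 'low'.
    """
    cutoff = median(scores.values())
    groups = {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }
    return cutoff, groups


# Hypothetical signature scores for four patients; not real data.
toy_scores = {"P1": 0.1, "P2": 0.4, "P3": 0.7, "P4": 0.9}

cutoff, groups = stratify_by_median(toy_scores)
```

The resulting group labels, together with each patient's follow-up time and event status, would then be passed to a survival analysis routine to compare the two Kaplan-Meier curves.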