Xi Chen, Jiang Li, William H Gray, Brian D Lehmann, Joshua A Bauer, Yu Shyr, Jennifer A Pietenpol.
Abstract
MOTIVATION: Triple-negative breast cancer (TNBC) is a heterogeneous group of breast cancers, and identifying its molecular subtypes is essential for understanding the biological characteristics and clinical behavior of TNBC and for developing personalized treatments. Using 3,247 gene expression profiles from 21 breast cancer data sets, we identified six TNBC subtypes among 587 TNBC samples, each with unique gene expression patterns and ontologies. Cell line models representing each TNBC subtype also displayed different sensitivities to targeted therapeutic agents. Classification of TNBC into subtypes will advance further genomic research and clinical applications.
RESULT: We developed TNBCtype, a web-based subtyping tool for candidate TNBC samples, using our gene expression meta data and classification methods. Given a gene expression data matrix, the tool displays for each candidate sample the predicted subtype, the corresponding correlation coefficient, and the permutation P-value. We offer a user-friendly web interface for predicting the subtypes of new TNBC samples, which may facilitate diagnostics, biomarker selection, drug discovery, and more tailored treatment of breast cancer.
Keywords: classification; gene expression microarray; meta-analysis; subtypes; triple-negative breast cancer
Year: 2012 PMID: 22872785 PMCID: PMC3412597 DOI: 10.4137/CIN.S9983
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Workflow for developing the TNBC subtype gene signature.
Notes: After the breast cancer gene expression data were collected, three major procedures were performed to develop the TNBC gene signature: first, TNBC identification by bimodal filtering on ER, PR and HER2 expression; second, clustering analysis to develop the TNBC subtypes; and finally, validation of the TNBC subtypes and gene signature.
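The bimodal filtering step can be illustrated with a small sketch. The record does not say how the bimodal distributions were modeled, so the code below swaps in a simple one-dimensional two-means split as a stand-in, and it assumes that ER, PR and HER2 are measured by the genes ESR1, PGR and ERBB2. The function names and the toy values are hypothetical.

```python
def two_means_threshold(values, n_iter=100):
    """Split 1-D values into a low and a high group and return the cut point.

    A plain two-means split, used here as a stand-in for whatever
    bimodal model the authors actually fitted (assumption).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("values are constant; no bimodal split possible")
    for _ in range(n_iter):
        cut = (lo + hi) / 2
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # group means stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def identify_tnbc(expr, markers=("ESR1", "PGR", "ERBB2")):
    """Return the samples that fall in the low group for every marker gene.

    expr: dict mapping gene symbol -> dict mapping sample id -> expression.
    """
    tnbc = set(expr[markers[0]])
    for gene in markers:
        by_sample = expr[gene]
        cut = two_means_threshold(list(by_sample.values()))
        tnbc &= {s for s, v in by_sample.items() if v <= cut}
    return sorted(tnbc)


# Toy expression values (hypothetical, log scale).
toy_expr = {
    "ESR1": {"s1": 2.0, "s2": 2.2, "s3": 9.8, "s4": 10.1, "s5": 1.9},
    "PGR": {"s1": 1.5, "s2": 1.7, "s3": 8.0, "s4": 1.6, "s5": 1.4},
    "ERBB2": {"s1": 3.0, "s2": 11.0, "s3": 3.2, "s4": 3.1, "s5": 2.9},
}
print(identify_tnbc(toy_expr))
```

In this toy cohort only the samples that are low for all three markers are kept as TNBC candidates; a sample that is high for any single marker is filtered out.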
Figure 2. ER-positive samples dramatically affect TNBC subtype prediction results. (A) Prediction results for TNBC samples normalized without any ER-positive samples; (B) Prediction results for the same TNBC samples normalized in the presence of ER-positive samples.
Figure 3. ER gene expression for TNBC and ER-positive samples.
Note: Boxplot shows the ER gene expression percentile among all genes within a given sample.
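The within-sample percentile described in this note can be computed as sketched below. This is a minimal illustration, assuming the ER gene is reported under the symbol ESR1 and that the percentile is the share of genes with expression at or below the ER value; the function name and the toy values are hypothetical.

```python
def expression_percentile(sample, gene):
    """Percentile (0 to 100) of `gene` among all genes in one sample.

    sample: dict mapping gene symbol -> expression value.
    Defined here as the share of genes whose expression is at or
    below that of `gene` (one of several possible conventions).
    """
    value = sample[gene]
    at_or_below = sum(1 for v in sample.values() if v <= value)
    return 100.0 * at_or_below / len(sample)


# Toy samples (hypothetical values): ER is low in one, high in the other.
tnbc_like = {"ESR1": 2.0, "GENE_A": 12.0, "GENE_B": 11.5, "GENE_C": 9.0}
er_positive_like = {"ESR1": 12.5, "GENE_A": 12.0, "GENE_B": 11.5, "GENE_C": 9.0}

print(expression_percentile(tnbc_like, "ESR1"))
print(expression_percentile(er_positive_like, "ESR1"))
```

Because the percentile is computed inside each sample, it does not depend on which other samples are in the cohort.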
Figure 4. Snapshot of TNBC prediction outcome.
Notes: For illustration, a cohort of 26 publicly available TNBC samples was tested with TNBCtype. Six colors were selected to represent the six TNBC subtypes. The table on the left shows the predicted subtype assigned to each sample, the correlation with the corresponding subtype centroid, and the P-value from 1,000 permutations. The color bars on the right show the same information as the table. The height of each bar indicates the magnitude of the correlation coefficient.
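The three values reported per sample (predicted subtype, correlation with the subtype centroid, permutation P-value) can be sketched as follows. This is an illustration and not the TNBCtype implementation: the record does not state the correlation measure or the permutation scheme, so the sketch assumes Pearson correlation and a null distribution built by shuffling the sample's values across genes. The subtype labels and centroid values are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def predict_subtype(sample, centroids, n_perm=1000, seed=0):
    """Return (subtype, correlation, permutation p-value) for one sample.

    sample: expression values over the signature genes.
    centroids: dict mapping subtype label -> centroid values, same gene order.
    """
    rng = random.Random(seed)
    corrs = {name: pearson(sample, c) for name, c in centroids.items()}
    best = max(corrs, key=corrs.get)
    observed = corrs[best]

    # Assumed null model: shuffle the sample across genes and record
    # the best correlation it reaches against any centroid.
    shuffled = list(sample)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_best = max(pearson(shuffled, c) for c in centroids.values())
        if null_best >= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0
    return best, observed, p_value


# Toy centroids over six signature genes (hypothetical labels and values).
toy_centroids = {
    "subtype_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "subtype_B": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
}
toy_sample = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]

label, corr, p = predict_subtype(toy_sample, toy_centroids)
print(label, round(corr, 3), p)
```

With 1,000 permutations and the add-one correction, the smallest P-value this sketch can report is 1/1001.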
Public gene expression data sets used to develop the TNBC gene signature.
| Data set | Source | Country | Number of samples | Usage |
| GSE5327 | GEO | Sweden | 251 | Training set |
| GSE7904 | GEO | USA | 43 | Training set |
| GSE2109 | GEO | USA | 351 | Training set |
| GSE7390 | GEO | Europe | 198 | Training set |
| E-TABM-158 | ArrayExpress | USA | 100 | Training set |
| GSE2034 | GEO | Netherlands | 286 | Training set |
| GSE2990 | GEO | Sweden | 189 | Training set |
| GSE1456 | GEO | Sweden | 159 | Training set |
| GSE22513, GSE28821, GSE28796 | GEO | USA | 112 | Training set |
| GSE11121 | GEO | Germany | 200 | Training set |
| GSE2603 | GEO | USA | 99 | Training set |
| MDA133 | MD Anderson Cancer Center | USA | 133 | Training set |
| GSE5364 | GEO | Singapore | 183 | Training set |
| GSE1561 | GEO | Belgium | 49 | Training set |
| GSE5327 | GEO | Netherlands | 58 | Validation set |
| GSE5847 | GEO | USA | 96 | Validation set |
| GSE12276 | GEO | Netherlands | 204 | Validation set |
| GSE16446 | GEO | Europe | 120 | Validation set |
| GSE18864 | GEO | USA | 24 | Validation set |
| GSE19615 | GEO | USA | 115 | Validation set |
| GSE20194 | GEO | USA | 278 | Validation set |