Qi Wang, Xudong Liu.
Abstract
OBJECTIVE: To screen for feature genes in estrogen receptor-positive (ER+) breast cancer in comparison with estrogen receptor-negative (ER−) breast cancer.
Keywords: biomarker; classification; differentially expressed genes
Year: 2015 PMID: 26347014 PMCID: PMC4556031 DOI: 10.2147/OTT.S85271
Source DB: PubMed Journal: Onco Targets Ther ISSN: 1178-6930 Impact factor: 4.147
Summary of the nine included microarray datasets
| Datasets ID | Number of samples | ER+ | ER− | Average age (years) |
|---|---|---|---|---|
| Training sets | | | | |
| E-GEOD-3494 | 241 | 209 | 32 | 62.004 |
| E-GEOD-24185 | 94 | 56 | 38 | 49.03 |
| E-GEOD-22597 | 82 | 37 | 45 | 51.23 |
| E-GEOD-22093 | 82 | 41 | 41 | 48.51 |
| E-GEOD-45255 | 127 | 82 | 45 | – |
| Total | 626 | 425 | 201 | 52.69 |
| Testing sets | | | | |
| E-GEOD-4922 | 245 | 211 | 34 | 62.12 |
| E-GEOD-32518 | 71 | 40 | 31 | 47.69 |
| E-GEOD-23988 | 61 | 32 | 29 | 48.69 |
| E-GEOD-2034 | 286 | 209 | 77 | – |
| Total | 663 | 492 | 171 | 52.83 |
Abbreviations: ER+, estrogen receptor-positive; ER−, estrogen receptor-negative.
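The two "Total" rows can be re-derived from the per-dataset counts. The following is a minimal sanity-check sketch: the counts are copied from the table above, while the variable and function names (`training`, `testing`, `totals`) are illustrative and not part of the source.

```python
# (samples, ER+, ER-) per dataset, copied from the table above.
training = {
    "E-GEOD-3494": (241, 209, 32),
    "E-GEOD-24185": (94, 56, 38),
    "E-GEOD-22597": (82, 37, 45),
    "E-GEOD-22093": (82, 41, 41),
    "E-GEOD-45255": (127, 82, 45),
}
testing = {
    "E-GEOD-4922": (245, 211, 34),
    "E-GEOD-32518": (71, 40, 31),
    "E-GEOD-23988": (61, 32, 29),
    "E-GEOD-2034": (286, 209, 77),
}


def totals(datasets):
    """Column-wise sums over a group of datasets: (samples, ER+, ER-)."""
    return tuple(sum(column) for column in zip(*datasets.values()))


for name, group in (("training", training), ("testing", testing)):
    n, er_pos, er_neg = totals(group)
    # Every sample is labelled either ER+ or ER-, so the two classes must add up.
    assert er_pos + er_neg == n
    print(f"{name}: n={n}, ER+={er_pos}, ER-={er_neg}, ER+ fraction={er_pos / n:.3f}")
```

The sums reproduce the reported totals of 626 training and 663 testing samples, which together give the 1,289 combined samples.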
Figure 1. Classification of three sample datasets by constructed support vector machine classifier.
Notes: (A) Six hundred and twenty-six samples for training; (B) 663 samples for testing; (C) 1,289 combined samples for testing. (Aa, Ba, and Ca) indicate the sample distribution for ER+ and ER−. (Ab, Bb, and Cb) indicate the scatterplot of the classification, in which black dots represent ER− while red dots represent ER+ breast cancer samples.
Abbreviations: ER+, estrogen receptor-positive; ER−, estrogen receptor-negative.
Performance evaluation of the support vector machine classifier on the training, testing, and combined datasets
| Dataset | Number of samples | Correct rate | Sensitivity | Specificity | PPV | NPV | AUROC |
|---|---|---|---|---|---|---|---|
| Training | 626 | 0.9968 | 0.9976 | 0.9950 | 0.9976 | 0.9950 | 0.999 |
| Testing | 663 | 0.9563 | 0.9736 | 0.9064 | 0.9677 | 0.9226 | 0.816 |
| Combined | 1,289 | 0.9829 | 0.9836 | 0.9812 | 0.9923 | 0.9605 | 0.890 |
Abbreviations: AUROC, area under receiver operating characteristic curve; NPV, negative predictive value; PPV, positive predictive value.
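The rate columns in the table above all follow from a 2×2 confusion matrix. The sketch below shows the standard definitions, treating ER+ as the positive class. The confusion-matrix counts are not reported in the source: they are back-calculated here as the integer counts that reproduce the rounded rates given the ER+/ER− totals, and should be read as an assumption. The `Confusion` class and the `inferred` dictionary are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # ER+ samples classified as ER+
    fn: int  # ER+ samples classified as ER-
    tn: int  # ER- samples classified as ER-
    fp: int  # ER- samples classified as ER+

    def metrics(self) -> dict:
        total = self.tp + self.fn + self.tn + self.fp
        return {
            "correct_rate": (self.tp + self.tn) / total,
            "sensitivity": self.tp / (self.tp + self.fn),
            "specificity": self.tn / (self.tn + self.fp),
            "ppv": self.tp / (self.tp + self.fp),
            "npv": self.tn / (self.tn + self.fn),
        }


# ASSUMPTION: counts inferred from the reported rates, not taken from the source.
inferred = {
    "Training": Confusion(tp=424, fn=1, tn=200, fp=1),
    "Testing": Confusion(tp=479, fn=13, tn=155, fp=16),
    "Combined": Confusion(tp=902, fn=15, tn=365, fp=7),
}

for name, cm in inferred.items():
    print(name, {key: round(value, 4) for key, value in cm.metrics().items()})
```

With these inferred counts, each row's correct rate, sensitivity, specificity, PPV, and NPV round to the tabulated values. AUROC is not covered here because it depends on the classifier's continuous scores rather than on a single confusion matrix.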
Figure 2. Receiver operating characteristic curve used for training, testing, and combined datasets by support vector machine classifier.
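For readers unfamiliar with the area under a receiver operating characteristic curve, the sketch below computes it as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, which is equivalent to the area under the curve. The scores and labels are made-up toy values, not data from the study, and the `auroc` function is an illustrative helper rather than the procedure the authors used.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison: fraction of (positive, negative)
    pairs in which the positive sample scores higher; ties count as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores; label 1 stands for ER+, 0 for ER-.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_labels = [1, 1, 1, 1, 0, 0, 0]
print(f"toy AUROC = {auroc(toy_scores, toy_labels):.4f}")
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable for large datasets.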
Significantly enriched biological functions of feature genes
| Term | Count | P-value |
|---|---|---|
| GO:0010035, response to inorganic substance | 24 | 1.28×10⁻⁷ |
| GO:0007267, cell–cell signaling | 43 | 1.02×10⁻⁶ |
| GO:0009611, response to wounding | 37 | 1.2×10⁻⁵ |
| GO:0006954, inflammatory response | 27 | 1.29×10⁻⁵ |
| GO:0010038, response to metal ion | 16 | 1.3×10⁻⁵ |
| GO:0007610, behavior | 34 | 1.38×10⁻⁵ |
| GO:0010033, response to organic substance | 45 | 1.94×10⁻⁵ |
| GO:0010817, regulation of hormone levels | 17 | 2.2×10⁻⁵ |
Abbreviation: GO, Gene Ontology.
Significantly enriched pathways of feature genes
| ID | Pathway | P-value |
|---|---|---|
| hsa04060 | Cytokine–cytokine receptor interaction | 2.03×10⁻² |
| hsa00982 | Drug metabolism | 2.05×10⁻² |
| hsa04912 | GnRH signaling pathway | 2.07×10⁻² |
| hsa00232 | Caffeine metabolism | 2.28×10⁻² |
| hsa04062 | Chemokine signaling pathway | 3.00×10⁻² |
| hsa04914 | Progesterone-mediated oocyte maturation | 3.01×10⁻² |
| hsa04110 | Cell cycle | 3.03×10⁻² |
| hsa00380 | Tryptophan metabolism | 4.98×10⁻² |