David D Smith, Pål Saetrom, Ola Snøve, Cathryn Lundberg, Guillermo E Rivas, Carlotta Glackin, Garrett P Larson.
Abstract
BACKGROUND: Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays.
Year: 2008 PMID: 18226260 PMCID: PMC2275244 DOI: 10.1186/1471-2105-9-63
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Breast Cancer Gene Expression Datasets used in Meta-Analysis
| Author | Journal [Ref] | Platform, N probes | N ER+ | N ER- | Clinical parameters |
| Wang, Y. et al. | Lancet [73] | Affy, 22283 | 209 | 77 | DFS 5 Yr |
| Zhao, H. et al. | Mol Biol Cell [80] | cDNA, 27276 | 24 | 11 | PR Status, Grade, HER2, LN Status |
| Sotiriou, C. et al. | PNAS [81] | cDNA, 7549 | 65 | 34 | LN Status, Chemo/Radio/Horm Tx, 5 Yr OS |
| Ma, X. et al. | PNAS [82] | cDNA, 1940 | 18 | 5 | PR, Grade, HER2, Histology |
| Van de Vijver, M. et al. | NEJM [83] | cDNA, 23130 | 226 | 69 | DFS 5 Yr, LN Status, T/M Stage |
| Gruvberger, S. et al. | Ca Res [84] | cDNA, 3369 | 28 | 30 | |
| Sorlie, T. et al. | PNAS [2] | cDNA, 7937 | 56 | 18 | DFS 5 Yr, LN Status, M Stage |
| West, M. et al. | PNAS [85] | Affy, 6718 | 25 | 24 | |
| Perou, C. et al. | PNAS [1] | cDNA, 8838 | 26 | 9 | Before/After Chemo, Histology, Grade |
DFS, disease-free survival; PR, progesterone receptor status; LN, lymph node status; OS, overall survival
Figure 1. Plot of S over N versus S over standard deviation for all genes across all studies in the meta-analysis. Some genes present in ER+ overexpressed tumors (ESR1 and GATA3) and ER- overexpressed tumors (LAD1 and NFIB) are indicated. The top 5% of genes also include the top 1% of genes.
Ingenuity functional roles among the top 1% ER+ and ER- upregulated genes.
| Function | N Genes | Ingenuity p-value |
| Small Molecule Biochemistry | 16 | 7.71E-07 |
| Molecular Transport | 10 | 2.03E-04 |
| Nervous System Development and Function | 10 | 5.89E-03 |
| Lipid Metabolism | 9 | 7.71E-07 |
| Cancer | 7 | 5.89E-03 |
| Cancer | 26 | 4.08E-04 |
| Cellular Growth and Proliferation | 23 | 3.34E-07 |
| Cell Death | 22 | 9.14E-08 |
| Tissue Morphology | 19 | 3.60E-05 |
| Hematological System Development and Function | 18 | 6.73E-06 |
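Top Scoring Promoter TFBS Motifs Identified in Coding and Non-coding Strands. Top 1% and 5% Gene Sets.
| Rank | Motif | TF | S- with motif | S- without motif | S+ with motif | S+ without motif | P-value | Corrected P-value |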
| 1 | CTTTGA | LEF1 | 83 | 55 | 64 | 83 | 0.0063 | 0.7784 |
| 2 | TATAAATW | TBP | 32 | 106 | 17 | 130 | 0.0117 | 1 |
| 3 | MGGAWGT | PEA3 | 72 | 66 | 55 | 92 | 0.0128 | 1 |
| 4 | TnGCGTG | AHR | 39 | 99 | 23 | 124 | 0.0142 | 1 |
| 5 | WADTAAWTA | NKX6-2 | 53 | 85 | 36 | 111 | 0.0149 | 1 |
| 1 | ABWCAGGTRnR | AREB6 | 13 | 125 | 2 | 145 | 0.0026 | 0.3205 |
| 2 | KTWGTTT | SRY | 91 | 47 | 71 | 76 | 0.0029 | 0.3487 |
| 3 | ATTGTT | SOX-5 | 77 | 61 | 56 | 91 | 0.0030 | 0.3683 |
| 4 | GCGCSAAA | E2F | 0 | 138 | 8 | 139 | 0.0073 | 0.8725 |
| 5 | RnCAGGTG | MYOD | 68 | 70 | 50 | 97 | 0.0114 | 1 |
| 1 | ABWCAGGTRnR | AREB6 | 25 | 113 | 6 | 141 | 0.0002 | 0.0240 |
| 2 | CTTTGA | LEF1 | 108 | 30 | 96 | 51 | 0.0180 | 1 |
| 3 | KTWGTTT | SRY | 120 | 18 | 110 | 37 | 0.0107 | 1 |
| 4 | ATTGTT | SOX-5 | 100 | 38 | 93 | 54 | 0.1014 | 1 |
| 5 | RWAAACAA | FOXO1 | 96 | 42 | 83 | 64 | 0.0272 | 1 |
| 1 | SCACGTG | MYC | 141 | 594 | 190 | 576 | 0.0089 | 1 |
| 2 | GnCnGTT | MYB | 590 | 145 | 579 | 187 | 0.0296 | 1 |
| 3 | RTGACTCAGCA | NF-E2 | 0 | 735 | 6 | 760 | 0.0311 | 1 |
| 4 | MGGAWGT | PEA3 | 358 | 377 | 332 | 434 | 0.0384 | 1 |
| 5 | KTWGTTT | SRY | 473 | 262 | 454 | 312 | 0.0437 | 1 |
| 1 | KTWGTTT | SRY | 473 | 262 | 423 | 343 | 0.0003 | 0.0422 |
| 2 | GCGCSAAA | E2F | 13 | 722 | 31 | 735 | 0.0092 | 1 |
| 3 | CYAATTWT | HOXA4 | 306 | 429 | 271 | 495 | 0.0146 | 1 |
| 4 | TYAAGTG | NKX2-5 | 276 | 459 | 242 | 524 | 0.0168 | 1 |
| 5 | GCCATnTT | YY1 | 168 | 567 | 137 | 629 | 0.0176 | 1 |
| 1 | KTWGTTT | SRY | 613 | 122 | 606 | 160 | 0.0346 | 1 |
| 2 | WGATAR | GATA | 691 | 44 | 698 | 68 | 0.0388 | 1 |
| 3 | TYAAGTG | NKX2-5 | 450 | 285 | 420 | 346 | 0.0139 | 1 |
| 4 | CCGGAART | ELK-1 | 97 | 638 | 69 | 697 | 0.0106 | 1 |
| 5 | GTTRCYWnGYnAC | RFX1 | 12 | 723 | 4 | 762 | 0.0441 | 1 |
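The four count columns in the table above can be read as 2x2 contingency tables: genes with and without a motif in the S- set versus the S+ set (the row sums 138 and 147 match the "RefSeq unique" counts for the top 1% sets). The following is a minimal sketch of how such a row could be tested, assuming a two-sided Fisher exact test and a Bonferroni-style correction. The table does not state which test was used or how many tests were corrected for, so the test choice and the `n_tests` value are assumptions, and the output is not expected to reproduce the published values exactly.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: genes with / without the motif in the first set (e.g. S-)
    c, d: genes with / without the motif in the second set (e.g. S+)
    """
    row1, row2 = a + b, c + d
    col1 = a + c  # total genes carrying the motif
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x motif-carrying genes in the first set
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni correction, capped at 1."""
    return min(1.0, p * n_tests)


# Counts from the LEF1 (CTTTGA) row: 83 / 55 in S-, 64 / 83 in S+.
# n_tests=100 is a hypothetical placeholder, not a value from the paper.
p = fisher_exact_two_sided(83, 55, 64, 83)
print(f"raw p = {p:.4f}, corrected p = {bonferroni(p, n_tests=100):.4f}")
```

The cap at 1 in `bonferroni` mirrors the many corrected values of exactly 1 in the table, which is what a multiplicative correction produces once the product exceeds 1.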
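Top Scoring Promoter Phylogenetic Motifs identified in Coding and Non-coding Strands. Top 1% and 5% Gene Sets.
| Rank | Motif | TF | Position* | S- with motif | S- without motif | S+ with motif | S+ without motif | P-value | Corrected P-value |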
| 1 | TCAnnTGAY | SREBP-1 | -64 | 68 | 70 | 44 | 103 | 0.0010 | 0.1794 |
| 2 | TAATTA | CHX10 | - | 70 | 68 | 46 | 101 | 0.0011 | 0.1895 |
| 3 | RnTCAnnRnnYnATTW | - | - | 15 | 123 | 3 | 144 | 0.0027 | 0.4563 |
| 4 | CATTGTYY | SOX-9 | - | 30 | 108 | 13 | 134 | 0.0027 | 0.4690 |
| 5 | CTTTGA | LEF1 | - | 83 | 55 | 64 | 83 | 0.0063 | 1 |
| 1 | CAGnYGKnAAA | - | - | 19 | 119 | 3 | 144 | 0.0002 | 0.0373 |
| 2 | TAATTA | CHX10 | - | 70 | 68 | 46 | 101 | 0.0011 | 0.1895 |
| 3 | TTAnWnAnTGGM | - | - | 14 | 124 | 2 | 145 | 0.0014 | 0.2349 |
| 4 | TTGTTT | FOXO4 | - | 98 | 40 | 79 | 68 | 0.0033 | 0.5672 |
| 5 | YYCATTCAWW | POU1F1(*) | - | 21 | 117 | 7 | 140 | 0.0046 | 0.7760 |
| 1 | TAATTA | CHX10 | - | 70 | 68 | 46 | 101 | 0.0011 | 0.1906 |
| 2 | CAGnYGKnAAA | - | - | 26 | 112 | 9 | 138 | 0.0011 | 0.1918 |
| 3 | TAAWWATAG | RSRFC4 | - | 31 | 107 | 14 | 133 | 0.0033 | 0.5631 |
| 4 | TTGTTT | FOXO4 | - | 121 | 17 | 116 | 31 | 0.0574 | 1 |
| 5 | CATTGTYY | SOX-9 | - | 45 | 93 | 27 | 120 | 0.0064 | 1 |
| 1 | CTTTAAR | - | - | 329 | 406 | 282 | 484 | 0.0019 | 0.3351 |
| 2 | YKACATTT | - | - | 174 | 561 | 133 | 633 | 0.0026 | 0.4526 |
| 3 | TAATTA | CHX10 | - | 341 | 394 | 301 | 465 | 0.0057 | 0.9801 |
| 4 | YATTnATC | CDP(*) | - | 183 | 552 | 147 | 619 | 0.0088 | 1 |
| 5 | TTGCWCAAY | C/EBPBETA | - | 15 | 720 | 34 | 732 | 0.0090 | 1 |
| 1 | WTGAAAT | - | - | 307 | 428 | 263 | 503 | 0.0034 | 0.5955 |
| 2 | TTGTTT | FOXO4 | - | 491 | 244 | 456 | 310 | 0.0039 | 0.6670 |
| 3 | TAATTA | CHX10 | - | 341 | 394 | 301 | 465 | 0.0057 | 0.9801 |
| 4 | YCATTAA | IPF1(*) | - | 204 | 531 | 166 | 600 | 0.0070 | 1 |
| 5 | TTAYRTAA | E4BP4 | - | 129 | 606 | 97 | 669 | 0.0093 | 1 |
| 1 | TGCCAAR | NF-1 | - | 442 | 293 | 394 | 372 | 0.0007 | 0.1271 |
| 2 | YCATTAA | IPF1(*) | - | 326 | 409 | 282 | 484 | 0.0032 | 0.5575 |
| 3 | YATGnWAAT | OCT-X | - | 250 | 485 | 209 | 557 | 0.0051 | 0.8729 |
| 4 | TAATTA | CHX10 | - | 341 | 394 | 301 | 465 | 0.0057 | 0.9744 |
| 5 | AACYnnnnTTCCS | - | -53 | 76 | 659 | 49 | 717 | 0.0066 | 1 |
*Center of motif relative to transcriptional start site
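The motifs in the promoter tables are written with IUPAC degenerate nucleotide codes (for example Y for C or T, R for A or G, W for A or T, and n for any base). As a rough illustration of how such a motif could be located on either strand of a promoter sequence, here is a small sketch that expands a motif into a regular expression. The function names and the example sequence are hypothetical, and this is not the search procedure used in the paper.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif: str) -> re.Pattern:
    """Expand an IUPAC motif (upper or lower case) into a compiled regex."""
    return re.compile("".join(IUPAC[ch] for ch in motif.upper()))


def reverse_complement(seq: str) -> str:
    """Reverse complement of a plain A/C/G/T DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_motif(seq: str, motif: str, both_strands: bool = True) -> bool:
    """True if the motif occurs in seq, optionally also checking the other strand."""
    pattern = motif_to_regex(motif)
    seq = seq.upper()
    if pattern.search(seq):
        return True
    return both_strands and pattern.search(reverse_complement(seq)) is not None


# Hypothetical sequence, used only to exercise the functions
example = "GGTACTGCCATT"
print(has_motif(example, "YACTGCCR", both_strands=False))
```

Counting the genes for which `has_motif` is true in each gene set would give the "with motif" and "without motif" counts that feed a 2x2 comparison.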
Figure 2. Box plot of 3'UTR length differences. Summary of ER+ upregulated genes ("Top1 u" and "Top5 u") and ER- upregulated genes ("Top1 d" and "Top5 d"). 3'UTR lengths were derived from RefSeq gene conversion as shown in Table 7.
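Top Scoring Phylogenetic 3'UTR Motifs Identified in 3'UTRs. Top 1% and 5% Gene Sets.
| Rank | Motif | S- with motif | S- without motif | S+ with motif | S+ without motif | P-value | Corrected P-value |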
| 1 | YACTGCCR | 17 | 113 | 2 | 143 | 0.0002 | 0.0423 |
| 2 | YYGCATGT | 10 | 120 | 1 | 144 | 0.0037 | 1.0000 |
| 3 | TGTANANAGA | 12 | 118 | 3 | 142 | 0.0142 | 1.0000 |
| 4 | TGCMNTAA | 26 | 104 | 14 | 131 | 0.0168 | 1.0000 |
| 5 | TGTGAA | 51 | 79 | 37 | 108 | 0.0196 | 1.0000 |
| 6 | TGTANNNTAG | 13 | 117 | 4 | 141 | 0.0214 | 1.0000 |
| 7 | TTTCTRNNAAA | 2 | 128 | 11 | 134 | 0.0219 | 1.0000 |
| 8 | AAGCACA | 19 | 111 | 9 | 136 | 0.0273 | 1.0000 |
| 9 | CTAKWTTT | 23 | 107 | 12 | 133 | 0.0286 | 1.0000 |
| 10 | TTTCTA | 52 | 78 | 41 | 104 | 0.0423 | 1.0000 |
| 1 | WGCCTTA | 134 | 562 | 80 | 652 | < 0.0001 | 0.0031 |
| 2 | CTAKWTTT | 149 | 547 | 93 | 639 | < 0.0001 | 0.0032 |
| 3 | TGTGAA | 305 | 391 | 238 | 494 | < 0.0001 | 0.0034 |
| 4 | TATATTT | 210 | 486 | 149 | 583 | < 0.0001 | 0.0066 |
| 5 | TGTANNNTAG | 73 | 623 | 36 | 696 | 0.0001 | 0.0235 |
| 6 | TGTRNNNWATT | 148 | 548 | 101 | 631 | 0.0002 | 0.0571 |
| 7 | WRCCAAAA | 113 | 583 | 71 | 661 | 0.0003 | 0.0706 |
| 8 | TGTATANW | 218 | 478 | 168 | 564 | 0.0004 | 0.1144 |
| 9 | CTGTATWW | 134 | 562 | 91 | 641 | 0.0005 | 0.1248 |
| 10 | TGTRNTTT | 310 | 386 | 261 | 471 | 0.0007 | 0.1747 |
Top Scoring 6-mer and 7-mer miRNA seeds identified in normalized 3'UTRs. Top 1% and 5% Gene Sets.
| Top 1% 6-mer | Top 1% 7-mer | |||||||||
| 1 | ATCTGG | 0.0002 | 0.0547 | 0.000 | 0.018 | CACTGCC | 0.0011 | 0.4099 | 0.008 | 0.003 |
| 2 | GGTACT | 0.0005 | 0.1811 | 0.018 | 0.001 | ACTATTA | 0.0012 | 0.4328 | 0.000 | 0.019 |
| 3 | AGCACA | 0.0015 | 0.5048 | 0.005 | 0.004 | TCTAGAG | 0.0079 | 1.0000 | 0.021 | 0.006 |
| 4 | CACTTT | 0.0017 | 0.5921 | 0.001 | 0.029 | ATTACAT | 0.0113 | 1.0000 | 0.012 | 0.042 |
| 5 | ACTGCC | 0.0019 | 0.6339 | 0.004 | 0.028 | GTCAACC | 0.0146 | 1.0000 | 0.028 | 0.007 |
| 6 | CTATTA | 0.0025 | 0.8495 | 0.007 | 0.025 | TGTATTA | 0.0175 | 1.0000 | 0.052 | 0.022 |
| 7 | AGTTTT | 0.0035 | 1.0000 | 0.086 | 0.006 | TGGTACT | 0.0214 | 1.0000 | 0.007 | 0.117 |
| 8 | GACACA | 0.0059 | 1.0000 | 0.076 | 0.006 | AAAGGGA | 0.0228 | 1.0000 | 0.294 | 0.000 |
| 9 | AGTCCA | 0.0070 | 1.0000 | 0.027 | 0.006 | AAGCACA | 0.0273 | 1.0000 | 0.012 | 0.049 |
| 10 | AGAGTT | 0.0076 | 1.0000 | 0.218 | 0.001 | GTGTTGA | 0.0279 | 1.0000 | 0.462 | 0.000 |
| Top 5% 6-mer | Top 5% 7-mer | |||||||||
| 1 | ATTATA | 0.0000 | 0.0004 | 0.000 | 0.004 | TGCCTTA | 0.0000 | 0.0056 | 0.000 | 0.000 |
| 2 | GCCTTA | 0.0000 | 0.0031 | 0.000 | 0.002 | ATATGCA | 0.0000 | 0.0126 | 0.000 | 0.001 |
| 3 | TGTTAA | 0.0000 | 0.0038 | 0.000 | 0.000 | TAATAAT | 0.0003 | 0.1053 | 0.002 | 0.000 |
| 4 | TTATAT | 0.0000 | 0.0040 | 0.000 | 0.039 | GATTTTT | 0.0004 | 0.1569 | 0.000 | 0.008 |
| 5 | TGAAGG | 0.0000 | 0.0052 | 0.000 | 0.021 | GTTATAT | 0.0006 | 0.2226 | 0.000 | 0.004 |
| 6 | TAAGCT | 0.0000 | 0.0064 | 0.000 | 0.000 | CCAACTC | 0.0010 | 0.3524 | 0.015 | 0.000 |
| 7 | ACTTCA | 0.0000 | 0.0095 | 0.000 | 0.000 | AATGCAT | 0.0013 | 0.4888 | 0.000 | 0.008 |
| 8 | ATTTCA | 0.0000 | 0.0141 | 0.000 | 0.008 | TCTGATA | 0.0014 | 0.5104 | 0.140 | 0.000 |
| 9 | CATTTG | 0.0000 | 0.0150 | 0.000 | 0.014 | ATTACAT | 0.0014 | 0.5142 | 0.001 | 0.002 |
| 10 | AGTATT | 0.0001 | 0.0197 | 0.000 | 0.062 | TCTGATC | 0.0014 | 0.5186 | 0.000 | 0.107 |
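The table above lists 6-mer and 7-mer seed sequences found in "normalized" 3'UTRs. The record does not say how the normalization was done; one possible reading is that site counts were scaled by 3'UTR length, since the two gene sets differ in 3'UTR length. The sketch below illustrates that reading only: it counts occurrences of a k-mer as written in a set of 3'UTR sequences and reports sites per kilobase. The function names, gene names, and sequences are hypothetical.

```python
from typing import Mapping


def count_sites(seq: str, kmer: str) -> int:
    """Count occurrences of kmer in seq, allowing overlaps."""
    seq, kmer = seq.upper(), kmer.upper()
    return sum(1 for i in range(len(seq) - len(kmer) + 1)
               if seq.startswith(kmer, i))


def sites_per_kb(utrs: Mapping[str, str], kmer: str) -> float:
    """Total k-mer sites per 1000 nt across a set of 3'UTR sequences.

    Length scaling is an assumed normalization, not the paper's stated method.
    """
    total_len = sum(len(s) for s in utrs.values())
    if total_len == 0:
        return 0.0
    total_sites = sum(count_sites(s, kmer) for s in utrs.values())
    return 1000.0 * total_sites / total_len


# Hypothetical toy 3'UTRs, far shorter than real ones
toy_utrs = {"geneA": "AATGCCTTAGG", "geneB": "CCCC"}
print(sites_per_kb(toy_utrs, "GCCTTA"))
```

Comparing `sites_per_kb` between the S- and S+ sets for each candidate seed would give a length-adjusted view of seed enrichment, which matters when one set has systematically longer 3'UTRs.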
Conversion of UniGene IDs to RefSeq mRNAs utilizing D.A.V.I.D.
Promoter sequences
| Step | Top 1% S- | Top 1% S+ | Top 5% S- | Top 5% S+ |
| UniGene | 150 | 150 | 902 | 902 |
| RefSeq mapped | 168 | 192 | 1072 | 1116 |
| RefSeq downloaded | 167 | 167 | 850 | 888 |
| RefSeq unique | 138 | 147 | 735 | 766 |
3'UTR sequences
| Step | Top 1% S- | Top 1% S+ | Top 5% S- | Top 5% S+ |
| UniGene | 150 | 150 | 902 | 902 |
| RefSeq mapped | 168 | 192 | 1072 | 1116 |
| RefSeq downloaded | 165 | 166 | 844 | 887 |
| RefSeq unique | 130 | 145 | 696 | 732 |
S-, genes overexpressed in ER+ tumors; S+, genes overexpressed in ER- tumors