Yupeng Wang, Romdhane Rekaya.
Abstract
Detection of differential gene expression using microarray technology has received considerable interest in cancer research. Recently, many researchers have found that oncogenes may be activated in some but not all samples of a given disease group. Existing statistical tools for detecting genes that are differentially expressed in only a subset of the disease group include cancer outlier profile analysis (COPA), outlier sum (OS), outlier robust t-statistic (ORT) and maximum ordered subset t-statistics (MOST). In this study, another approach, the Least Sum of Ordered Subset Square t-statistic (LSOSS), is proposed. Our simulation studies indicated that LSOSS often has more power than the previous statistical methods. When applied to real human breast and prostate cancer data sets, LSOSS was competitive in terms of the biological relevance of the top-ranked genes. Furthermore, a modified hierarchical clustering method was developed to classify the heterogeneous gene activation patterns of human breast cancer samples based on the significant genes detected by LSOSS. Three classes of gene activation patterns, corresponding to estrogen receptor (ER)+, ER− and a mixture of ER+ and ER−, were detected, and each class was assigned a different gene signature.
Keywords: cancer; differential gene expression; outlier
Year: 2010 PMID: 20703321 PMCID: PMC2918352 DOI: 10.4137/bmi.s5175
Source DB: PubMed Journal: Biomark Insights ISSN: 1177-2719
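The abstract names the ordered-subset idea but does not give a formula. The sketch below is one plausible reading of it, not the authors' implementation: the disease-group values of a single gene are sorted, split into an upper and a lower subset at the point that minimizes the summed within-subset squares, and the upper subset is then compared with the normal group. The function name `ordered_subset_split_score`, the pooled-variance denominator and the scaling by subset size are all assumptions made for illustration.

```python
import numpy as np


def ordered_subset_split_score(normal, disease):
    """Illustrative LSOSS-style score for one gene (assumed form, see lead-in).

    Returns (k, score), where k is the size of the upper disease subset.
    """
    x = np.asarray(normal, dtype=float)
    y = np.sort(np.asarray(disease, dtype=float))[::-1]  # descending order
    n, m = x.size, y.size
    if n < 1 or m < 2:
        raise ValueError("need at least 1 normal and 2 disease samples")

    best_k, best_ss = None, np.inf
    # Try every split of the ordered disease values into upper/lower subsets
    # and keep the split with the least sum of within-subset squares.
    for k in range(1, m):
        upper, lower = y[:k], y[k:]
        ss = ((upper - upper.mean()) ** 2).sum() + ((lower - lower.mean()) ** 2).sum()
        if ss < best_ss:
            best_k, best_ss = k, ss

    # Assumed denominator: pooled standard deviation over the normal group
    # and the two disease subsets (undefined if every value is identical).
    pooled_var = (((x - x.mean()) ** 2).sum() + best_ss) / (n + m - 2)
    upper_mean = y[:best_k].mean()
    score = best_k * (upper_mean - x.mean()) / np.sqrt(pooled_var)
    return best_k, score


# Toy data: three of six disease samples show activation.
k, score = ordered_subset_split_score(
    normal=[0.0, 0.1, -0.1, 0.05, -0.05],
    disease=[0.0, 0.1, -0.1, 5.0, 5.2, 4.8],
)
```

Under these assumptions, genes would be ranked by `score`, so that a gene activated in only part of the disease group can still rank highly.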
Figure 1. ROC curves comparing different statistical methods.
Genes confirmed to be associated with breast cancer among the top 25 ranked genes identified by different cancer outlier detection approaches.
| ATM | IL6 | IL6 | ATM | SLC3A2 | KCNH2 |
| FRAP1 | LCN2 | AGTR1 | ERBB4 | CGA | NEO1 |
| SOD2 | PAK1 | THRA | MUC5B | MAGEA3 | |
| CASC3 | SMARCA4 | CENPB | ENG | ||
| TRADD | HDC | GABRG2 | |||
| CTAG1B | IGFBP5 | ATM | |||
| AGTR1 | FOLR1 | NUP88 | |||
| CASC3 | CKB | CYP3A7 | |||
| PMP22 |
Genes confirmed to be associated with prostate cancer among the top 25 ranked genes identified by different cancer outlier detection approaches.
| UBE2E3 | ELF1 | ELF1 | ELF1 | ELF1 | RB1 |
| BRCA2 | CTCF | CAV2 | RB1 | PAK2 | UBE2E3 |
| CFTR | BMI1 | ||||
| CTCF | BTG2 | ||||
| ELF1 |
Figure 2. Color image for classification of heterogeneous gene activation patterns of human breast cancer.
Classes and biomarkers of heterogeneous gene activation patterns of human breast cancer.
| Involved Samples | 1 (ER+/LN+/Nevins4), | 4 (ER−/LN+/Nevins7), | 8 (ER+/LN+/Nevins13), |
| Gene signatures | 24 (CYP3A7) | 7 (TALDO1) | 27 (GYPA) |
| 35 (P2RX4) | 11 (NEO1) | 45 (CTBP1) | |
| 37 (DHFR) | 13 (RDBP) | 51 (WNT5A) | |
| 38 (UBB) | 20 (ATM) | 57 (PTHLH) | |
| 45 (CTBP1) | 21 (CLEC10A) | 63 (COPS6) | |
| 47 (RAB35) | 33 (SRM) | 69 (DLG3) | |
| 48 (RAC1) | 38 (UBB) | 70 (FZD2) | |
| 53 (SERPINB6) | 39 (APBA2) | 84 (STAT5B) | |
| 61 (ROS1) | 41 (SOX3) | 91 (PTPN1) | |
| 68 (LRRC14) | 45 (CTBP1) | 97 (MAPK14) | |
| 77 (SLC35D1) | 50 (GRK5) | 110 (CALM2) | |
| 80 (HOXB8) | 59 (HRK) | 114 (LPO) | |
| 84 (STAT5B) | 69 (DLG3) | 115 (NPY1R) | |
| 86 (NGF) | 78 (TAX1BP1) | 116 (GPR68) | |
| 97 (MAPK14) | 80 (HOXB8) | 117 (FBP1) | |
| 99 (MNAT1) | 91 (PTPN1) | 120 (ZNF138) | |
| 103 (CYP2D7P1) | 92 (RPL24) | 122 (BRD2) | |
| 105 (MSMB) | 93 (F2RL1) | 135 (TCL6) | |
| 107 (ACOT2) | 97 (MAPK14) | 153 (SLC6A11) | |
| 109 (ERBB3) | 98 (KRTAP5-9) | 162 (SMG1) | |
| 112 (CASP8) | 110 (CALM2) | 166 (POU2F2) | |
| 115 (NPY1R) | 115 (NPY1R) | 168 (UBE2H) | |
| 116 (GPR68) | 116 (GPR68) | 169 (CLPS) | |
| 117 (FBP1) | 120 (ZNF138) | 173 (MMP11) | |
| 118 (THBS4) | 122 (BRD2) | 182 (CTRL) | |
| 122 (BRD2) | 125 (KRR1) | 187 (NDST1) | |
| 125 (KRR1) | 133 (PKLR) | 191 (ESR1) | |
| 128 (SLC39A6) | 144 (ADAM3B) | 194 (FMO1) | |
| 133 (PKLR) | 146 (ERG) | 197 (ADH6) | |
| 138 (C11orf58) | 148 (MYOD1) | 210 (ICAM3) | |
| 151 (MDS1) | 151 (MDS1) | 216 (IRF7) | |
| 157 (PSMC5) | 156 (SMPD1) | 221 (NA) | |
| 164 (RPL26) | 158 (SFTPD) | 225 (ASGR2) | |
| 165 (RPL34) | 165 (RPL34) | ||
| 169 (CLPS) | 169 (CLPS) | ||
| 172 (TCEAL1) | 177 (PPA2) | ||
| 183 (GYPE) | 182 (CTRL) | ||
| 185 (SEMA3F) | 192 (BCL2L1) | ||
| 186 (CYFIP2) | 202 (GNB2L1) | ||
| 187 (NDST1) | 210 (ICAM3) | ||
| 191 (ESR1) | 211 (FGFR2) | ||
| 197 (ADH6) | 212 (IL8RB) | ||
| 198 (BRD2) | 228 (KRT4) | ||
| 210 (ICAM3) | |||
| 214 (COX6C) | |||
| 215 (APBB2) | |||
| 216 (IRF7) | |||
| 221 (NA) | |||
Notes:
Involved samples are shown in the format of “sample index (sample name)”;
Gene signatures are shown in the format of “gene ranking (gene symbol)”.
Classification of the cancer samples lacking significant common outliers.
| Sample | ER+ | ER− | Mixture | Assigned class |
| 11 (ER+/LN+/Nevins40) | 0.328 | 0.326 | 0.346 | Mixture |
| 16 (ER−/LN+/Nevins99) | 0.329 | 0.332 | 0.339 | Mixture |
| 17 (ER+/LN+/Marks205) | 0.348 | 0.328 | 0.324 | ER+ |
| 20 (ER+/LN+/Marks208) | 0.333 | 0.304 | 0.363 | Mixture |
| 21 (ER−/LN+/Marks214) | 0.307 | 0.362 | 0.331 | ER− |
| 22 (ER−/LN+/Marks215) | 0.337 | 0.288 | 0.375 | Mixture |
| 23 (ER−/LN+/Marks216) | 0.357 | 0.261 | 0.382 | Mixture |
| 24 (ER−/LN+/Marks217) | 0.300 | 0.304 | 0.396 | Mixture |
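The assignments in the table above are consistent with picking, for each sample, the class whose score is largest. The sketch below reproduces that reading; it is an inference from the numbers, not a rule stated in the source. The column order (ER+, ER-, Mixture) and the helper name `assign_class` are assumptions made for illustration, and the scores are copied from the table.

```python
# Assumed column order for the three score columns (inferred, not stated).
CLASS_LABELS = ("ER+", "ER-", "Mixture")

# Scores copied from the table, keyed by sample index.
TABLE_SCORES = {
    11: (0.328, 0.326, 0.346),
    16: (0.329, 0.332, 0.339),
    17: (0.348, 0.328, 0.324),
    20: (0.333, 0.304, 0.363),
    21: (0.307, 0.362, 0.331),
    22: (0.337, 0.288, 0.375),
    23: (0.357, 0.261, 0.382),
    24: (0.300, 0.304, 0.396),
}


def assign_class(scores, labels=CLASS_LABELS):
    """Return the label whose score is largest (first label wins on ties)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


assigned = {sample: assign_class(s) for sample, s in TABLE_SCORES.items()}
```

Several samples have nearly equal scores across the three classes (sample 11, for example), so a largest-score rule of this kind separates them only by a small margin.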