Abstract
BACKGROUND: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced to the gene level, a long-standing research problem is to identify specific gene expression patterns linked to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, recognizing the disease-related gene expression patterns embedded in microarray data remains challenging. In selecting a small set of biologically significant genes for classifier design, the high dimensionality inherent in the data creates a substantial amount of uncertainty.
Year: 2005 PMID: 15784140 PMCID: PMC1274261 DOI: 10.1186/1471-2105-6-67
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Genes selected by our method on the microarray dataset of small round blue-cell tumors (SRBCTs). Genes also selected by the methods of Tibshirani et al. [13] and Khan et al. [10] are marked with • in the respective columns.
| Image ID | | Gene Description | Tibshirani et al. | Khan et al. |
| 21652 | 2.3 × 10⁻⁵ | catenin (cadherin-associated protein), alpha 1 | • | • |
| 878280 | 2.3 × 10⁻⁵ | collapsin response mediator protein 1 | • | |
| 377461 | < 0.000001 | caveolin 1, caveolae protein | • | • |
| 325182 | 2.3 × 10⁻⁵ | cadherin 2, N-cadherin (neuronal) | • | • |
| 1435862 | 0.02 | MIC2 surface antigen (CD99) | • | • |
| 42558 | 0.02 | L-arginine:glycine amidinotransferase | • | • |
| 812105 | < 0.000001 | transmembrane protein | • | • |
| 41591 | < 0.000001 | meningioma 1 | • | • |
| 810057 | < 0.000001 | cold shock domain protein A | • | |
| 183337 | 0.02 | major histocompatibility complex, class II, DM alpha | • | • |
| 796258 | < 0.000001 | sarcoglycan, alpha | • | • |
| 1409509 | 0.02 | troponin T1, skeletal, slow | • | • |
| 788107 | < 0.000001 | amphiphysin-like | • | |
| 770394 | < 0.000001 | Fc fragment of IgG, receptor, transporter, alpha | • | • |
| 82225 | 0.02 | secreted frizzled-related protein 1 | • | |
| 814260 | < 0.000001 | follicular lymphoma variant translocation 1 | • | • |
| 784224 | < 0.000001 | fibroblast growth factor receptor 4 | • | • |
| 308163 | 2.3 × 10⁻⁵ | ESTs | • | • |
| 212542 | < 0.000001 | cDNA DKFZp586J2118 | • | • |
Figure 1. The gene expression map of the 19 genes selected by our method in the domain concerning classification of SRBCTs. The map was generated by Eisen's hierarchical clustering program CLUSTER and viewed with the TREEVIEW program. Four sample clusters are visually recognizable, corresponding exactly to the four predefined tumor classes (NB, EWS, BL, and RMS) with 100% accuracy.
15 genes selected from the colon cancer microarray data set (62 samples) using our method.
| Gene Accession # | | Definition |
| H20709 | < 0.000001 | myosin light chain alkali, smooth-muscle isoform |
| X57351 | < 0.000001 | interferon-inducible protein 1-8D |
| T94579 | < 0.000001 | human chitotriosidase precursor |
| T47377 | < 0.000001 | S-100P protein (human) |
| T98835 | < 0.000001 | alpha trans-inducing protein (bovine herpesvirus type 1) |
| T61661 | 3.0 × 10⁻⁵ | profilin I (human) |
| X67325 | 3.0 × 10⁻⁵ | H. sapiens p27 |
| T58861 | 0.02 | 60s ribosomal protein L30E |
| T61446 | 0.02 | putative DNA binding protein A20 |
| H88360 | 0.02 | guanine nucleotide-binding protein G(OLF), alpha subunit |
| L38810 | 0.02 | Homo sapiens thyroid receptor interactor (TRIP1) |
| T57882 | 0.02 | myosin heavy chain, nonmuscle type A |
| T92451 | 0.02 | tropomyosin, fibroblast and epithelial muscle-type |
| J02854 | 0.02 | myosin regulatory light chain 2, smooth muscle isoform |
| K03474 | 0.02 | human mullerian inhibiting substance gene |
Figure 2. The gene expression map of the 15 genes selected from the colon cancer microarray data set using our method. Two major sample clusters can be recognized by visual inspection, corresponding to normal and cancer tissue samples, respectively.
Diagnosis results of the colon cancer data samples based on 15 selected genes, in correspondence with the gene expression map.
| Normal Tissue | Diagnosis | Cancer Tissue | Diagnosis |
| Normal-01 | normal | Cancer-01 | cancer |
| Normal-02 | normal | Cancer-02 | normal |
| Normal-03 | normal | Cancer-03 | cancer |
| Normal-04 | normal | Cancer-04 | cancer |
| Normal-05 | normal | Cancer-05 | cancer |
| Normal-06 | normal | Cancer-06 | cancer |
| Normal-07 | normal | Cancer-07 | cancer |
| Normal-08 | cancer | Cancer-08 | cancer |
| Normal-09 | normal | Cancer-09 | cancer |
| Normal-10 | normal | Cancer-10 | cancer |
| Normal-11 | normal | Cancer-11 | cancer |
| Normal-12 | normal | Cancer-12 | cancer |
| Normal-13 | normal | Cancer-13 | cancer |
| Normal-14 | normal | Cancer-14 | cancer |
| Normal-15 | normal | Cancer-15 | cancer |
| Normal-16 | normal | Cancer-16 | cancer |
| Normal-17 | normal | Cancer-17 | cancer |
| Normal-18 | normal | Cancer-18 | cancer |
| Normal-19 | normal | Cancer-19 | cancer |
| Normal-20 | cancer | Cancer-20 | cancer |
| Normal-21 | normal | Cancer-21 | cancer |
| Normal-22 | normal | Cancer-22 | cancer |
| | | Cancer-23 | cancer |
| | | Cancer-24 | cancer |
| | | Cancer-25 | cancer |
| | | Cancer-26 | cancer |
| | | Cancer-27 | cancer |
| | | Cancer-28 | normal |
| | | Cancer-29 | cancer |
| | | Cancer-30 | normal |
| | | Cancer-31 | cancer |
| | | Cancer-32 | cancer |
| | | Cancer-33 | cancer |
| | | Cancer-34 | cancer |
| | | Cancer-35 | cancer |
| | | Cancer-36 | normal |
| | | Cancer-37 | cancer |
| | | Cancer-38 | cancer |
| | | Cancer-39 | cancer |
| | | Cancer-40 | cancer |
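The diagnosis table can be condensed into a confusion matrix. The following is a minimal sketch, not part of the original analysis: it tallies the table entries as transcribed here, treats cancer as the positive class (an assumption), and uses illustrative names such as `confusion_counts` and `summarize`.

```python
# Counts transcribed from the diagnosis table: 22 normal and 40 cancer samples.
N_NORMAL, N_CANCER = 22, 40

# Sample numbers whose diagnosis disagrees with the true tissue type.
NORMAL_CALLED_CANCER = {8, 20}           # Normal-08, Normal-20
CANCER_CALLED_NORMAL = {2, 28, 30, 36}   # Cancer-02, Cancer-28, Cancer-30, Cancer-36


def confusion_counts():
    """Return (tp, fn, fp, tn), with cancer as the positive class (assumption)."""
    fp = len(NORMAL_CALLED_CANCER)   # normal tissue diagnosed as cancer
    fn = len(CANCER_CALLED_NORMAL)   # cancer tissue diagnosed as normal
    tp = N_CANCER - fn
    tn = N_NORMAL - fp
    return tp, fn, fp, tn


def summarize():
    """Compute accuracy, sensitivity, and specificity from the counts."""
    tp, fn, fp, tn = confusion_counts()
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, 6 of the 62 samples are misdiagnosed (2 normal, 4 cancer), which corresponds to an accuracy of 56/62, or about 90.3%.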
Genes selected by our method on the leukemia microarray dataset. Genes also selected by the methods of Golub et al. [1] and SVM-RFE (the reference algorithm) are marked with • in the respective columns.
| Access Number | | Gene Description | Golub et al. | SVM-RFE |
| M27891 | < 0.000001 | CST3 Cystatin C | • | • |
| Y00787 | < 0.000001 | INTERLEUKIN-8 PRECURSOR | • | • |
| M19507 | 0.006 | MPO Myeloperoxidase | • | |
| L20688 | 0.006 | Ly-GDI |