Wei Xiong, Zhibin Cai, Jinwen Ma.
Abstract
Microarray-based tumor diagnosis is an active topic in bioinformatics. One of the key problems is the discovery and analysis of the informative genes of a tumor. Although many elaborate approaches to this problem exist, it is still difficult to select a reasonable set of informative genes for tumor diagnosis from microarray data alone. In this paper, we group the genes measured by microarray into a number of clusters via the distance sensitive rival penalized competitive learning (DSRPCL) algorithm and then detect the informative gene cluster, or set, with the help of a support vector machine (SVM). Moreover, the critical, or powerful, informative genes can be found by repeating the clustering and detection on the informative gene clusters obtained. Experiments on the colon, leukemia, and breast cancer datasets demonstrate that the proposed DSRPCL-SVM approach leads to a reasonable selection of informative genes for tumor diagnosis.
Year: 2008 PMID: 18973864 PMCID: PMC5054100 DOI: 10.1016/S1672-0229(08)60023-6
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
The DSRPCL algorithm and its variants
[Algorithm table: random initialization of the vector, followed by the update rules of batch DSRPCL, DSRPCL1, DSRPCL2, and SARPCL. The formulas and conditions are not preserved in this record.]
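The general idea behind rival penalized competitive learning can be sketched as follows: for each sample, the nearest center (the winner) moves toward the sample, while the other centers (the rivals) are pushed away. This is only a minimal illustration, not the DSRPCL update from the paper, whose formulas are not available here. The function name `rpcl_like_cluster`, the parameters `lr` and `penalty`, and in particular the penalty form `penalty / (1 + d)` are assumptions chosen to keep the example bounded and simple.

```python
import random


def rpcl_like_cluster(points, k, lr=0.05, penalty=0.005, epochs=20, seed=0):
    """Winner-learns / rivals-are-repelled clustering (illustrative stand-in)."""
    rng = random.Random(seed)
    # Start from k distinct samples as initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(epochs):
        order = list(points)
        rng.shuffle(order)
        for x in order:
            # Squared Euclidean distance from the sample to every center.
            dist = [sum((a - c) ** 2 for a, c in zip(x, ctr)) for ctr in centers]
            winner = min(range(k), key=dist.__getitem__)
            for i, ctr in enumerate(centers):
                if i == winner:
                    step = lr  # winner is attracted to the sample
                else:
                    # ASSUMED penalty form: weaker repulsion for distant rivals.
                    step = -penalty / (1.0 + dist[i])
                centers[i] = [c + step * (a - c) for a, c in zip(x, ctr)]
    # Assign every point to its nearest final center.
    labels = [
        min(range(k), key=lambda i: sum((a - c) ** 2 for a, c in zip(x, centers[i])))
        for x in points
    ]
    return labels, centers


# Toy data: two well-separated groups, deliberately clustered with k = 3.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
labels, centers = rpcl_like_cluster(points, k=3)
print(labels)
```

In the setting of this record, each point would be one gene's expression profile across the samples, so the clusters are groups of genes rather than groups of patients.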
Experimental results of the first DSRPCL-SVM procedure on three datasets
| Dataset | Gene cluster | No. of genes | Linear SVM accuracy | Poly SVM accuracy | RBF SVM accuracy |
|---|---|---|---|---|---|
| Colon cancer | 1 | 381 | 0.8889 | 0.9444 | 0.3889 |
| | 2 | 182 | 0.7778 | 0.5556 | 0.3889 |
| | 3 | 385 | 0.8889 | 0.8889 | 0.5000 |
| | 4 (optimal) | 435 | 0.9444 | 0.8889 | 1 |
| | 5 | 418 | 0.8333 | 0.9444 | 0.8889 |
| Leukemia | 1 | 2,769 | 0.9545 | 0.9545 | 0.8182 |
| | 2 | 1,939 | 0.8182 | 0.7727 | 0.7273 |
| | 3 (optimal) | 1,708 | 0.9545 | 0.9545 | 0.9545 |
| | 4 | 1 | 0.1818 | 0 | 0 |
| Breast cancer | 1 | 3 | 0.6667 | 0.5556 | 0.5185 |
| | 2 | 2 | 0.4444 | 0.5556 | 0.5185 |
| | 3 (optimal) | 1,580 | 1 | 1 | 1 |
| | 4 | 926 | 1 | 1 | 1 |
| | 5 | 771 | 1 | 1 | 1 |
| | 6 | 486 | 1 | 1 | 1 |
| | 7 | 175 | 1 | 1 | 1 |
| | 8 | 672 | 1 | 1 | 1 |
| | 9 | 584 | 1 | 1 | 1 |
Experimental results of the second DSRPCL-SVM procedure on three datasets
| Dataset | Gene cluster | No. of genes | Linear SVM accuracy | Poly SVM accuracy | RBF SVM accuracy |
|---|---|---|---|---|---|
| Colon cancer | 1 | 157 | 0.5556 | 0.0556 | 0.8333 |
| | 2 (optimal) | 145 | 0.8889 | 0.7222 | 0.8889 |
| | 3 | 37 | 0.6111 | 0 | 0.8333 |
| | 4 | 96 | 0.9444 | 0.2222 | 0.8333 |
| Leukemia | 1 (optimal) | 959 | 0.9545 | 0.9545 | 0.9545 |
| | 2 | 747 | 0.9091 | 0.8636 | 0.9091 |
| | 3 | 1 | 0.3636 | 0.3182 | 0.3182 |
| | 4 | 1 | 0.8636 | 0.7727 | 0.7727 |
| Breast cancer | 1 | 70 | 1 | 0.9259 | 0.5185 |
| | 2 | 152 | 1 | 1 | 0.6296 |
| | 3 | 4 | 0.5926 | 0.5185 | 0.5185 |
| | 4 | 93 | 1 | 1 | 0.6296 |
| | 5 | 187 | 1 | 1 | 0.5926 |
| | 6 | 82 | 1 | 1 | 0.5185 |
| | 7 (optimal) | 632 | 1 | 1 | 0.8148 |
| | 8 | 62 | 1 | 1 | 0.5185 |
| | 9 | 298 | 1 | 1 | 0.6296 |
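The table marks one cluster per dataset as optimal, but this record does not state the selection rule. As a hedged illustration, the sketch below checks one candidate rule, the highest mean accuracy over the three kernels, against the colon cancer rows above. The function names and the rule itself are assumptions, not the authors' stated criterion.

```python
# Accuracies (linear, poly, RBF) copied from the colon cancer rows
# of the second DSRPCL-SVM procedure.
colon_second = {
    1: (0.5556, 0.0556, 0.8333),
    2: (0.8889, 0.7222, 0.8889),
    3: (0.6111, 0.0, 0.8333),
    4: (0.9444, 0.2222, 0.8333),
}


def pick_by_mean(acc):
    """Assumed rule: cluster with the highest mean accuracy over all kernels."""
    return max(acc, key=lambda c: sum(acc[c]) / len(acc[c]))


def pick_by_best_kernel(acc):
    """Alternative rule: cluster with the highest single-kernel accuracy."""
    return max(acc, key=lambda c: max(acc[c]))


print(pick_by_mean(colon_second))         # 2
print(pick_by_best_kernel(colon_second))  # 4
```

Under the assumed mean rule, cluster 2 is chosen, which matches the row marked optimal. Picking by the single best kernel accuracy would choose cluster 4 instead, so that simpler rule does not reproduce the table.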
Experimental results of the successive DSRPCL-SVM procedures on three datasets
| Dataset | Subdivision | Highest accuracy | Size of optimal gene cluster |
|---|---|---|---|
| Colon cancer | 1 | 1 | 435 |
| | 2 | 0.8889 | 145 |
| | 3 | 0.7778 | 61 |
| | 4 | 0.8333 | 24 |
| | 5 | 0.7222 | 6 |
| | 6 | 0.7222 | 6 |
| Leukemia | 1 | 0.9545 | 1,708 |
| | 2 | 0.9545 | 959 |
| | 3 | 0.9545 | 479 |
| | 4 | 0.9545 | 479 |
| | 5 | 0.9545 | 271 |
| | 6 | 0.9091 | 104 |
| | 7 | 0.9091 | 31 |
| | 8 | 0.9091 | 31 |
| | 9 | 0.8636 | 5 |
| | 10 | 0.8636 | 5 |
| Breast cancer | 1 | 1 | 1,580 |
| | 2 | 1 | 632 |
| | 3 | 1 | 94 |
| | 4 | 0.8519 | 25 |
| | 5 | 0.7037 | 11 |
| | 6 | 0.7037 | 11 |
| | 7 | 0.6296 | 4 |
| | 8 | 0.6296 | 4 |
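The successive procedure can be read as a loop: cluster the current gene set, keep the best-scoring cluster, and repeat on that cluster. The sketch below is a minimal illustration under stated assumptions. The callables `split` (standing in for the clustering step) and `score` (standing in for the SVM accuracy), and the stopping rule "stop once the best cluster is no smaller than its parent", are hypothetical. The stopping rule is inferred only from the pattern that the last two rows for each dataset share the same cluster size.

```python
def successive_selection(genes, split, score, max_rounds=20):
    """Repeatedly subdivide the best cluster; return it with the per-round history."""
    history = []  # (score, size) of the best cluster in each round
    current = list(genes)
    for _ in range(max_rounds):
        clusters = [c for c in split(current) if c]  # drop empty clusters
        best = max(clusters, key=score)
        history.append((score(best), len(best)))
        if len(best) == len(current):
            break  # ASSUMED stopping rule: the subdivision no longer shrinks
        current = best
    return current, history


# Toy stand-ins: halve the list, and score a cluster by its share of "marker" genes.
markers = {3, 5}


def halve(genes):
    if len(genes) <= 2:
        return [genes]
    mid = len(genes) // 2
    return [genes[:mid], genes[mid:]]


def marker_share(cluster):
    return sum(1 for g in cluster if g in markers) / len(cluster)


final, history = successive_selection(list(range(16)), halve, marker_share)
print(final)
print([size for _, size in history])
```

In this toy run the cluster sizes shrink round by round and the last size repeats once before the loop stops, mirroring the shape of the table rather than its values.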
Identity numbers of the powerful genes for the three datasets
| Dataset | Powerful gene ID No. |
|---|---|
| Colon cancer | 211, 1215, 1394, 1621, 1858, 1865 |
| Leukemia | 331, 569, 787, 2281, 4586 |
| Breast cancer | 383, 385, 5294, 5797 |