Azadeh Mohammadi, Mohammad H Saraee, Mansoor Salehi.
Abstract
BACKGROUND: One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide a small quantity of samples with respect to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes.
Year: 2011 PMID: 21269461 PMCID: PMC3037837 DOI: 10.1186/1755-8794-4-12
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Comparison of different gene selection methods based on accuracy
| Dataset | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Colon | 86.1 | 86.8 | 89.9 | 91.6 | 89.7 | 91.8 | 93.3 | 91.2 | 93.2 | 94.7 |
| DLBCL | 89.2 | 89.6 | 91.9 | 93.7 | 92.5 | 94.4 | 95.8 | 93.6 | 95.4 | 96.8 |
| Prostate | 80.7 | 91.1 | 92.8 | 93.1 | 92.2 | 93.8 | 94.2 | 93.8 | 95.1 | 95.9 |
M1 = No-sel: classification without gene selection
M2 = Fisher: classification after gene selection with the Fisher criterion
M3 = Fisher-R: classification after gene selection with the Fisher criterion and greedy redundancy reduction
M4 = Fisher-RG: classification after gene selection with the Fisher criterion and greedy redundancy reduction that considers Gene Ontology information
M5 = SVMRFE: classification after gene selection with the SVMRFE algorithm
M6 = SVMRFE-R: classification after gene selection with the SVMRFE algorithm and greedy redundancy reduction
M7 = SVMRFE-RG: classification after gene selection with the SVMRFE algorithm and greedy redundancy reduction that considers Gene Ontology information
M8 = Fisher-SVMRFE: classification after gene selection with a combination of the Fisher criterion and the SVMRFE algorithm
M9 = Fisher-R-SVMRFE: classification after gene selection with the proposed framework, without Gene Ontology information
M10 = Fisher-RG-SVMRFE: classification after gene selection with the proposed framework
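The Fisher ranking and the greedy redundancy step named in the legend can be illustrated with a minimal sketch. This record does not give the paper's exact formulas, so everything here is an assumption: the two-class Fisher score is taken as the squared difference of class means divided by the sum of class variances, redundancy is measured by absolute Pearson correlation, and the names `fisher_score`, `pearson`, `select_genes`, `top_k` and `max_corr` are hypothetical. The expression values are toy numbers, not data from the study, and the Gene Ontology and SVMRFE stages are not shown.

```python
from math import sqrt
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score for one gene (assumed form):
    (mean_1 - mean_0)^2 / (var_1 + var_0)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # No within-class variation: perfectly separating or uninformative.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return cov / norm if norm else 0.0


def select_genes(expr, labels, top_k, max_corr=0.9):
    """Keep the top_k genes by Fisher score, then greedily drop any gene
    whose |correlation| with an already kept gene reaches max_corr."""
    ranked = sorted(expr, key=lambda g: fisher_score(expr[g], labels), reverse=True)
    kept = []
    for gene in ranked[:top_k]:
        if all(abs(pearson(expr[gene], expr[k])) < max_corr for k in kept):
            kept.append(gene)
    return kept


# Toy data (illustrative only): 3 samples per class, 3 hypothetical genes.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "g1": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],  # informative
    "g2": [2.0, 2.6, 1.4, 6.0, 6.6, 5.4],  # informative but nearly a copy of g1
    "g3": [5.0, 4.0, 6.0, 5.5, 4.5, 5.0],  # same mean in both classes
}
selected = select_genes(expr, labels, top_k=3)
```

In this toy run `g2` is discarded because it is almost perfectly correlated with the higher-ranked `g1`. The correlation threshold is an arbitrary choice for the sketch, not a value taken from the paper.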
Comparison of different gene selection methods based on sensitivity
| Dataset | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Colon | 82.1 | 81.9 | 84.9 | 86.6 | 84.5 | 86.8 | 88.4 | 86.4 | 88.5 | 90.1 |
| DLBCL | 85.3 | 83.4 | 87.2 | 89.0 | 87.1 | 89.2 | 91.0 | 88.3 | 91.1 | 92.6 |
| Prostate | 82.1 | 83.1 | 88.9 | 89.4 | 88.4 | 91.7 | 92.3 | 90.1 | 92.4 | 93.5 |
Comparison of different gene selection methods based on specificity
| Dataset | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Colon | 80.2 | 78.9 | 81.8 | 83.6 | 81.6 | 83.7 | 85.3 | 83.4 | 85.4 | 87.0 |
| DLBCL | 83.4 | 82.8 | 85.7 | 88.7 | 85.5 | 87.9 | 90.7 | 87.3 | 89.4 | 91.5 |
| Prostate | 81.8 | 82.9 | 88.8 | 89.1 | 88.6 | 92.3 | 93.1 | 89.9 | 93.2 | 94.1 |
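For readers unfamiliar with the three measures tabulated above, the following sketch shows the standard definitions of accuracy, sensitivity and specificity computed from confusion-matrix counts. The function name `classification_metrics` and the counts in the usage lines are hypothetical. How the paper averaged these measures across validation runs is not stated in this record.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity as percentages.

    tp/fn: diseased samples classified correctly/incorrectly
    tn/fp: healthy samples classified correctly/incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),  # true positive rate
        "specificity": 100.0 * tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=45, fn=5, tn=40, fp=10)
```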
Number of selected genes in different gene selection methods
| Dataset | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Colon | 2000 | 67 | 27 | 23 | 27 | 25 | 23 | 26 | 15 | 12 |
| DLBCL | 4026 | 56 | 29 | 28 | 36 | 25 | 23 | 27 | 18 | 15 |
| Prostate | 12600 | 102 | 32 | 28 | 82 | 31 | 27 | 56 | 21 | 14 |
Results reported for the colon cancer dataset in other papers
| Ref | [ | [ | [ | [ | [ | [ | [ |
|---|---|---|---|---|---|---|---|
| Accuracy | 98.0 | 91.9 | 93.0 | 92.0 | 93.6 | 90.3 | 88.8 |
| Number of selected genes | 4 | 3 | 15 | - | 10 | - | 10 |
Results reported for the DLBCL cancer dataset in other papers
| Ref | [ | [ | [ | [ |
|---|---|---|---|---|
| Accuracy | 93.8 | 99.0 | 95.0 | 93.0 |
| Number of selected genes | 10 | 7 | 5 | 14 |
Genes selected for colon cancer using the proposed framework
| Gene name |
|---|
| Human putative NDP kinase (nm23-H2S) mRNA + |
| Human p58 natural killer cell receptor precursor mRNA |
| Human cysteine-rich protein (CRP) gene, exons 5 and 6* |
| Complement factor D precursor (H. sapiens) +* |
| Human cell adhesion molecule (CD44) mRNA* |
| H. sapiens mRNA for GCAP-II/uroguanylin precursor* |
| Collagen alpha 2(XI) chain (H. sapiens) +* |
| INTEGRIN ALPHA-6 PRECURSOR (Homo sapiens) |
| Human desmin gene +* |
| Myosin regulatory light chain 2 +* |
| TRANSCRIPTION FACTOR ATF-A AND ATF-A-DELTA (Homo sapiens) |
| LEUKOCYTE ANTIGEN CD37 (Homo sapiens) + |
Genes selected for DLBCL cancer using the proposed framework
| Gene name |
|---|
| JAW1 = lymphoid-restricted membrane protein* |
| E2F-3 = pRB-binding transcription factor + |
| erk3 = extracellular signal-regulated kinase 3 |
| JNK3 = Stress-activated protein kinase + |
| Unknown UG Hs.120716 ESTs* |
| Unknown UG Hs.136345 ESTs* |
| myosin-IC |
| receptor r - 1 BB ligand |
| Unknown UG Hs.105261 EST* |
| Receptor protein-tyrosine kinase |
| thymosin beta-4 |
| Unknown UG Hs.169565 ESTs* |
| Id1 = Inhibitor of DNA binding 1, dominant negative helix-loop-helix protein |
| Unknown UG Hs.124922 ESTs* |
| Unknown Hs.33431 ESTs |