Li Li, Hongmei Chen, Chang Liu, Fang Wang, Fangfang Zhang, Lihua Bai, Yihan Chen, Luying Peng.
Abstract
Microarray data are high-dimensional, with a high noise ratio and a relatively small sample size, which makes it challenging to use them to identify candidate disease genes. Here, we present a hybrid method that combines an estimation of distribution algorithm (EDA) with a support vector machine (SVM) to select key feature genes. We benchmarked the method on microarray data from both diffuse large B cell lymphoma (DLBCL) and colon cancer to demonstrate its performance in identifying key features from high-dimensional gene expression profiles. The method was compared with a probabilistic model-building genetic algorithm (PMBGA) and with another hybrid method that combines a genetic algorithm with a support vector machine (GA-SVM). The results showed that the proposed method provides a new computational strategy for finding candidate disease genes in disease gene expression profiles. The selected candidate disease genes may help to improve the diagnosis and treatment of diseases.
Year: 2013 PMID: 23476131 PMCID: PMC3582165 DOI: 10.1155/2013/393570
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Figure 1: The main flow of the EDA-SVM algorithm. M, D, G, and eval denote the gene expression profile matrix, population, gene subset, and evaluation index, respectively.
Algorithm 1: The step-by-step recipe for the computational algorithm of the EDA-SVM approach.
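The body of Algorithm 1 is not reproduced in this record, so the following is only a minimal sketch of how an EDA-driven wrapper for gene selection could be organized, using the symbols from the Figure 1 caption (M as `X`, D as `population`, G as a list of gene indices, eval as `evaluate`). It rests on several assumptions that are not from the source: a univariate marginal model (UMDA-style) as the estimated distribution, a toy synthetic data set, and a leave-one-out nearest-centroid accuracy standing in for the SVM accuracy so that the example has no dependencies. All function names and parameter values are hypothetical.

```python
import random


def make_toy_data(n_samples=30, n_genes=30, informative=(0, 1, 2), seed=0):
    """Synthetic two-class matrix M; only the `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # shift informative genes in class 1
        X.append(row)
        y.append(label)
    return X, y


def loo_centroid_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on gene subset G.

    Assumption: this is a dependency-free stand-in for the SVM-based
    evaluation index (eval); it is not the classifier used in the paper.
    """
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums = {0: [0.0] * len(genes), 1: [0.0] * len(genes)}
        counts = {0: 0, 1: 0}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            counts[label] += 1
            for k, g in enumerate(genes):
                sums[label][k] += row[g]
        dist = {}
        for label in (0, 1):
            dist[label] = sum(
                (X[i][g] - sums[label][k] / counts[label]) ** 2
                for k, g in enumerate(genes)
            )
        pred = 0 if dist[0] <= dist[1] else 1
        correct += pred == y[i]
    return correct / len(X)


def eda_select(X, y, evaluate, pop_size=30, n_keep=10, n_iter=10,
               init_p=0.15, seed=1):
    """UMDA-style loop: sample subsets, score them, re-estimate the model."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    p = [init_p] * n_genes  # marginal probability of selecting each gene
    best_genes, best_score = [], -1.0
    for _ in range(n_iter):
        # Sample population D of gene subsets G from the current model.
        population = [
            [g for g in range(n_genes) if rng.random() < p[g]]
            for _ in range(pop_size)
        ]
        scored = sorted(
            ((evaluate(X, y, G), G) for G in population),
            key=lambda t: t[0],
            reverse=True,
        )
        if scored[0][0] > best_score:
            best_score, best_genes = scored[0]
        # Re-estimate marginals from the best subsets, clamped so that
        # no gene becomes impossible or mandatory to sample.
        elite = [G for _, G in scored[:n_keep]]
        for g in range(n_genes):
            freq = sum(g in G for G in elite) / n_keep
            p[g] = min(0.95, max(0.05, freq))
    return best_genes, best_score, p


X, y = make_toy_data()
best_genes, best_score, p = eda_select(X, y, loo_centroid_accuracy)
print(sorted(best_genes), round(best_score, 2))
```

To move this sketch closer to the setting described in the abstract, `evaluate` could be replaced with a function that trains an SVM on the selected columns and returns its cross-validated accuracy; the sampling and model-update steps would stay the same.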
Figure 2: Changes in the accuracy of the SVM classifier (a) and in the number of support vectors (b) over iterations for EDA-SVM, GA-SVM, and PMBGA on the DLBCL data set.
Figure 3: Changes in the accuracy of the SVM classifier (a) and in the number of support vectors (b) over iterations for EDA-SVM, GA-SVM, and PMBGA on the colon data set.
The GO annotations of EDA-SVM feature genes.
| Gene name | Biological process | Cellular component | Molecular function |
|---|---|---|---|
| SPIB | GO:0006350 Transcription | GO:0005634 Nucleus | GO:0003700 Transcription factor activity |
| | GO:0006357 Regulation of transcription from RNA polymerase II promoter | GO:0005737 Cytoplasm | GO:0003702 RNA polymerase II transcription factor activity |
| IRF8 | GO:0000122 Negative regulation of transcription from RNA polymerase II promoter | GO:0005634 Nucleus | GO:0003705 RNA polymerase II transcription factor activity, enhancer binding |
| | GO:0006355 Regulation of transcription, DNA-dependent | | |
| | GO:0006350 Transcription | | |
| | GO:0006955 Immune response | | |
| NFKB2 | GO:0006355 Regulation of transcription, DNA-dependent | GO:0005634 Nucleus | GO:0005515 Protein binding |
| | GO:0007165 Signal transduction | GO:0005737 Cytoplasm | GO:0003713 Transcription coactivator activity |
| | | | GO:0003700 Transcription factor activity |
| LMO2 | GO:0008270 Development | GO:0005634 Nucleus | GO:0008270 Zinc ion binding |
| | | | GO:0005515 Protein binding |
| | | | GO:0046872 Metal ion binding |
| FCGRT | GO:0019882 Antigen presentation | GO:0042612 MHC class I protein complex | GO:0019864 IgG binding |
| | GO:0007565 Pregnancy | GO:0016021 Integral to membrane | GO:0004872 Receptor activity |
| | GO:0006955 Immune response | | GO:0030106 MHC class I receptor activity |
| BCL7B | Unknown | Unknown | GO:0003779 Actin binding |