Abstract
Genomic microarrays are powerful research tools in bioinformatics and modern medicinal research because they enable massively parallel assays and simultaneous monitoring of the expression of thousands of genes in biological samples. However, even a simple microarray experiment often produces very high-dimensional data, and the sheer volume of data challenges researchers to extract the important features and reduce the dimensionality. In this paper, a kernel method for nonlinear dimensionality reduction based on locally linear embedding (LLE) is proposed, and a fuzzy K-nearest neighbors algorithm, which denoises datasets, is introduced as a replacement for the K-nearest neighbors (KNN) step of classical LLE. In addition, a kernel-based support vector machine (SVM) is used to classify genomic microarray data sets. We demonstrate the application of these techniques to two published DNA microarray data sets. The experimental results confirm the superiority and high success rates of the presented method.
Keywords: Dimensionality reduction; Kernel methods; Locally linear embedding; Manifold learning; Support vector machine.
Year: 2008 PMID: 27879930 PMCID: PMC3697169 DOI: 10.3390/s8074186
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
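The pipeline outlined in the abstract (reduce the dimensionality with LLE, then classify with a kernel SVM) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses scikit-learn's standard `LocallyLinearEmbedding` with ordinary Euclidean nearest neighbors, not the kernel LLE or the fuzzy K-nearest neighbors step proposed in the paper, and it runs on synthetic data rather than the SRBCT or lymphoma sets. The sample count, neighbor count, target dimension, and SVM settings are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a microarray matrix: few samples, many "genes".
# (Assumption: 120 samples, 96 features, 4 classes; not the paper's data.)
X, y = make_classification(
    n_samples=120, n_features=96, n_informative=10,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=36, stratify=y, random_state=0
)

# Standard LLE (Euclidean k-nearest neighbors) followed by an RBF-kernel SVM.
# n_neighbors and n_components are illustrative values, not tuned ones.
model = make_pipeline(
    StandardScaler(),
    LocallyLinearEmbedding(n_neighbors=12, n_components=7, random_state=0),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
model.fit(X_train, y_train)

embedded = model[:-1].transform(X_test)       # test samples in the reduced space
accuracy = model.score(X_test, y_test)        # fraction of correct test predictions
n_support = int(model[-1].n_support_.sum())   # total support vectors in the SVM

print(embedded.shape, round(accuracy, 3), n_support)
```

Swapping `LocallyLinearEmbedding` for `sklearn.decomposition.PCA` in the same pipeline gives a PCA-then-SVM baseline of the kind the tables compare against; the reduced dimension and the number of support vectors are the quantities those tables report.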
Figure 1. The experiments on the SRBCTs dataset: (a) the classifier accuracy of dimensionality reduction in 96 genes selected by Khan; (b) the test error of dimensionality reduction in 96 genes; (c) the classifier accuracy of dimensionality reduction in 20 genes; (d) the test error of dimensionality reduction in 20 genes selected by Nikhil.
Comparison of the three methods on the SRBCT dataset
| Algorithm | Dimensions (96 genes) | Support vectors (96 genes) | Time in sec (96 genes) | Dimensions (20 genes) | Support vectors (20 genes) | Time in sec (20 genes) |
|---|---|---|---|---|---|---|
| SVM | 96 | - | - | 20 | 106 | 4127 |
| PCA-SVM | 29 | 87 | 2672 | 11 | 64 | 1933 |
| LLE-SVM | 14 | 63 | 2102 | 9 | 42 | 1743 |
| KLLE-SVM | 7 | 42 | 1934 | 5 | 31 | 1307 |
Figure 2. The experiments on the lymphoma dataset: (a) the classifier accuracy of dimensionality reduction in 165 genes selected by T-score; (b) the test error of dimensionality reduction in 165 genes; (c) the classifier accuracy of dimensionality reduction in 48 genes; (d) the test error of dimensionality reduction in 48 genes selected by nearest shrunken centroids.
Comparison of the three methods on the lymphoma dataset
| Algorithm | Dimensions (165 genes) | Support vectors (165 genes) | Time in sec (165 genes) | Dimensions (48 genes) | Support vectors (48 genes) | Time in sec (48 genes) |
|---|---|---|---|---|---|---|
| SVM | 165 | - | - | 48 | 124 | 5343 |
| PCA-SVM | 18 | 104 | 2672 | 22 | 83 | 3105 |
| LLE-SVM | 15 | 74 | 2133 | 9 | 56 | 2247 |
| KLLE-SVM | 7 | 56 | 1934 | 5 | 41 | 1766 |