Nicoletta Dessì, Emanuele Pascariello, Barbara Pes.
Abstract
Feature selection has become an essential step in biomarker discovery from high-dimensional genomics data. It is recognized that different feature selection techniques may yield different sets of biomarkers, that is, different groups of genes highly correlated with a given pathological condition, but few direct comparisons quantify these differences systematically. In this paper, we propose a general methodology for comparing the outcomes of different selection techniques in the context of biomarker discovery. The comparison is carried out along two dimensions: (i) measuring the similarity/dissimilarity of the selected gene sets; (ii) evaluating the implications of these differences in terms of both predictive performance and stability of the selected gene sets. As a case study, we considered three benchmarks derived from DNA microarray experiments and conducted a comparative analysis of eight selection methods, representative of different classes of feature selection techniques. Our results show that the proposed approach can provide useful insight into the pattern of agreement among biomarker discovery techniques.
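The two comparison dimensions can be illustrated with a minimal sketch. The abstract does not name the similarity or stability measures, so the Jaccard index and the averaged pairwise overlap used here are assumptions, as are the function names and the gene identifiers; the predictive-performance side (AUC) is left out.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene sets (assumed measure, not confirmed by the abstract)."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def pairwise_similarity(selections):
    """Similarity for every pair of selection methods.

    `selections` maps a method name to the genes it selected.
    """
    return {
        (m1, m2): jaccard(selections[m1], selections[m2])
        for m1, m2 in combinations(sorted(selections), 2)
    }


def stability(subsets):
    """Average pairwise Jaccard index over the gene sets that one method
    selects on different resamplings of the same dataset."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene identifiers and method names, for illustration only.
selections = {
    "method_A": ["g1", "g2", "g3"],
    "method_B": ["g2", "g3", "g4"],
}
similarity = pairwise_similarity(selections)
resampled = [{"g1", "g2"}, {"g1", "g2"}, {"g1", "g3"}]
method_stability = stability(resampled)
```

In this sketch a similarity near 1 means two methods pick largely the same genes, and a stability near 1 means a method keeps picking the same genes when the training data are perturbed.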
Year: 2013 PMID: 24324960 PMCID: PMC3842054 DOI: 10.1155/2013/387673
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Similarity evaluation.
Figure 2. Joint evaluation of stability and predictive performance.
Figure 3. Colon dataset: stability versus number of genes.
Figure 4. Leukemia dataset: stability versus number of genes.
Figure 5. Prostate dataset: stability versus number of genes.
Figure 6. Colon dataset: AUC versus number of genes.
Figure 7. Leukemia dataset: AUC versus number of genes.
Figure 8. Prostate dataset: AUC versus number of genes.