Edward R Dougherty, Marcel Brun.
Abstract
The issue of wide feature-set variability has recently been raised in the context of expression-based classification using microarray data. This paper addresses this concern by demonstrating the natural manner in which many feature sets of a certain size chosen from a large collection of potential features can be so close to being optimal that they are statistically indistinguishable. Feature-set optimality is inherently related to sample size because it only arises on account of the tendency for diminished classifier accuracy as the number of features grows too large for satisfactory design from the sample data. The paper considers optimal feature sets in the framework of a model in which the features are grouped in such a way that intra-group correlation is substantial whereas inter-group correlation is minimal, the intent being to model the situation in which there are groups of highly correlated co-regulated genes and there is little correlation between the co-regulated groups. This is accomplished by using a block model for the covariance matrix that reflects these conditions. Focusing on linear discriminant analysis, we demonstrate how these assumptions can lead to very large numbers of close-to-optimal feature sets.
Keywords: classification; covariance model; feature sets; gene expression
Year: 2007 PMID: 19458767 PMCID: PMC2675502
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Percentage of separated optimal feature sets as a function of the correlation coefficient ρ. First row: K ∼ Cov[9, 3], (a) ρ = ρ, σ ∼ U[0.3, 0.47]; (b) ρ = ρ, σ ∼ U[0.3, 1]; (c) ρ = ρ||, σ ∼ U[0.3, 0.47]; (d) ρ = ρ||, σ ∼ U[0.3, 1]. Second row: K ∼ Cov[10, 2], (e) ρ = ρ, σ ∼ U[0.3, 0.47]; (f) ρ = ρ, σ ∼ U[0.3, 1]; (g) ρ = ρ||, σ ∼ U[0.3, 0.47]; (h) ρ = ρ||, σ ∼ U[0.3, 1]. Third row: K ∼ Cov[10, 5], (i) ρ = ρ, σ ∼ U[0.3, 0.47]; (j) ρ = ρ, σ ∼ U[0.3, 1]; (k) ρ = ρ||, σ ∼ U[0.3, 0.47]; (l) ρ = ρ||, σ ∼ U[0.3, 1].
Figure 2. Scatter plots for model K ∼ Cov[10, 2], ρ = ρ, σ ∼ U[0.3, 0.65], with ρ = 0.999: (a) feature set {X1, X}; (b) feature set {X6, X8}.
Figure 3. Scatter plots for model K ∼ Cov[10, 2], ρ = ρ, σ ∼ U[0.3, 0.65], with ρ = 0.5: (a) feature set {X1, X7}; (b) feature set {X6, X8}.
Figure 4. Cumulative histograms of superior feature-set errors: (a) low Bayes error; (b) high Bayes error.
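The block covariance model described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the equal mean difference across features, and the tolerance are all hypothetical choices made for illustration. It assumes two Gaussian classes with a common covariance matrix and equal priors, in which case the Bayes error of linear discriminant analysis on a feature subset is Φ(−Δ/2), where Δ is the Mahalanobis distance between the class means restricted to that subset. Note that this is the infinite-sample error, so it shows only how block structure produces many near-tied feature sets, not the sample-size effects the paper studies.

```python
import itertools
import math

import numpy as np


def block_covariance(n_features, block_size, rho, sigma):
    """Covariance with correlation rho inside each block and 0 between blocks.

    sigma holds the per-feature standard deviations (hypothetical input).
    Requires 0 <= rho < 1 so that the matrix stays positive definite.
    """
    corr = np.zeros((n_features, n_features))
    for start in range(0, n_features, block_size):
        stop = min(start + block_size, n_features)
        corr[start:stop, start:stop] = rho
    np.fill_diagonal(corr, 1.0)
    sigma = np.asarray(sigma, dtype=float)
    return corr * np.outer(sigma, sigma)


def lda_bayes_error(cov, mean_diff, subset):
    """Bayes error Phi(-Delta/2) for equal priors and a common covariance."""
    idx = list(subset)
    sub_cov = cov[np.ix_(idx, idx)]
    d = np.asarray(mean_diff, dtype=float)[idx]
    delta_sq = float(d @ np.linalg.solve(sub_cov, d))
    # Phi(-x) = 0.5 * erfc(x / sqrt(2)), evaluated at x = Delta / 2.
    return 0.5 * math.erfc(math.sqrt(delta_sq) / (2.0 * math.sqrt(2.0)))


def near_optimal_subsets(cov, mean_diff, size, tol):
    """Return the best error and every subset whose error is within tol of it."""
    n = len(mean_diff)
    errors = {
        s: lda_bayes_error(cov, mean_diff, s)
        for s in itertools.combinations(range(n), size)
    }
    best = min(errors.values())
    close = [s for s, e in errors.items() if e - best <= tol]
    return best, close


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = rng.uniform(0.3, 1.0, size=10)   # range borrowed from the captions
    cov = block_covariance(10, 2, rho=0.5, sigma=sigma)
    mean_diff = np.full(10, 0.5)             # hypothetical class-mean difference
    best, close = near_optimal_subsets(cov, mean_diff, size=2, tol=0.01)
    print(f"best error {best:.4f}; {len(close)} of 45 pairs within 0.01 of it")
```

Raising `tol` or shrinking the spread of `sigma` increases the number of subsets counted as close to optimal, which is the qualitative behavior the abstract describes.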