Jian Guo, Elizaveta Levina, George Michailidis, Ji Zhu.
Abstract
Variable selection for clustering is an important and challenging problem in high-dimensional data analysis. Existing variable selection methods for model-based clustering select informative variables in a "one-in-all-out" manner; that is, a variable is selected if at least one pair of clusters is separable by this variable and removed if it cannot separate any of the clusters. In many applications, however, it is of interest to further establish exactly which clusters are separable by each informative variable. To address this question, we propose a pairwise variable selection method for high-dimensional model-based clustering. The method is based on a new pairwise penalty. Results on simulated and real data show that the new method performs better than alternative approaches that use $\ell_1$ and $\ell_\infty$ penalties and offers better interpretation.
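The contrast between a "one-in-all-out" penalty and a pairwise penalty can be illustrated with a small sketch. This is not the authors' implementation, and the abstract does not state the form of the penalty. The sketch assumes that the pairwise penalty sums, for each variable, the absolute differences between the means of every pair of clusters, and that the data are centered so that a zero mean marks a non-informative variable. The function names and the toy matrix of cluster means are hypothetical.

```python
from itertools import combinations


def l1_penalty(means):
    """Sum of |mu_kj| over clusters k and variables j (centered data assumed)."""
    return sum(abs(m) for row in means for m in row)


def pairwise_penalty(means):
    """Assumed pairwise form: sum over variables j and cluster pairs k < k'
    of |mu_kj - mu_k'j|."""
    n_vars = len(means[0])
    return sum(
        abs(a[j] - b[j])
        for a, b in combinations(means, 2)
        for j in range(n_vars)
    )


def separable_pairs(means, tol=1e-8):
    """For each variable, list the cluster pairs whose means differ by more than tol."""
    n_clusters, n_vars = len(means), len(means[0])
    return {
        j: [
            (k, k2)
            for k, k2 in combinations(range(n_clusters), 2)
            if abs(means[k][j] - means[k2][j]) > tol
        ]
        for j in range(n_vars)
    }


# Hypothetical fitted means: rows are clusters, columns are variables.
# Variable 1 separates cluster 2 from clusters 0 and 1, but not 0 from 1.
means = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, -2.0, 0.0],
]

print(l1_penalty(means))
print(pairwise_penalty(means))
print(separable_pairs(means))
```

Under these assumptions, a penalty on mean differences can set the difference between clusters 0 and 1 to exactly zero for variable 1 while keeping both apart from cluster 2. That per-pair separability is the information a "one-in-all-out" selection does not report.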
Year: 2010 PMID: 19912170 PMCID: PMC2888949 DOI: 10.1111/j.1541-0420.2009.01341.x
Source DB: PubMed Journal: Biometrics ISSN: 0006-341X Impact factor: 2.571