
clustvarsel: A Package Implementing Variable Selection for Gaussian Model-Based Clustering in R.

Luca Scrucca, Adrian E Raftery.

Abstract

Finite mixture modeling provides a framework for cluster analysis based on parsimonious Gaussian mixture models. Variable or feature selection is of particular importance in situations where only a subset of the available variables provides clustering information. Restricting the analysis to that subset enables the selection of a more parsimonious model, yielding more efficient estimates, a clearer interpretation and, often, improved clustering partitions. This paper describes the R package clustvarsel, which performs subset selection for model-based clustering. An improved version of the Raftery and Dean (2006) methodology is implemented in the new release of the package to find the (locally) optimal subset of variables with group/cluster information in a dataset. Search over the solution space is performed using either a stepwise greedy search or a headlong algorithm. Adjustments for speeding up these algorithms are discussed, as well as a parallel implementation of the stepwise search. Usage of the package is presented through the discussion of several data examples.
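
To make the workflow in the abstract concrete, the following is a minimal sketch of a call to the package, assuming that clustvarsel and mclust are installed from CRAN. The simulated data, the variable names (x1 to x4), the seed and the range of mixture components are illustrative assumptions and do not come from the paper; which variables are selected depends on the data.

```r
# Minimal sketch: variable selection for Gaussian model-based clustering.
# Assumption: the CRAN packages mclust and clustvarsel are installed.
library(mclust)
library(clustvarsel)

set.seed(1)
n <- 200
grp <- rep(1:2, each = n / 2)

# Hypothetical data: x1 and x2 carry the group structure (means 0 vs 4),
# while x3 and x4 are pure noise with no clustering information.
x1 <- rnorm(n, mean = c(0, 4)[grp])
x2 <- rnorm(n, mean = c(0, 4)[grp])
x3 <- rnorm(n)
x4 <- rnorm(n)
X <- cbind(x1, x2, x3, x4)

# Stepwise greedy search in the forward direction over 1 to 3 components.
# search = "headlong" selects the headlong algorithm instead, and
# parallel = TRUE requests the parallel version of the stepwise search.
out <- clustvarsel(X, G = 1:3, search = "greedy", direction = "forward")

# Column indices of the variables retained as clustering variables.
print(out$subset)

# Refit a Gaussian mixture on the selected variables only.
fit <- Mclust(X[, out$subset, drop = FALSE], G = 1:3)
print(summary(fit))
```

The greedy search evaluates every candidate variable at each step, so the headlong option may be preferable when the number of variables is large, at the cost of a possibly different local optimum.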


Keywords:  BIC; R; model-based clustering; subset selection

Year:  2018        PMID: 30450020      PMCID: PMC6238955          DOI: 10.18637/jss.v084.i01

Source DB:  PubMed          Journal:  J Stat Softw        ISSN: 1548-7660            Impact factor:   6.440


  6 in total

1.  Variable selection for clustering with Gaussian mixture models.

Authors:  Cathy Maugis; Gilles Celeux; Marie-Laure Martin-Magniette
Journal:  Biometrics       Date:  2009-02-04       Impact factor: 2.571

2.  A framework for feature selection in clustering.

Authors:  Daniela M Witten; Robert Tibshirani
Journal:  J Am Stat Assoc       Date:  2010-06-01       Impact factor: 5.033

3.  mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models.

Authors:  Luca Scrucca; Michael Fop; T Brendan Murphy; Adrian E Raftery
Journal:  R J       Date:  2016-08       Impact factor: 3.984

4.  Comparing Model Selection and Regularization Approaches to Variable Selection in Model-Based Clustering.

Authors:  Gilles Celeux; Marie-Laure Martin-Magniette; Cathy Maugis-Rabusseau; Adrian E Raftery
Journal:  J Soc Fr Statistique (2009)       Date:  2014

5.  Latent Class Analysis Variable Selection.

Authors:  Nema Dean; Adrian E Raftery
Journal:  Ann Inst Stat Math       Date:  2010-02-01       Impact factor: 1.267

6.  Improved initialisation of model-based clustering using Gaussian hierarchical partitions.

Authors:  Luca Scrucca; Adrian E Raftery
Journal:  Adv Data Anal Classif       Date:  2015-10-26
  5 in total

1.  Species richness promotes ecosystem carbon storage: evidence from biodiversity-ecosystem functioning experiments.

Authors:  Shan Xu; Nico Eisenhauer; Olga Ferlian; Jinlong Zhang; Guoyi Zhou; Xiankai Lu; Chengshuai Liu; Deqiang Zhang
Journal:  Proc Biol Sci       Date:  2020-11-25       Impact factor: 5.349

2.  Unobserved classes and extra variables in high-dimensional discriminant analysis.

Authors:  Michael Fop; Pierre-Alexandre Mattei; Charles Bouveyron; Thomas Brendan Murphy
Journal:  Adv Data Anal Classif       Date:  2022-03-01

3.  Association Between Inflammatory Pathways and Phenotypes of Pulmonary Dysfunction Using Cluster Analysis in Persons Living With HIV and HIV-Uninfected Individuals.

Authors:  Shulin Qin; Lena Vodovotz; Ruben Zamora; Meghan Fitzpatrick; Cathy Kessinger; Lawrence Kingsley; Deborah McMahon; Rebecca DeSensi; Joseph K Leader; Kristina Crothers; Laurence Huang; Alison Morris; Mehdi Nouraie
Journal:  J Acquir Immune Defic Syndr       Date:  2020-02-01       Impact factor: 3.771

4.  Disproportionate exposure to urban heat island intensity across major US cities.

Authors:  Angel Hsu; Glenn Sheriff; Tirthankar Chakraborty; Diego Manya
Journal:  Nat Commun       Date:  2021-05-25       Impact factor: 14.919

5.  Expression of Immune Checkpoint Regulators IDO, VISTA, LAG3, and TIM3 in Resected Pancreatic Ductal Adenocarcinoma.

Authors:  Felix C Popp; Ingracia Capino; Joana Bartels; Alexander Damanakis; Jiahui Li; Rabi R Datta; Heike Löser; Yue Zhao; Alexander Quaas; Philipp Lohneis; Christiane J Bruns
Journal:  Cancers (Basel)       Date:  2021-05-29       Impact factor: 6.639

