
A Model-Based Approach to Simultaneous Clustering and Dimensional Reduction of Ordinal Data.

Monia Ranalli, Roberto Rocci.

Abstract

The literature on clustering continuous data is rich, whereas that on categorical data is still limited. In some cases, clustering is made more difficult by noise variables or dimensions that carry no information about the cluster structure and can mask it. This paper proposes a model for simultaneous clustering and dimensionality reduction of ordered categorical data that detects the discriminative dimensions and discards the noise ones. Following the underlying response variable approach, the observed variables are treated as a discretization of underlying first-order latent continuous variables distributed as a Gaussian mixture. To distinguish discriminative from noise dimensions, these latent variables are modeled as linear combinations of two independent sets of second-order latent variables, only one of which contains information about the cluster structure, while the other contains the noise dimensions. The model specification involves multidimensional integrals that make maximum likelihood estimation cumbersome and in some cases infeasible. To overcome this issue, the parameters are estimated through an EM-like algorithm that maximizes a composite log-likelihood based on low-dimensional margins. Applications to real and simulated data show the effectiveness of the proposal.
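
As an illustration of the underlying response variable approach described above, the following minimal Python sketch draws latent continuous variables from a Gaussian mixture and discretizes them into ordered categories. It is not the authors' model or code: the function name, thresholds, means, and covariance are hypothetical, and the second-order latent structure is simplified to a single noise variable whose mean is the same in every cluster.

```python
import numpy as np

def simulate_ordinal_mixture(n, weights, means, cov, thresholds, seed=0):
    """Draw latent Gaussian-mixture variables and cut them into ordered categories.

    All argument names are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    # Cluster label for each observation.
    labels = rng.choice(len(weights), size=n, p=weights)
    # Latent continuous variables: cluster mean plus Gaussian noise.
    noise = rng.multivariate_normal(np.zeros(means.shape[1]), cov, size=n)
    latent = means[labels] + noise
    # Discretize each variable with its own increasing thresholds;
    # searchsorted returns the category index 0..len(thresholds[j]).
    ordinal = np.column_stack([
        np.searchsorted(thresholds[j], latent[:, j])
        for j in range(latent.shape[1])
    ])
    return ordinal, labels

# Hypothetical setup: two clusters, three ordinal variables with four categories.
# The third variable has the same mean in both clusters, so it acts as noise.
thresholds = [np.array([-1.0, 0.0, 1.0])] * 3
means = [[-1.5, -1.5, 0.0], [1.5, 1.5, 0.0]]
ordinal, labels = simulate_ordinal_mixture(500, [0.5, 0.5], means, np.eye(3), thresholds)
```

Only `ordinal` would be observed in practice; the latent variables and cluster labels are what a model of this kind tries to recover.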

Keywords:  composite likelihood; mixture models; reduction ordinal data

Year:  2017        PMID: 28879568     DOI: 10.1007/s11336-017-9578-5

Source DB:  PubMed          Journal:  Psychometrika        ISSN: 0033-3123            Impact factor:   2.500


  9 in total

1.  Factor Analysis of Ordinal Variables: A Comparison of Three Approaches.

Authors:  K G Jöreskog; I Moustaki
Journal:  Multivariate Behav Res       Date:  2001-07-01       Impact factor: 5.923

2.  Modeling the manifolds of images of handwritten digits.

Authors:  G E Hinton; P Dayan; M Revow
Journal:  IEEE Trans Neural Netw       Date:  1997

3.  Variable selection for clustering with Gaussian mixture models.

Authors:  Cathy Maugis; Gilles Celeux; Marie-Laure Martin-Magniette
Journal:  Biometrics       Date:  2009-02-04       Impact factor: 2.571

4.  Mixtures of probabilistic principal component analyzers.

Authors:  M E Tipping; C M Bishop
Journal:  Neural Comput       Date:  1999-02-15       Impact factor: 2.026

5.  Pairwise Likelihood Ratio Tests and Model Selection Criteria for Structural Equation Models with Ordinal Variables.

Authors:  Myrsini Katsikatsou; Irini Moustaki
Journal:  Psychometrika       Date:  2016-10-12       Impact factor: 2.500

6.  A framework for feature selection in clustering.

Authors:  Daniela M Witten; Robert Tibshirani
Journal:  J Am Stat Assoc       Date:  2010-06-01       Impact factor: 5.033

7.  Latent Class Analysis Variable Selection.

Authors:  Nema Dean; Adrian E Raftery
Journal:  Ann Inst Stat Math       Date:  2010-02-01       Impact factor: 1.267

8.  Distinguishing between latent classes and continuous factors with categorical outcomes: Class invariance of parameters of factor mixture models.

Authors:  Gitta Lubke; Michael Neale
Journal:  Multivariate Behav Res       Date:  2008-10       Impact factor: 5.923

9.  Clustering South African households based on their asset status using latent variable models.

Authors:  Damien McParland; Isobel Claire Gormley; Tyler H McCormick; Samuel J Clark; Chodziwadziwa Whiteson Kabudula; Mark A Collinson
Journal:  Ann Appl Stat       Date:  2014-06-01       Impact factor: 2.083

