Space Structure and Clustering of Categorical Data.

Yuhua Qian, Feijiang Li, Jiye Liang, Bing Liu, Chuangyin Dang.   

Abstract

Learning from categorical data plays a fundamental role in areas such as pattern recognition, machine learning, data mining, and knowledge discovery. To effectively discover the group structure inherent in a set of categorical objects, many categorical clustering algorithms have been developed in the literature, among which k-modes-type algorithms are very representative because of their good performance. Nevertheless, there is still much room for improving their clustering performance in comparison with clustering algorithms for numeric data. This may arise from the fact that categorical data lack a clear space structure like that of numeric data. To address this issue, we propose, in this paper, a novel data-representation scheme for categorical data, which maps a set of categorical objects into a Euclidean space. Based on this data-representation scheme, a general framework for space-structure-based categorical clustering algorithms (SBC) is designed. This framework, together with the application of two kinds of dissimilarities, leads to two versions of the SBC-type algorithms. To verify the performance of the SBC-type algorithms, we use four representative k-modes-type algorithms as references. Experiments show that the proposed SBC-type algorithms significantly outperform the k-modes-type algorithms.
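The abstract does not spell out the mapping itself, so the following is only a minimal sketch of the general idea of embedding categorical objects in a Euclidean space and then clustering there. It assumes, for illustration, that each object is represented by the vector of its simple-matching dissimilarities to every object in the data set, and it uses plain k-means with a deterministic farthest-first initialization as the clustering step. The function names (`simple_matching`, `embed`, `kmeans`) and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def simple_matching(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of attributes on which two categorical objects differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def embed(objects: Sequence[Sequence[str]]) -> List[List[float]]:
    """Assumed representation: object i becomes the vector of its
    dissimilarities to all n objects, i.e. a point in R^n."""
    return [[simple_matching(o, p) for p in objects] for o in objects]


def squared_euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(points: List[List[float]], k: int, iters: int = 20) -> List[int]:
    """Plain Lloyd iterations with farthest-first initialization
    (an illustrative choice, not the paper's procedure)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        far = max(points, key=lambda p: min(squared_euclidean(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center in the embedded space.
        labels = [
            min(range(k), key=lambda c: squared_euclidean(p, centers[c]))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data set: three attributes per object (hypothetical values).
objects = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("green", "large", "square"),
]

labels = kmeans(embed(objects), k=2)
print(labels)
```

Once the objects live in a Euclidean space, means and distances are well defined, so any numeric clustering routine could replace the k-means step used here.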

Year: 2015; PMID: 26441455; DOI: 10.1109/TNNLS.2015.2451151

Source DB: PubMed; Journal: IEEE Trans Neural Netw Learn Syst; ISSN: 2162-237X; Impact factor: 10.451


