| Literature DB >> 32730208 |
Heng Tao Shen, Yonghua Zhu, Wei Zheng, Xiaofeng Zhu.
Abstract
Unsupervised feature selection (UFS) is a popular technique of reducing the dimensions of high-dimensional data. Previous UFS methods were often designed with the assumption that the whole information in the data set is observed. However, incomplete data sets that contain unobserved information can be often found in real applications, especially in industry. Thus, these existing UFS methods have a limitation on conducting feature selection on incomplete data. On the other hand, most existing UFS methods did not consider the sample importance for feature selection, i.e., different samples have various importance. As a result, the constructed UFS models easily suffer from the influence of outliers. This article investigates a new UFS method for conducting UFS on incomplete data sets to investigate the abovementioned issues. Specifically, the proposed method deals with unobserved information by using an indicator matrix to filter it out the process of feature selection and reduces the influence of outliers by employing the half-quadratic minimization technique to automatically assigning outliers with small or even zero weights and important samples with large weights. This article further designs an alternative optimization strategy to optimize the proposed objective function as well as theoretically and experimentally prove the convergence of the proposed optimization strategy. Experimental results on both real and synthetic incomplete data sets verified the effectiveness of the proposed method compared with previous methods, in terms of clustering performance on the low-dimensional space of the high-dimensional data.Year: 2021 PMID: 32730208 DOI: 10.1109/TNNLS.2020.3009632
Source DB: PubMed Journal: IEEE Trans Neural Netw Learn Syst ISSN: 2162-237X Impact factor: 10.451