Yichen Cheng, Xinlei Wang, Yusen Xia.
Abstract
We propose a novel supervised dimension reduction method, called supervised t-distributed stochastic neighbor embedding (St-SNE), which achieves dimension reduction by preserving the similarities of data points in both feature and outcome spaces. The proposed method can be used for both prediction and visualization tasks, with the ability to handle high-dimensional data. We show through a variety of datasets that, when compared with a comprehensive list of existing methods, St-SNE has superior prediction performance in the ultra-high-dimensional setting, where the number of features p exceeds the sample size n, and has competitive performance in the p ≤ n setting. We also show that St-SNE is a competitive visualization tool that is capable of capturing within-cluster variations. In addition, we propose a penalized Kullback-Leibler divergence criterion to automatically select the reduced dimension size k for St-SNE.
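For orientation, the sketch below computes the Kullback-Leibler divergence that standard (unsupervised) t-SNE minimizes between pairwise similarities P in the original space and Student-t similarities Q in the embedding. It is not the St-SNE objective: the abstract does not say how outcome-space similarities enter or what form the penalty for selecting k takes, so neither is modeled here. The function names and the toy data are hypothetical and chosen for illustration only.

```python
import math


def student_t_similarities(Y):
    """Pairwise similarities q_ij of standard t-SNE for embedded points Y.

    Uses the Student-t kernel with one degree of freedom, normalized over
    all ordered pairs i != j. (Assumption: plain t-SNE, not St-SNE.)
    """
    n = len(Y)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
                w[i][j] = 1.0 / (1.0 + d2)
    total = sum(sum(row) for row in w)
    return [[v / total for v in row] for row in w]


def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over off-diagonal pairs; eps guards against log(0)."""
    n = len(P)
    return sum(
        P[i][j] * math.log((P[i][j] + eps) / (Q[i][j] + eps))
        for i in range(n)
        for j in range(n)
        if i != j and P[i][j] > 0
    )


# Toy embedding of three points in k = 2 dimensions (illustrative values).
Y = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
Q = student_t_similarities(Y)

# A hypothetical target P: uniform similarity over the six ordered pairs.
P_uniform = [[0.0 if i == j else 1.0 / 6 for j in range(3)] for i in range(3)]

kl_self = kl_divergence(Q, Q)             # zero: the embedding matches itself
kl_uniform = kl_divergence(P_uniform, Q)  # positive: Q is not uniform
```

A criterion for choosing k, as the abstract proposes, would add a penalty that grows with k to a divergence of this kind; the specific penalty is given in the paper, not here.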
Keywords: classification; dimension size estimation; supervised dimension reduction; ultra-high dimension; visualization
Year: 2020 PMID: 34354339 PMCID: PMC8330414 DOI: 10.1287/ijoc.2020.0961
Source DB: PubMed Journal: INFORMS J Comput ISSN: 1091-9856 Impact factor: 2.276