Hai-Yun Wang, Jian-Ping Zhao, Chun-Hou Zheng.
Abstract
Clustering is a common way to identify cell types in single-cell analysis, but the growing size of scRNA-seq datasets makes single-cell clustering increasingly challenging. There is therefore an urgent need for a faster and more accurate clustering method for large-scale scRNA-seq data. In this paper, we propose a new method for single-cell clustering. First, a count matrix is constructed through normalization and gene filtration. Second, the raw gene expression matrix is projected onto a feature space built by a secondary construction of the feature space based on UMAP (Uniform Manifold Approximation and Projection). Third, the resulting low-dimensional matrix is randomly divided, according to a given proportion, into two sub-matrices, one for clustering and one for classification. Finally, one subset is clustered with the k-means algorithm, and the other subset is then classified with the k-nearest neighbor algorithm based on the clustering results. Experimental results show that our method clusters scRNA-seq datasets effectively.
Keywords: Classifying; Clustering; Spearman Correlation; Uniform Manifold Approximation and Projection; scRNA-seq
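The last two steps of the abstract (random split, k-means on one subset, k-nearest neighbor classification of the other) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the low-dimensional embedding has already been computed, it does not reproduce the UMAP-based feature space construction, and the function names, the 0.5 split proportion, k = 5, and the farthest-first k-means seeding are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a and rows of b (dense, fine for a sketch)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)


def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's k-means. Farthest-first seeding is an assumption, not from the paper."""
    rng = np.random.default_rng(seed)
    # Start from one random point, then repeatedly add the point that is
    # farthest from all centres chosen so far.
    centers = [points[rng.integers(len(points))]]
    for _ in range(n_clusters - 1):
        nearest_center_dist = pairwise_dist(points, np.array(centers)).min(axis=1)
        centers.append(points[np.argmax(nearest_center_dist)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        labels = np.argmin(pairwise_dist(points, centers), axis=1)
        new_centers = centers.copy()
        for c in range(n_clusters):
            members = points[labels == c]
            if len(members) > 0:  # keep the old centre if a cluster empties out
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the converged centres.
    return np.argmin(pairwise_dist(points, centers), axis=1)


def knn_predict(train_points, train_labels, query_points, k=5):
    """Label each query point by majority vote among its k nearest training points."""
    nearest = np.argsort(pairwise_dist(query_points, train_points), axis=1)[:, :k]
    votes = train_labels[nearest]  # shape: (n_query, k)
    n_classes = train_labels.max() + 1
    return np.array(
        [np.bincount(row, minlength=n_classes).argmax() for row in votes], dtype=int
    )


def cluster_then_classify(embedding, n_clusters, cluster_fraction=0.5, k=5, seed=0):
    """Randomly split cells, cluster one part with k-means, label the rest with kNN."""
    rng = np.random.default_rng(seed)
    n_cells = embedding.shape[0]
    order = rng.permutation(n_cells)
    n_cluster_cells = int(round(cluster_fraction * n_cells))
    cluster_idx, classify_idx = order[:n_cluster_cells], order[n_cluster_cells:]

    labels = np.empty(n_cells, dtype=int)
    labels[cluster_idx] = kmeans(embedding[cluster_idx], n_clusters, seed=seed)
    labels[classify_idx] = knn_predict(
        embedding[cluster_idx], labels[cluster_idx], embedding[classify_idx], k=k
    )
    return labels


# Synthetic stand-in for a 2-D embedding: three well-separated groups of 60 cells.
demo_rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
true_labels = np.repeat(np.arange(3), 60)
embedding = blob_centers[true_labels] + demo_rng.normal(scale=0.5, size=(180, 2))

labels = cluster_then_classify(embedding, n_clusters=3, cluster_fraction=0.5, k=5, seed=0)
```

Only the cells in the clustering subset go through the iterative k-means step; the remaining cells each need a single nearest-neighbor lookup, which is one plausible reading of why the split helps on large datasets. The dense distance matrices used here are for readability only and would need a neighbor index or batching at scale.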
Year: 2021 PMID: 33475958 DOI: 10.1007/s12539-020-00411-6
Source DB: PubMed Journal: Interdiscip Sci ISSN: 1867-1462 Impact factor: 2.233