Chen Xu1, Zhengchang Su1. 1. Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA.
Abstract
MOTIVATION: The recent advance of single-cell technologies has brought new insights into complex biological phenomena. In particular, genome-wide single-cell measurements such as transcriptome sequencing enable the characterization of cellular composition as well as functional variation in homogenic cell populations. An important step in the single-cell transcriptome analysis is to group cells that belong to the same cell types based on gene expression patterns. The corresponding computational problem is to cluster a noisy high dimensional dataset with substantially fewer objects (cells) than the number of variables (genes). RESULTS: In this article, we describe a novel algorithm named shared nearest neighbor (SNN)-Cliq that clusters single-cell transcriptomes. SNN-Cliq utilizes the concept of shared nearest neighbor that shows advantages in handling high-dimensional data. When evaluated on a variety of synthetic and real experimental datasets, SNN-Cliq outperformed the state-of-the-art methods tested. More importantly, the clustering results of SNN-Cliq reflect the cell types or origins with high accuracy. AVAILABILITY AND IMPLEMENTATION: The algorithm is implemented in MATLAB and Python. The source code can be downloaded at http://bioinfo.uncc.edu/SNNCliq.
MOTIVATION: The recent advance of single-cell technologies has brought new insights into complex biological phenomena. In particular, genome-wide single-cell measurements such as transcriptome sequencing enable the characterization of cellular composition as well as functional variation in homogenic cell populations. An important step in the single-cell transcriptome analysis is to group cells that belong to the same cell types based on gene expression patterns. The corresponding computational problem is to cluster a noisy high dimensional dataset with substantially fewer objects (cells) than the number of variables (genes). RESULTS: In this article, we describe a novel algorithm named shared nearest neighbor (SNN)-Cliq that clusters single-cell transcriptomes. SNN-Cliq utilizes the concept of shared nearest neighbor that shows advantages in handling high-dimensional data. When evaluated on a variety of synthetic and real experimental datasets, SNN-Cliq outperformed the state-of-the-art methods tested. More importantly, the clustering results of SNN-Cliq reflect the cell types or origins with high accuracy. AVAILABILITY AND IMPLEMENTATION: The algorithm is implemented in MATLAB and Python. The source code can be downloaded at http://bioinfo.uncc.edu/SNNCliq.
Authors: Guoji Guo; Mikael Huss; Guo Qing Tong; Chaoyang Wang; Li Li Sun; Neil D Clarke; Paul Robson Journal: Dev Cell Date: 2010-04-20 Impact factor: 12.270
Authors: Daniel Ramsköld; Shujun Luo; Yu-Chieh Wang; Robin Li; Qiaolin Deng; Omid R Faridani; Gregory A Daniels; Irina Khrebtukova; Jeanne F Loring; Louise C Laurent; Gary P Schroth; Rickard Sandberg Journal: Nat Biotechnol Date: 2012-08 Impact factor: 54.908
Authors: Q Tian Wang; Karolina Piotrowska; Maria Anna Ciemerych; Ljiljana Milenkovic; Matthew P Scott; Ronald W Davis; Magdalena Zernicka-Goetz Journal: Dev Cell Date: 2004-01 Impact factor: 12.270
Authors: Christopher A Jackson; Dayanne M Castro; Richard Bonneau; David Gresham; Giuseppe-Antonio Saldi Journal: Elife Date: 2020-01-27 Impact factor: 8.140