Sincell: an R/Bioconductor package for statistical assessment of cell-state hierarchies from single-cell RNA-seq.

Miguel Juliá, Amalio Telenti, Antonio Rausell.

Abstract

Cell differentiation processes are achieved through a continuum of hierarchical intermediate cell states that might be captured by single-cell RNA-seq. Existing computational approaches for the assessment of cell-state hierarchies from single-cell data can be formalized under a general framework composed of (i) a metric to assess cell-to-cell similarities (with or without a dimensionality reduction step) and (ii) a graph-building algorithm (optionally making use of a cell clustering step). The Sincell R package implements a methodological toolbox allowing flexible workflows under such a framework. Furthermore, Sincell contributes new algorithms to provide cell-state hierarchies with statistical support while accounting for stochastic factors in single-cell RNA-seq. Graphical representations and functional association tests are provided to interpret hierarchies. The functionalities of Sincell are illustrated in a real case study, which demonstrates its ability to discriminate noisy from stable cell-state hierarchies.
AVAILABILITY AND IMPLEMENTATION: Sincell is an open-source R/Bioconductor package available at http://bioconductor.org/packages/sincell. A detailed manual and a vignette are provided with the package.
CONTACT: antonio.rausell@isb-sib.ch
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press.

Year:  2015        PMID: 26099264      PMCID: PMC4595899          DOI: 10.1093/bioinformatics/btv368

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Unbiased profiling of individual cells through single-cell RNA-seq allows the assessment of heterogeneity of transcriptional states within a cell population (Wu et al.). In the context of a cell population's differentiation or activation process, such transcriptional heterogeneity might reflect a continuum of intermediate cell states and lineages resulting from dynamic regulatory programs. Such a continuum might be captured through the assessment of cell-state hierarchies, where each cell is placed in a relative ordering in the transcriptional landscape. Additionally, statistical support should be provided to discriminate reliable hierarchies from stochastic heterogeneity arising from both technical and biological factors. A number of algorithms have been used to assess cell-state hierarchies from single-cell data (Amir et al.; Bendall et al.; Buettner et al.; Jaitin et al.; Moignard et al.; Qiu et al.; Trapnell et al.). These approaches can be formalized under a general framework (Supplementary Table S1). Here we present Sincell, an R/Bioconductor package where the various building blocks of that general workflow are extended and combined (Fig. 1). Notably, Sincell implements algorithms to provide statistical support to the cell-state hierarchies derived from single-cell RNA-seq. The package is complemented with graphical representations and functional association tests to help interpret the results.
Fig. 1. Overall workflow for the statistical assessment of cell-state hierarchies implemented by the Sincell R package. Dashed arrows correspond to optional steps in the analysis.

2 Description

As input, Sincell requires an expression matrix with user-defined normalized gene expression levels per single cell (Fig. 1). Variance stabilization is recommended (e.g. through log transformation). First, a cell-to-cell distance matrix is calculated using a metric of choice. Sincell provides both linear and non-linear distances: Euclidean, Mutual Information, L1 distance, Pearson and Spearman correlation. Optionally, the distance matrix may be obtained from the leading dimensions of a data reduction technique, performed to keep the most informative parts of the data while excluding noise. Both linear and non-linear algorithms are provided: Principal Component Analysis, Independent Component Analysis, t-Distributed Stochastic Neighbor Embedding and non-metric Multidimensional Scaling.

Second, a cell-state hierarchy is obtained by applying a graph-building algorithm to the cell-to-cell distance matrix. Graph-building algorithms may consider cells either individually or in clusters of highly similar cells. Sincell provides different clustering methods (K-Mutual Nearest Neighbors, k-medoids, agglomerative clustering, etc.) and graph-building algorithms (Minimum Spanning Tree, MST; Maximum Similarity Spanning Tree, SST; and Iterative Mutual Clustering Graph, IMC; Supplementary Text).

Stochastic technical and biological factors may drive the cell-state heterogeneity observed in single-cell RNA-seq data. Additionally, hierarchies derived from experiments with a low number of individual cells (e.g. 96 cells when using a Fluidigm C1™ Single-Cell Auto Prep System) are more susceptible to noise artifacts than those from experiments profiling thousands of individual cells (e.g. flow cytometry data). Sincell implements two algorithms to discriminate reliable hierarchies from noise-driven ones. The first strategy relies on a gene resampling procedure. The second is based on random cell substitution with in silico-generated cell replicates. These replicates are built by perturbing observed gene expression levels with random noise, following patterns of stochasticity observed in single-cell RNA-seq (Supplementary Text). Either approach generates a population of hierarchies; their similarities to the reference hierarchy form a distribution that describes how stable the hierarchy is against changes in the data.

To help interpret hierarchies in functional terms, Sincell provides graphical representations of cell-to-cell similarities in low-dimensional space as well as different graph displays of hierarchies, in which cells can be colored, e.g. by the expression levels of a marker of choice. Furthermore, Sincell implements an algorithm to determine the statistical significance of the association of the hierarchy with the expression levels of a given gene set (Supplementary Text). Gene set collections (e.g. Gene Ontology terms) can be systematically evaluated.
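
The distance, graph-building and gene-resampling steps described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the Sincell API: all function names, the toy data and the 50% resampling fraction are assumptions made for illustration. It uses the Euclidean distance and an MST built with Prim's algorithm. As a stand-in for the similarity measure that Sincell applies (detailed in the Supplementary Text and not reproduced here), it scores each resampled hierarchy by the Jaccard overlap of its edges with those of the reference hierarchy.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(cells):
    """Cell-to-cell distance matrix; `cells` is a list of per-cell gene vectors."""
    n = len(cells)
    return [[euclidean(cells[i], cells[j]) for j in range(n)] for i in range(n)]


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix.

    Returns the tree as a set of undirected edges (frozensets of cell indices).
    """
    n = len(dist)
    in_tree = {0}
    edges = set()
    while len(in_tree) < n:
        # Cheapest edge joining a cell in the tree to a cell outside it.
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        in_tree.add(j)
        edges.add(frozenset((i, j)))
    return edges


def gene_resampling_support(cells, n_iter=100, fraction=0.5, seed=0):
    """Rebuild the hierarchy on random gene subsets and compare to the reference.

    Returns the reference MST and one Jaccard edge-overlap score per iteration
    (an illustrative stability measure, not the one used by Sincell).
    """
    rng = random.Random(seed)
    n_genes = len(cells[0])
    k = max(1, int(fraction * n_genes))
    reference = minimum_spanning_tree(distance_matrix(cells))
    scores = []
    for _ in range(n_iter):
        genes = rng.sample(range(n_genes), k)
        subset = [[cell[g] for g in genes] for cell in cells]
        tree = minimum_spanning_tree(distance_matrix(subset))
        scores.append(len(tree & reference) / len(tree | reference))
    return reference, scores


# Hypothetical toy data: 4 cells x 6 genes, log-scale expression values.
toy_cells = [
    [0.0, 0.1, 0.0, 0.2, 0.1, 0.0],
    [1.0, 1.1, 0.9, 1.0, 1.2, 1.0],
    [2.0, 2.1, 2.0, 1.9, 2.0, 2.2],
    [3.0, 2.9, 3.1, 3.0, 3.0, 3.1],
]
reference, scores = gene_resampling_support(toy_cells, n_iter=20)
print(sorted(sorted(edge) for edge in reference))
print(min(scores), max(scores))
```

In this sketch, scores concentrated near 1 indicate a hierarchy that is recovered from most gene subsets, whereas a broad or low distribution would point to a noise-driven hierarchy.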

3 Application

The Sincell R package includes a detailed vignette illustrating all functionalities using real single-cell RNA-seq data. We use data from Trapnell et al. quantifying gene expression levels in differentiating myoblasts at 0, 24, 48 and 72 h. Here, we analyze each time point separately and evaluate the statistical evidence of cell-state heterogeneity within each of them (Supplementary Fig. 1). Our results show that early differentiation time points produce unstable hierarchies, suggesting a low level of cell-state heterogeneity. In contrast, late differentiation time points produce statistically significant hierarchies that reflect cell-state diversity along the differentiation process (Supplementary Text).

4 Discussion

The landscape of computational approaches to assess cell-state hierarchies from single-cell data is far from being fully explored. The diversity of biological studies and the rapid evolution of single-cell technologies require a comprehensive toolbox in which users can easily tailor workflows and compare alternative methods and assumptions. Furthermore, cell-state hierarchies should be statistically supported before being used as input in subsequent analyses. The Sincell R package addresses these needs by providing a general analysis framework, new algorithms for statistical support, and tools for the functional interpretation of cell-state hierarchies.

Funding

This research was supported by the European Union's Seventh Framework Programme FP7/2007-2013 under grant agreement no. 305762 and the Swiss National Science Foundation grant no. 149724. Computations were performed at the Vital-IT (http://www.vital-it.ch) Center for high-performance computing of the SIB Swiss Institute of Bioinformatics.

Conflict of interest: none declared.

References

1.  Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development.

Authors:  Sean C Bendall; Kara L Davis; El-Ad David Amir; Michelle D Tadmor; Erin F Simonds; Tiffany J Chen; Daniel K Shenfeld; Garry P Nolan; Dana Pe'er
Journal:  Cell       Date:  2014-04-24       Impact factor: 41.582

2.  Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.

Authors:  Florian Buettner; Kedar N Natarajan; F Paolo Casale; Valentina Proserpio; Antonio Scialdone; Fabian J Theis; Sarah A Teichmann; John C Marioni; Oliver Stegle
Journal:  Nat Biotechnol       Date:  2015-01-19       Impact factor: 54.908

3.  viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia.

Authors:  El-ad David Amir; Kara L Davis; Michelle D Tadmor; Erin F Simonds; Jacob H Levine; Sean C Bendall; Daniel K Shenfeld; Smita Krishnaswamy; Garry P Nolan; Dana Pe'er
Journal:  Nat Biotechnol       Date:  2013-05-19       Impact factor: 54.908

4.  Quantitative assessment of single-cell RNA-sequencing methods.

Authors:  Angela R Wu; Norma F Neff; Tomer Kalisky; Piero Dalerba; Barbara Treutlein; Michael E Rothenberg; Francis M Mburu; Gary L Mantalas; Sopheak Sim; Michael F Clarke; Stephen R Quake
Journal:  Nat Methods       Date:  2013-10-20       Impact factor: 28.547

5.  Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types.

Authors:  Diego Adhemar Jaitin; Ephraim Kenigsberg; Hadas Keren-Shaul; Naama Elefant; Franziska Paul; Irina Zaretsky; Alexander Mildner; Nadav Cohen; Steffen Jung; Amos Tanay; Ido Amit
Journal:  Science       Date:  2014-02-14       Impact factor: 47.728

6.  Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE.

Authors:  Peng Qiu; Erin F Simonds; Sean C Bendall; Kenneth D Gibbs; Robert V Bruggner; Michael D Linderman; Karen Sachs; Garry P Nolan; Sylvia K Plevritis
Journal:  Nat Biotechnol       Date:  2011-10-02       Impact factor: 54.908

7.  The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells.

Authors:  Cole Trapnell; Davide Cacchiarelli; Jonna Grimsby; Prapti Pokharel; Shuqiang Li; Michael Morse; Niall J Lennon; Kenneth J Livak; Tarjei S Mikkelsen; John L Rinn
Journal:  Nat Biotechnol       Date:  2014-03-23       Impact factor: 54.908

8.  Decoding the regulatory network of early blood development from single-cell gene expression measurements.

Authors:  Victoria Moignard; Steven Woodhouse; Laleh Haghverdi; Andrew J Lilly; Yosuke Tanaka; Adam C Wilkinson; Florian Buettner; Iain C Macaulay; Wajid Jawaid; Evangelia Diamanti; Shin-Ichi Nishikawa; Nir Piterman; Valerie Kouskoff; Fabian J Theis; Jasmin Fisher; Berthold Göttgens
Journal:  Nat Biotechnol       Date:  2015-02-09       Impact factor: 54.908
