
ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data.

Vincent Gardeux, Fabrice P A David, Adrian Shajkofci, Petra C Schwalie, Bart Deplancke.

Abstract

MOTIVATION: Single-cell RNA-sequencing (scRNA-seq) allows whole transcriptome profiling of thousands of individual cells, enabling the molecular exploration of tissues at the cellular level. Such analytical capacity is of great interest to many research groups in the world, yet these groups often lack the expertise to handle complex scRNA-seq datasets.
RESULTS: We developed a fully integrated, web-based platform aimed at the complete analysis of scRNA-seq data post genome alignment: from the parsing, filtering and normalization of the input count data files, to the visual representation of the data, identification of cell clusters, differentially expressed genes (including cluster-specific marker genes), and functional gene set enrichment. This Automated Single-cell Analysis Pipeline (ASAP) combines a wide range of commonly used algorithms with sophisticated visualization tools. In contrast to existing scRNA-seq analysis platforms, ASAP lets researchers (including those lacking computational expertise) interact with the data in a straightforward fashion and in real time. Furthermore, given the overlap between scRNA-seq and bulk RNA-seq analysis workflows, ASAP should conceptually be broadly applicable to any RNA-seq dataset. As a validation, we demonstrate how ASAP can be used to readily reproduce the results from a single-cell study of 91 mouse cells involving five distinct cell types.
AVAILABILITY AND IMPLEMENTATION: The tool is freely available at asap.epfl.ch and R/Python scripts are available at github.com/DeplanckeLab/ASAP.
CONTACT: bart.deplancke@epfl.ch.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2017        PMID: 28541377      PMCID: PMC5870842          DOI: 10.1093/bioinformatics/btx337

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Several bioinformatic platforms have been developed that aim to lower the entry barrier to ‘-omic’ types of analyses (Afgan et al.; Reich et al.). Related efforts include pipelines dedicated to single-cell analysis, such as SINCERA (Guo et al.), SEURAT (Satija et al.), MAST (Finak et al.), PAGODA (Fan et al.) and SC3 (Kiselev et al.). However, these pipelines are embedded in R, which still makes them difficult to use without programming skills. For example, SC3 offers preprocessing through the scater package (McCarthy et al.), but its interactive component focuses mainly on clustering. Indeed, most of the other available pipelines lack an interactive visualization component as well as integration of a broad range of available single-cell data processing algorithms. In response, several valuable platforms that integrate graphics components have recently been developed. These include SCell (Diaz et al.), Sincell (Julia et al.), FastProject (DeTomaso and Yosef, 2016) and START (Nelson et al.). These tools are stand-alone applications and cover the whole RNA-seq analysis pipeline more comprehensively, yet they still lack key features. For example, FastProject performs filtering and visualization but no further analysis. SCell implements RUVg normalization only, and its visualization is limited to PCA. Moreover, SCell lacks marker gene identification (based on differential gene expression analysis) and functional gene set enrichment capacities. Finally, these pipelines require local installation of the software, which can be time-consuming or even daunting. To alleviate these constraints, we developed ASAP, a fully integrated, web-based pipeline aimed at the complete analysis of scRNA-seq data post genome alignment. Our choice to make ASAP completely web-based was motivated by the fact that fewer users are inclined to manually install and update their tools, which is no longer required with web 2.0 software.
ASAP allows the user to easily select and compare common as well as single-cell-specific algorithms, and provides an interactive visualization of the results. ASAP supports users in the data interpretation process through its speed, running the whole analysis pipeline in minutes, and by providing on-the-go visualization, clustering, differential gene expression analysis, and enrichment functionality. To our knowledge, ASAP is currently the only tool that combines in-depth analysis features and sophisticated visualization for single-cell data in a single platform.

2 Materials and methods

ASAP is a web-based application written in Ruby on Rails. Its core structure is completely independent of the currently hosted web applications for such analyses (which are mostly coded in R/Shiny). This effectively makes the platform autonomous and allows the implementation of any tool independent of its source language. Currently, the server runs code in R, Python and Java. This process is invisible to the user, who only requires a web browser and does not need to install any development tool. The current list of methods included in ASAP is shown in Figure 1 and detailed in Supplementary Table S1. Current and past versions are visible in the ‘Help’ page of the website (ASAP is versioned according to tool versions).
Fig. 1. ASAP pipeline. The figure depicts the complete pipeline, including tools, that is implemented in ASAP. The user starts by uploading a count matrix (or a normalized matrix) of gene expression after which either the default pipeline or different filtering algorithms can be selected. After the normalization step, the user can apply different dimensionality reduction methods to visualize the data in 2D or 3D. The user can interactively select samples, or run clustering algorithms to perform differential gene expression analysis. Finally, the selected gene list can be analyzed for enrichment in biological modules or pathways such as the Gene Ontology or KEGG. All tools are referenced in Supplementary Table S1.
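The first steps of the pipeline in Fig. 1 (filtering, normalization, dimensionality reduction and clustering) can be illustrated with a minimal Python sketch. This is not ASAP's code: the function names, the detection threshold, the log-CPM normalization, the SVD-based PCA, the farthest-first k-means seeding and the toy matrix are all assumptions chosen for brevity, standing in for the tools listed in Supplementary Table S1.

```python
import numpy as np


def filter_genes(counts, min_cells=2):
    """Keep genes (rows) detected (count > 0) in at least `min_cells` cells."""
    keep = (counts > 0).sum(axis=1) >= min_cells
    return counts[keep], keep


def normalize_log_cpm(counts):
    """Scale each cell (column) to counts per million, then apply log2(x + 1)."""
    lib_size = counts.sum(axis=0, keepdims=True)
    return np.log2(counts / lib_size * 1e6 + 1.0)


def pca_embed(expr, n_components=2):
    """Return cell coordinates (cells x components) from an SVD-based PCA."""
    cells_by_genes = expr.T
    centered = cells_by_genes - cells_by_genes.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-first seeding (an assumption)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Hypothetical toy matrix: 7 genes (rows) x 6 cells (columns), two cell types.
toy_counts = np.array([
    [50, 60, 55, 0, 0, 0],
    [40, 45, 52, 0, 0, 0],
    [0, 0, 0, 60, 50, 55],
    [0, 0, 0, 45, 40, 52],
    [10, 12, 11, 10, 9, 12],
    [0, 0, 0, 0, 0, 0],      # never detected, removed by the filter
    [0, 0, 3, 0, 0, 0],      # detected in one cell only, removed by the filter
], dtype=float)

filtered, kept_mask = filter_genes(toy_counts, min_cells=2)
embedding = pca_embed(normalize_log_cpm(filtered), n_components=2)
clusters = kmeans(embedding, k=2)
print(kept_mask.sum(), "genes kept; cluster labels:", clusters)
```

In this toy example the first three cells and the last three cells end up in different clusters. The resulting labels are the kind of grouping that a downstream differential expression and enrichment step would take as input.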

The current implementation of ASAP relies on the delayed::job framework, which automatically creates and queues a job whenever the user asks to run a particular method. This allows the application to scale to different IT architectures and prevents major slowdowns of the website. Job execution time still depends on the number of concurrent users and, above all, on the cores and RAM available on the server that hosts ASAP. ASAP is fully compatible with the latest versions of Chrome, Mozilla Firefox and Safari. Uploaded data are protected by an anonymous registration system that keeps each user's data private. A sandbox also allows any user to analyze the example project or upload their own data without prior registration; however, these data are destroyed when the user's session ends.
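The queue-and-worker pattern described above can be sketched in a few lines of Python. This is only an in-memory analogy built on the standard library: it does not use or reproduce the delayed::job API (which is a Ruby framework), and the `JobQueue` class, its method names and the job identifiers are hypothetical.

```python
import queue
import threading


class JobQueue:
    """Minimal in-memory job queue: requests are enqueued, workers run them."""

    def __init__(self, n_workers=2):
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self.results = {}  # job_id -> ("done", value) or ("failed", message)
        for _ in range(n_workers):
            threading.Thread(target=self._work, daemon=True).start()

    def enqueue(self, job_id, func, *args):
        """Register a job and return immediately, so the caller is not blocked."""
        self._jobs.put((job_id, func, args))

    def _work(self):
        while True:
            job_id, func, args = self._jobs.get()
            try:
                outcome = ("done", func(*args))
            except Exception as exc:  # a failing job must not kill the worker
                outcome = ("failed", repr(exc))
            with self._lock:
                self.results[job_id] = outcome
            self._jobs.task_done()

    def wait(self):
        """Block until every queued job has been processed."""
        self._jobs.join()


jobs = JobQueue(n_workers=2)
jobs.enqueue("normalization", sum, [1, 2, 3])
jobs.enqueue("broken_step", lambda: 1 / 0)
jobs.wait()
print(jobs.results)
```

The number of workers plays the role of the server's available cores: with more jobs than workers, requests simply wait in the queue instead of slowing down the process that accepts them.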

3 Results

As a proof of concept, we re-analyzed data from Dueck et al., in which scRNA-seq was used to study gene expression variation across five mouse cell types, involving 91 cells. We demonstrate that ASAP is capable of replicating the main findings of this study in minutes and in a straightforward fashion (Supplementary Figs S1–S14). We also made these data available as a demo study on the ASAP front page, which is accessible without registration. It is important to note that, although ASAP is primarily dedicated to single-cell analysis, most of the tools can be employed for bulk RNA-seq analysis as well, which makes the pipeline more versatile.
ASAP will be further developed, as we commit to adding more functionalities and species handling on a continuous basis. We also plan to add automatic report generation, which will summarize the employed methods together with figures, versions, citations and parameters. The database for functional enrichment analysis will continue to be updated automatically through a CRON job, and more databases will be added to cover links to oncogenes and drugs, as well as additional species.

Funding

This work has been supported by funds from the Swiss National Science Foundation (#31003A_162735 and #IZLIZ3_156815) and by Institutional support from the EPFL and Human Frontier Science Program LT001032/2013 (to PCS). Conflict of Interest: none declared.
References

1.  GenePattern 2.0.

Authors:  Michael Reich; Ted Liefeld; Joshua Gould; Jim Lerner; Pablo Tamayo; Jill P Mesirov
Journal:  Nat Genet       Date:  2006-05       Impact factor: 38.330

2.  The START App: a web-based RNAseq analysis and visualization resource.

Authors:  Jonathan W Nelson; Jiri Sklenar; Anthony P Barnes; Jessica Minnier
Journal:  Bioinformatics       Date:  2017-02-01       Impact factor: 6.937

3.  SCell: integrated analysis of single-cell RNA-seq data.

Authors:  Aaron Diaz; Siyuan J Liu; Carmen Sandoval; Alex Pollen; Tom J Nowakowski; Daniel A Lim; Arnold Kriegstein
Journal:  Bioinformatics       Date:  2016-04-19       Impact factor: 6.937

4.  Spatial reconstruction of single-cell gene expression data.

Authors:  Rahul Satija; Jeffrey A Farrell; David Gennert; Alexander F Schier; Aviv Regev
Journal:  Nat Biotechnol       Date:  2015-04-13       Impact factor: 54.908

5.  FastProject: a tool for low-dimensional analysis of single-cell RNA-Seq data.

Authors:  David DeTomaso; Nir Yosef
Journal:  BMC Bioinformatics       Date:  2016-08-23       Impact factor: 3.169

6.  Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis.

Authors:  Jean Fan; Neeraj Salathia; Rui Liu; Gwendolyn E Kaeser; Yun C Yung; Joseph L Herman; Fiona Kaper; Jian-Bing Fan; Kun Zhang; Jerold Chun; Peter V Kharchenko
Journal:  Nat Methods       Date:  2016-01-18       Impact factor: 28.547

7.  The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update.

Authors:  Enis Afgan; Dannon Baker; Marius van den Beek; Daniel Blankenberg; Dave Bouvier; Martin Čech; John Chilton; Dave Clements; Nate Coraor; Carl Eberhard; Björn Grüning; Aysam Guerler; Jennifer Hillman-Jackson; Greg Von Kuster; Eric Rasche; Nicola Soranzo; Nitesh Turaga; James Taylor; Anton Nekrutenko; Jeremy Goecks
Journal:  Nucleic Acids Res       Date:  2016-05-02       Impact factor: 16.971

8.  Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R.

Authors:  Davis J McCarthy; Kieran R Campbell; Aaron T L Lun; Quin F Wills
Journal:  Bioinformatics       Date:  2017-04-15       Impact factor: 6.937

9.  MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data.

Authors:  Greg Finak; Andrew McDavid; Masanao Yajima; Jingyuan Deng; Vivian Gersuk; Alex K Shalek; Chloe K Slichter; Hannah W Miller; M Juliana McElrath; Martin Prlic; Peter S Linsley; Raphael Gottardo
Journal:  Genome Biol       Date:  2015-12-10       Impact factor: 13.583

10.  SINCERA: A Pipeline for Single-Cell RNA-Seq Profiling Analysis.

Authors:  Minzhe Guo; Hui Wang; S Steven Potter; Jeffrey A Whitsett; Yan Xu
Journal:  PLoS Comput Biol       Date:  2015-11-24       Impact factor: 4.475
