Andrian Yang, Michael Troup, Peijie Lin, Joshua W K Ho.
Abstract
Summary: Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to process scRNA-seq data efficiently because of their limited scalability. Here we introduce Falco, a cloud-based framework that enables parallelization of existing RNA-seq processing pipelines using the big data technologies Apache Hadoop and Apache Spark, for massively parallel analysis of large-scale transcriptomic data. Using two public scRNA-seq datasets and two popular RNA-seq alignment/feature quantification pipelines, we show that the same processing pipeline runs 2.6 to 145.4 times faster using Falco than running on a highly optimized standalone computer. Falco also allows users to utilize low-cost spot instances of Amazon Web Services, providing a ∼65% reduction in cost of analysis. Availability and Implementation: Falco is available via a GNU General Public License at https://github.com/VCCRI/Falco/. Contact: j.ho@victorchang.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.
Year: 2017 PMID: 28025200 DOI: 10.1093/bioinformatics/btw732
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937