Benjamin K Johnson, Matthew B Scholz, Tracy K Teal, Robert B Abramovitch.
Abstract
BACKGROUND: Many tools exist for the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. Despite the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use, bacterially oriented workflow applications that combine multiple tools and automate the process are lacking. With many tools to choose from for each step, identifying a specific tool, adapting its input/output options to the specific use case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience.
Year: 2016 PMID: 26847232 PMCID: PMC4743240 DOI: 10.1186/s12859-016-0923-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 SPARTA workflow diagram. Single-end Illumina FASTQ files, a FASTA formatted reference genome, and a genome feature file (gff or gtf) are given as inputs to the workflow. Trimmomatic trims adapters and low quality bases/reads, and FastQC produces quality assessment reports. Bowtie maps the trimmed reads to the reference. HTSeq quantifies transcript abundance. R/edgeR tests for statistically significant genes and warns the user of potential batch effects present in the analyzed data set.
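The order of steps in Fig. 1 can be sketched as a list of command lines for one single-end sample. This is a minimal illustration, not SPARTA's own code: the file names, the working directory, the `trimmomatic.jar` path, and the function name `build_commands` are hypothetical, and the options shown are common invocations of each tool rather than the parameters SPARTA actually uses. Adapter clipping (Trimmomatic's `ILLUMINACLIP` step, which needs an adapter FASTA file) and the R/edgeR stage are left out of the sketch.

```python
from pathlib import Path


def build_commands(fastq, reference_fasta, annotation,
                   workdir="sparta_sketch", trimmomatic_jar="trimmomatic.jar"):
    """Return the ordered command lines for one single-end sample.

    All paths are placeholders; nothing is executed here.
    """
    work = Path(workdir)
    sample = Path(fastq).name.split(".")[0]
    trimmed = work / f"{sample}.trimmed.fastq"
    index = work / "reference_index"
    sam = work / f"{sample}.sam"

    return [
        # 1. Quality trimming with Trimmomatic in single-end (SE) mode.
        ["java", "-jar", trimmomatic_jar, "SE", "-phred33",
         str(fastq), str(trimmed), "SLIDINGWINDOW:4:15", "MINLEN:36"],
        # 2. Quality assessment report with FastQC.
        ["fastqc", "-o", str(work), str(trimmed)],
        # 3. Build a Bowtie index from the reference, then map reads (SAM output).
        ["bowtie-build", str(reference_fasta), str(index)],
        ["bowtie", "-S", str(index), str(trimmed), str(sam)],
        # 4. Count reads per feature with HTSeq (counts go to standard output).
        ["htseq-count", "-f", "sam", str(sam), str(annotation)],
    ]


if __name__ == "__main__":
    for cmd in build_commands("sample1.fastq", "reference.fasta", "genes.gtf"):
        print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run` once the tools are installed; keeping the commands as data makes the step order easy to inspect before anything runs.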
Fig. 2 Data analysis execution time comparison between SPARTA and Rockhopper2. The two programs were compared for execution time when processing one, two, or three experimental conditions as compared to a reference condition. Both SPARTA (1.0) and Rockhopper2 (2.03) were installed and tested on an off-the-shelf iMac (2.7 GHz i5, 8 GB memory, OSX 10.11.2). Dependencies: Java (1.6.0_65), Python (2.7.9), and R (3.2.2). Data are the mean of three software executions and error bars represent the standard deviation. Data files (100,000 reads/file) utilized were the example data bundled with SPARTA.
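The summary statistic in Fig. 2 (mean and standard deviation over three executions) can be reproduced with a small timing helper. This is a sketch under assumptions: `time_runs` and `summarise` are hypothetical names, the workload is a stand-in callable, and the sample standard deviation (n - 1 denominator) is used because the caption does not say which form the authors computed.

```python
import statistics
import time


def time_runs(task, repeats=3):
    """Run `task` `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def summarise(durations):
    """Return (mean, sample standard deviation) of the durations."""
    return statistics.mean(durations), statistics.stdev(durations)


if __name__ == "__main__":
    # Stand-in workload; a real comparison would launch the analysis program here.
    runs = time_runs(lambda: sum(range(100_000)))
    mean, sd = summarise(runs)
    print(f"mean = {mean:.4f} s, sd = {sd:.4f} s over {len(runs)} runs")
```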