Prerana Wagle1, Miloš Nikolić2,3, Peter Frommolt4.
Abstract
BACKGROUND: Next-Generation Sequencing (NGS) has emerged as a widely used tool in molecular biology. While time and cost for the sequencing itself are decreasing, the analysis of the massive amounts of data remains challenging. Since multiple algorithmic approaches for the basic data analysis have been developed, there is now an increasing need to efficiently use these tools to obtain results in reasonable time.
Year: 2015 PMID: 26126663 PMCID: PMC4486389 DOI: 10.1186/s12864-015-1695-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 a The QuickNGS database contains meta information on samples (species, application, file locations, sample labels, lab name, library type, batch information) and sample groups (sets of samples that are to be compared). Both can be organized efficiently through an intuitive web interface. New samples and sample groups can be inserted (Additional file 4) using the "+" button. b The status page of the QuickNGS database shows time, user information and the current status of each QuickNGS module on a clearly arranged website, enabling password-protected checks of the current status from any working location, including mobile access
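The metadata fields named in the Fig. 1 caption can be pictured as two small record types. The following is only an illustrative sketch, not the actual QuickNGS schema: the class names `Sample` and `SampleGroup`, the field names, the file paths, the lab name and the batch labels are all assumptions made for this example; only the list of fields and the sample labels come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """One sequenced sample, with the metadata fields listed in the caption."""
    label: str
    species: str
    application: str    # e.g. "RNA-Seq", "miRNA-Seq", "ChIP-Seq", "WGS"
    file_location: str  # hypothetical path, for illustration only
    lab_name: str
    library_type: str
    batch: str


@dataclass
class SampleGroup:
    """A named set of samples that is compared against another group."""
    name: str
    samples: List[Sample] = field(default_factory=list)

    def labels(self) -> List[str]:
        return [s.label for s in self.samples]


def make_sample(label: str, batch: str) -> Sample:
    # Species, application and library type follow the test data set
    # described in the text; path and lab name are placeholders.
    return Sample(
        label=label,
        species="Drosophila",
        application="RNA-Seq",
        file_location=f"/hypothetical/path/{label}.fastq.gz",
        lab_name="example-lab",
        library_type="unstranded",
        batch=batch,
    )


# Two groups of replicates that could be compared with each other.
control = SampleGroup("yw", [make_sample("yw_1", "A"), make_sample("yw_2", "B")])
mutant = SampleGroup("foxo", [make_sample("foxo_1", "B"), make_sample("foxo_2", "B")])
```

Keeping the batch as an explicit field on each sample makes it straightforward to check later whether a comparison mixes samples from different batches.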
Fig. 2 Results report for the test run on 10 Drosophila RNA-Seq samples. After login, the interface provides links to the main database export files, QC statistics as well as visualisation plots and the user’s personal password-protected UCSC genome browser track hub
Algorithms and software tools used by QuickNGS, version 1.1.0. The selection may be modified, updated or extended in future releases of QuickNGS; an up-to-date version of this table will be kept available online at the QuickNGS website
| Application | Tool | Version | Reference |
|---|---|---|---|
| RNA-Seq | FastQC | 0.10.1 | |
| | Tophat2 | 2.0.10 | [ |
| | Cufflinks2 | 2.1.1 | [ |
| | DESeq2 | 1.4.5 | [ |
| | DEXSeq | 1.10.5 | [ |
| | UCSC Genome Browser | | [ |
| miRNA-Seq | FastQC | 0.10.1 | |
| | miRDeep2 | 0.0.5 | [ |
| | DESeq2 | 1.4.5 | [ |
| | UCSC Genome Browser | | [ |
| ChIP-Seq | FastQC | 0.10.1 | |
| | BWA | 0.7.7 | [ |
| | MACS2 | 2.0.10 | [ |
| | MEME-ChIP | 4.10.0 | [ |
| | UCSC Genome Browser | | [ |
| WGS | FastQC | 0.10.1 | |
| | BWA | 0.7.7 | [ |
| | Samtools | 0.1.19 | [ |
| | Delly | 2.0.1 | [ |
| | SnpEff | 3.4 | [ |
| | UCSC Genome Browser | | [ |
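The tool and version listing above lends itself to a machine-readable form, for instance to stamp each result with the software versions that produced it. The sketch below is an assumption-laden illustration and not part of QuickNGS: the names `PIPELINES` and `provenance` are hypothetical, and only the tool names and version numbers are taken from the table.

```python
# Tools per application, in the order listed in the table above.
# A version of None means the table gives no version number.
PIPELINES = {
    "RNA-Seq": [
        ("FastQC", "0.10.1"), ("Tophat2", "2.0.10"), ("Cufflinks2", "2.1.1"),
        ("DESeq2", "1.4.5"), ("DEXSeq", "1.10.5"), ("UCSC Genome Browser", None),
    ],
    "miRNA-Seq": [
        ("FastQC", "0.10.1"), ("miRDeep2", "0.0.5"), ("DESeq2", "1.4.5"),
        ("UCSC Genome Browser", None),
    ],
    "ChIP-Seq": [
        ("FastQC", "0.10.1"), ("BWA", "0.7.7"), ("MACS2", "2.0.10"),
        ("MEME-ChIP", "4.10.0"), ("UCSC Genome Browser", None),
    ],
    "WGS": [
        ("FastQC", "0.10.1"), ("BWA", "0.7.7"), ("Samtools", "0.1.19"),
        ("Delly", "2.0.1"), ("SnpEff", "3.4"), ("UCSC Genome Browser", None),
    ],
}


def provenance(application: str) -> str:
    """Return a one-line summary of tools and versions for an application."""
    parts = []
    for tool, version in PIPELINES[application]:
        parts.append(f"{tool} {version}" if version else tool)
    return "; ".join(parts)


print(provenance("ChIP-Seq"))
```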
Read statistics for the test data from [18]. Read counts are given in multiples of 10^6 (1 M = 1 million reads). Duplicate removal was not performed because this was a single-read analysis. Two samples (yw_1 and atf376_1) were treated with ribo-zero, whereas the remaining samples show a significant degree of contamination with ribosomal RNA. For all samples, about half of the reads map to the original strand because all data originate from unstranded libraries
| Label | # Reads | # Aligned | MapQ ≥ 30 | Stranded | miRNA | rRNA | Other ncRNA |
|---|---|---|---|---|---|---|---|
| yw_1 | 25.9 M | 17.4 M | 16.2 M | 50.2 % | 0.0 M | 0.3 M | 0.1 M |
| yw_2 | 37.4 M | 34.2 M | 32.5 M | 50.5 % | 0.0 M | 5.3 M | 0.2 M |
| atf376_1 | 26.9 M | 20.8 M | 19.3 M | 50.3 % | 0.0 M | 0.1 M | 0.1 M |
| atf376_2 | 36.3 M | 32.5 M | 31.3 M | 50.9 % | 0.0 M | 3.4 M | 0.2 M |
| foxo_1 | 37.6 M | 34.1 M | 32.6 M | 50.0 % | 0.0 M | 2.9 M | 0.3 M |
| foxo_2 | 39.3 M | 33.0 M | 31.7 M | 50.2 % | 0.0 M | 3.0 M | 0.3 M |
| rel_1 | 37.8 M | 34.6 M | 32.9 M | 50.0 % | 0.0 M | 2.9 M | 0.3 M |
| rel_2 | 38.1 M | 34.6 M | 33.0 M | 50.1 % | 0.0 M | 3.0 M | 0.3 M |
| atf3a_1 | 38.4 M | 34.7 M | 33.2 M | 50.3 % | 0.0 M | 3.8 M | 0.3 M |
| atf3a_2 | 38.5 M | 35.2 M | 33.7 M | 50.3 % | 0.0 M | 3.9 M | 0.3 M |
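The contrast between the ribo-zero-treated samples and the rest becomes clearer when the counts above are turned into percentages. The following sketch does this arithmetic on the values copied from the table. It assumes that the rRNA column counts aligned reads, so the rRNA share is computed relative to the "# Aligned" column; the source does not state this explicitly, and the names `READS` and `summarize` are made up for the example.

```python
# (total reads, aligned reads, rRNA reads), all in millions, from the table above.
READS = {
    "yw_1": (25.9, 17.4, 0.3),
    "yw_2": (37.4, 34.2, 5.3),
    "atf376_1": (26.9, 20.8, 0.1),
    "atf376_2": (36.3, 32.5, 3.4),
    "foxo_1": (37.6, 34.1, 2.9),
    "foxo_2": (39.3, 33.0, 3.0),
    "rel_1": (37.8, 34.6, 2.9),
    "rel_2": (38.1, 34.6, 3.0),
    "atf3a_1": (38.4, 34.7, 3.8),
    "atf3a_2": (38.5, 35.2, 3.9),
}


def summarize(total: float, aligned: float, rrna: float) -> dict:
    """Alignment rate and rRNA share (assumed relative to aligned reads)."""
    return {
        "aligned_pct": round(100 * aligned / total, 1),
        "rrna_pct": round(100 * rrna / aligned, 1),
    }


SUMMARY = {label: summarize(*counts) for label, counts in READS.items()}

for label, stats in SUMMARY.items():
    print(f"{label:10s} aligned {stats['aligned_pct']:5.1f} %  rRNA {stats['rrna_pct']:5.1f} %")
```

Under this assumption, the two ribo-zero samples stay below 2 % rRNA, while every other sample lies above 8 %. Because the table only reports counts to one decimal place in millions, the small percentages are coarse.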
Fig. 3 a Heatmap of the 10 RNA-Seq test data sets: the replicates of each genotype do not cluster together perfectly in distinct subclusters. This is likely to be caused by a combined effect of ribosomal contamination and batch effects. b The principal component analysis confirms that the two samples (depicted in red) which were processed in a separate batch and with ribo-zero treatment cluster distantly from the remaining samples
Comparison of the technical features of QuickNGS to those of other NGS analysis workflow systems
| Feature | QuickNGS | Galaxy | GenePattern | Chipster |
|---|---|---|---|---|
| Setup | Compute cluster plus DB and web server | Client/server system | Client/server system | Client/server system |
| Applications | RNA-Seq, miRNA-Seq, ChIP-Seq, Whole-Genome | Universal framework | RNA-Seq | RNA-Seq, miRNA-Seq, ChIP-Seq, Whole-Genome |
| Database | Metadata and results | None | None | None |
| Workflow automation | Full | Started in web interface | Started in web interface | Started in client software |
| Reproducibility/Documentation | Results kept in DB; version tracking; log files | Workflow files | Workflow files | Workflow files |
| Workflow flexibility | Requires shell programming | Can be changed in web interface | Can be changed in web interface | Can be edited in client software |
| Purpose of user interface | End-user access to the analysis results | Data import and start of workflows | Data import and start of workflows | Data import and start of workflows |