Konrad J Karczewski, Guy Haskin Fernald, Alicia R Martin, Michael Snyder, Nicholas P Tatonetti, Joel T Dudley.
Abstract
The increasing public availability of personal complete genome sequencing data has ushered in an era of democratized genomics. However, read mapping and variant calling software is constantly improving, and individuals with personal genomic data may prefer to customize and update their variant calls. Here, we describe STORMSeq (Scalable Tools for Open-Source Read Mapping), a graphical interface cloud computing solution that does not require a parallel computing environment or extensive technical experience. This customizable and modular system performs read mapping, read cleaning, and variant calling and annotation. At present, STORMSeq costs approximately $2 and takes 5-10 hours to process a full exome sequence, and costs approximately $30 and takes 3-8 days to process a whole genome sequence. We provide this open-access and open-source resource as a user-friendly interface in Amazon EC2.
Year: 2014 PMID: 24454756 PMCID: PMC3893165 DOI: 10.1371/journal.pone.0084860
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Overview of the STORMSeq system.
The user uploads short reads to Amazon S3 and starts a webserver on Amazon EC2, which controls the mapping and variant calling pipeline. Progress can be monitored on the webserver and results are uploaded to persistent storage on Amazon S3.
Figure 2. Sample output.
STORMSeq provides basic visualization for summary statistics, such as (A) genome-wide SNP density and (B) size distribution of short indels.
Approximate costs for STORMSeq.
| Analysis Type | Exome | Exome | Genome | Genome |
| Pipeline | SNAP | BWA | SNAP | BWA |
| Cost (Spot) | $2.26 | $1.90 | $26.42 | $32.76 |
| Cost (On-demand) | $19.68 | $8.16 | $254.20 | $129.12 |
| Time | 5 h | 10 h | 176 h | 98 h |
Note that these costs are approximate and may depend on a number of factors related to the input files.
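To make the table easier to compare against the abstract's summary figures, the following minimal sketch computes the on-demand-to-spot cost ratio and converts run time from hours to days. The numbers are copied from the table above; the names `COSTS` and `summarize` are illustrative and are not part of STORMSeq.

```python
# Values copied from the cost table above (USD and hours).
# COSTS and summarize() are hypothetical names used only for this sketch.
COSTS = {
    ("exome", "SNAP"): {"spot": 2.26, "on_demand": 19.68, "hours": 5},
    ("exome", "BWA"): {"spot": 1.90, "on_demand": 8.16, "hours": 10},
    ("genome", "SNAP"): {"spot": 26.42, "on_demand": 254.20, "hours": 176},
    ("genome", "BWA"): {"spot": 32.76, "on_demand": 129.12, "hours": 98},
}


def summarize(costs):
    """Return (analysis, pipeline, on-demand/spot ratio, run time in days)."""
    rows = []
    for (analysis, pipeline), v in costs.items():
        ratio = v["on_demand"] / v["spot"]  # how much more on-demand costs
        days = v["hours"] / 24              # convert hours to days
        rows.append((analysis, pipeline, round(ratio, 1), round(days, 1)))
    return rows


if __name__ == "__main__":
    for analysis, pipeline, ratio, days in summarize(COSTS):
        print(f"{analysis:6s} {pipeline:4s} "
              f"on-demand/spot = {ratio:4.1f}x, run time = {days:.1f} days")
```

Under these figures, on-demand pricing is roughly 4 to 10 times the spot price, and the whole genome run times of 98 h and 176 h correspond to about 4 and 7 days, in line with the 3-8 day range quoted in the abstract.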