Genotyping in the cloud with Crossbow.

James Gurtowski, Michael C Schatz, Ben Langmead.

Abstract

Crossbow is a scalable, portable, and automatic cloud computing tool for identifying SNPs from high-coverage, short-read resequencing data. It is built on Apache Hadoop, an implementation of the MapReduce software framework. Hadoop allows Crossbow to distribute read alignment and SNP calling subtasks over a cluster of commodity computers. Two robust tools, Bowtie and SOAPsnp, implement the fundamental alignment and variant calling operations respectively, and have demonstrated capabilities within Crossbow of analyzing approximately one billion short reads per hour on a commodity Hadoop cluster with 320 cores. Through protocol examples, this unit will demonstrate the use of Crossbow for identifying variations in three different operating modes: on a Hadoop cluster, on a single computer, and on the Amazon Elastic MapReduce cloud computing service.
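
The division of work described in the abstract (alignment as a per-read map step, SNP calling as a per-position reduce step) can be illustrated with a minimal single-process Python sketch. This is an illustration only and not Crossbow's code: the toy `map_read` aligner (best Hamming-distance offset) merely stands in for Bowtie, the majority-vote `reduce_position` merely stands in for SOAPsnp, and every function name, threshold, and sequence below is a hypothetical choice made for the example.

```python
from collections import Counter, defaultdict


def map_read(read, reference):
    """Map step: place one read at the offset with the fewest mismatches.

    Toy stand-in for a real aligner; emits (position, base) pairs.
    """
    offsets = range(len(reference) - len(read) + 1)
    best = min(
        offsets,
        key=lambda o: sum(a != b for a, b in zip(read, reference[o:o + len(read)])),
    )
    return [(best + i, base) for i, base in enumerate(read)]


def reduce_position(pos, bases, reference, min_depth=2, min_fraction=0.8):
    """Reduce step: majority vote over all bases observed at one position.

    Toy stand-in for a real SNP caller; thresholds are arbitrary assumptions.
    """
    depth = len(bases)
    if depth < min_depth:
        return None
    base, count = Counter(bases).most_common(1)[0]
    if base != reference[pos] and count / depth >= min_fraction:
        return (pos, reference[pos], base)
    return None


def call_snps(reads, reference):
    """Run map, shuffle (group by position), and reduce in one process."""
    grouped = defaultdict(list)
    for read in reads:  # on a cluster, reads would be split across map tasks
        for pos, base in map_read(read, reference):
            grouped[pos].append(base)  # shuffle: collect bases by position
    calls = (reduce_position(p, grouped[p], reference) for p in sorted(grouped))
    return [c for c in calls if c is not None]


# Hypothetical 12-base reference and three overlapping reads from a sample
# that carries a G-to-A substitution at 0-based position 5.
REFERENCE = "ACGTTGCAAGCT"
READS = ["ACGTTACA", "GTTACAAG", "TACAAGCT"]
SNPS = call_snps(READS, REFERENCE)
```

Because the map step looks at one read at a time and the reduce step at one position at a time, both can run as independent tasks, which is the property a framework such as Hadoop relies on to spread the work over many machines.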

Year: 2012 PMID: 22948728 PMCID: PMC3465669 DOI: 10.1002/0471250953.bi1503s39

Source DB: PubMed Journal: Curr Protoc Bioinformatics ISSN: 1934-3396

8 in total

1. SNP detection for massively parallel whole-genome resequencing.

Authors: Ruiqiang Li; Yingrui Li; Xiaodong Fang; Huanming Yang; Jian Wang; Karsten Kristiansen; Jun Wang
Journal: Genome Res Date: 2009-05-06 Impact factor: 9.043

2. Use of high throughput sequencing to observe genome dynamics at a single cell level.

Authors: D Parkhomchuk; V Amstislavskiy; A Soldatov; V Ogryzko
Journal: Proc Natl Acad Sci U S A Date: 2009-11-23 Impact factor: 11.205

3. Human genome 10th anniversary. Will computers crash genomics?

Authors: Elizabeth Pennisi
Journal: Science Date: 2011-02-11 Impact factor: 47.728

4. Cloud computing and the DNA data race.

Authors: Michael C Schatz; Ben Langmead; Steven L Salzberg
Journal: Nat Biotechnol Date: 2010-07 Impact factor: 54.908

5. Cloud-scale RNA-sequencing differential expression analysis with Myrna.

Authors: Ben Langmead; Kasper D Hansen; Jeffrey T Leek
Journal: Genome Biol Date: 2010-08-11 Impact factor: 13.583

6. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.

Authors: Ben Langmead; Cole Trapnell; Mihai Pop; Steven L Salzberg
Journal: Genome Biol Date: 2009-03-04 Impact factor: 13.583

7. Deep short-read sequencing of chromosome 17 from the mouse strains A/J and CAST/Ei identifies significant germline variation and candidate genes that regulate liver triglyceride levels.

Authors: Ian Sudbery; Jim Stalker; Jared T Simpson; Thomas Keane; Alistair G Rust; Matthew E Hurles; Klaudia Walter; Dee Lynch; Lydia Teboul; Steve D Brown; Heng Li; Zemin Ning; Joseph H Nadeau; Colleen M Croniger; Richard Durbin; David J Adams
Journal: Genome Biol Date: 2009-10-13 Impact factor: 13.583

8. Searching for SNPs with cloud computing.

Authors: Ben Langmead; Michael C Schatz; Jimmy Lin; Mihai Pop; Steven L Salzberg
Journal: Genome Biol Date: 2009-11-20 Impact factor: 13.583

9 in total

1. Survey of gene splicing algorithms based on reads.

Authors: Xiuhua Si; Qian Wang; Lei Zhang; Ruo Wu; Jiquan Ma
Journal: Bioengineered Date: 2017-09-21 Impact factor: 3.269

Review 2. Applications of the MapReduce programming framework to clinical big data analysis: current landscape and future trends.

Authors: Emad A Mohammed; Behrouz H Far; Christopher Naugler
Journal: BioData Min Date: 2014-10-29 Impact factor: 2.522

Review 3. Next generation distributed computing for cancer research.

4. A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data.

5. Closha: bioinformatics workflow system for the analysis of massive sequencing data.

6. BiSpark: a Spark-based highly scalable aligner for bisulfite sequencing data.

7. Developing eThread pipeline using SAGA-pilot abstraction for large-scale structural bioinformatics.

Review 8. Big Data Application in Biomedical Research and Health Care: A Literature Review.

9. SeqVItA: Sequence Variant Identification and Annotation Platform for Next Generation Sequencing Data.