Paul Hodor, Amandeep Chawla, Andrew Clark, Lauren Neal.
Abstract
One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern.
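As an illustration of the MapReduce programming pattern mentioned in the abstract, the following is a minimal single-process sketch in Python that counts k-mers in a few sequencing reads. It is not one of the sample applications shipped with cl-dash and does not use Hadoop; the function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) and the example reads are hypothetical, chosen only to show the map, shuffle, and reduce stages.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every window of length k in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key.

    On a real cluster the framework does this between map and reduce;
    here it is simulated in memory.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum all values emitted for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run map, shuffle, and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical example input: two short reads.
reads = ["ACGTAC", "GTACGT"]
counts = count_kmers(reads, 3)
print(counts)
```

Because each read is mapped independently and each k-mer is reduced independently, both stages could in principle be distributed across the nodes of a cluster, which is the property that frameworks such as Hadoop exploit.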
Year: 2015 PMID: 26428290 PMCID: PMC4708102 DOI: 10.1093/bioinformatics/btv553
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Architecture of a Hadoop cluster managed by cl-dash