Thomas R Connor1, Nicholas J Loman2, Simon Thompson3, Andy Smith2, Joel Southgate1, Radoslaw Poplawski2,3, Matthew J Bull1, Emily Richardson2, Matthew Ismail4, Simon Elwood-Thompson5, Christine Kitchen6, Martyn Guest6, Marius Bakke7, Samuel K Sheppard8, Mark J Pallen7.
Abstract
The increasing availability and decreasing cost of high-throughput sequencing have transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they lack access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data.
Keywords: bioinformatics; cloud computing; infrastructure; metagenomics; population genomics; virtual laboratory
Year: 2016; PMID: 28785418; PMCID: PMC5537631; DOI: 10.1099/mgen.0.000086
Source DB: PubMed; Journal: Microb Genom; ISSN: 2057-5858
Fig. 1. Overview of the system. (a) The sites where the computational hardware is based. (b) High-level overview of the system and how the different software components connect to one another. (c) Compute hardware present at each of the four sites. (d) Hardware comprising the Ceph storage system at each site. (e) Type and role of network hardware used at each site.
Fig. 2. Relative performance of virtual machines running on cloud services, compared to the Cardiff University HPC system, Raven. (a) The value for each package is the mean wall time of 10 runs performed on Raven, divided by the mean wall time of 40 runs undertaken on the virtual machine on the named service. Values greater than 1 indicate that the virtual machine was faster than Raven; values less than 1 indicate that it was slower. (b) The raw wall time values for the named software on each of the systems. The data generated as part of the benchmarking exercise are included in Supplementary File 1.
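The ratio described in Fig. 2(a) can be illustrated with a minimal sketch. The function name and the wall-time values below are hypothetical placeholders and are not taken from Supplementary File 1; only the definition (mean Raven wall time divided by mean virtual machine wall time) follows the caption.

```python
from statistics import mean


def relative_performance(raven_wall_times, vm_wall_times):
    """Return mean Raven wall time divided by mean VM wall time.

    A result above 1 means the virtual machine finished faster than
    Raven on average; a result below 1 means it was slower.
    """
    if not raven_wall_times or not vm_wall_times:
        raise ValueError("both lists of wall times must be non-empty")
    return mean(raven_wall_times) / mean(vm_wall_times)


# Hypothetical wall times in seconds (illustrative only, not real data):
# 10 runs on Raven and 40 runs on a virtual machine, as in the caption.
raven_runs = [100.0] * 10
vm_runs = [80.0] * 40

ratio = relative_performance(raven_runs, vm_runs)
```

With these placeholder values the virtual machine has the shorter mean wall time, so the ratio is above 1, which the caption would read as faster than Raven.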
Fig. 3. CLIMB virtual machine launch workflow. On logging in to the Bryn launcher interface, a user is presented with a list of the virtual machines they are running and can stop, reboot or terminate them (a). Users launch a new Genomics Virtual Laboratory (GVL) server through a minimal interface, specifying a name, the server ‘flavour’ (user or group) and an access password (b). Once the server has booted, the user accesses a webserver running on the GVL instance, which gives access to various services that are started automatically (c). The GVL provides access to Cloudman, a Galaxy server, an administration interface, Jupyter notebook and RStudio (d, top to bottom).