Enis Afgan, Brad Chapman, James Taylor.
Abstract
BACKGROUND: Cloud computing provides an infrastructure that facilitates large-scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way.
Year: 2012 PMID: 23181507 PMCID: PMC3556322 DOI: 10.1186/1471-2105-13-315
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
An overview of pertinent projects related to cloud infrastructure management
| (Sun/Oracle/Open) Grid Engine | Enables a set of computers to be composed into a compute cluster, allowing users to submit compute jobs via a unified interface. CloudMan currently uses the most recent open source version of Sun Grid Engine. |
| Condor | Enables high throughput computing on distributed compute resources, providing features such as job dispatching and monitoring. We are actively looking to integrate Condor into CloudMan. |
| Rocks | Focuses on compute cluster configuration and management with no application-level integration (e.g., data sources, dependent tools, scaling) or deployment bundling and sharing. |
| OSCAR | Focuses on allowing users to install, administer, and program a dedicated compute cluster with a range of installed packages, with the default set of packages focusing on scientific computing. |
| Eucalyptus | These are in many ways similar cloud middleware projects enabling Infrastructure-as-a-Service (IaaS) management of a compute infrastructure or a datacenter. These projects provide building blocks (e.g., virtual machines, block storage, object storage, networking) for assembling higher-level application services, such as the one described in this paper. |
| OpenNebula | |
| OpenStack | |
| StarCluster | Enables general-purpose compute clusters to be easily deployed in the AWS environment using the command line. Although feature-rich, StarCluster currently operates only in the AWS context and provides no notion of deployment sharing, integration with specific applications and data sources, or a graphical management interface. |
| DeltaCloud API | These provide a uniform API allowing a standardized way of programmatically communicating with a range of clouds without needing to differentiate between them. We have explored, and continue to explore, options for using such libraries internally. |
| Libcloud API | |
| CloudInitD | Enables a contextualization hook to be made available in a given cloud instance, allowing one to customize that particular instance at runtime by providing explicit system-level instructions to be executed at boot time. Such functionality is an integral part of any IaaS cloud middleware. |
| Puppet | These projects fall under the category of resource configuration management, allowing one to provide a detailed recipe that is retrieved by a given machine from a predetermined server at boot time. The recipe is then executed, configuring the machine as specified; the listed tools ensure the recipe is properly executed and the machine configured as instructed. However, once a machine is configured, these solutions do not focus on subsequent user-level application management of the infrastructure in a cohesive manner (i.e., as a compute cluster, a cooperative deployment, workload balancing, data persistence) and thus represent a lower level of interaction with the infrastructure than what is described in our manuscript. |
| Chef | |
| LCFG | |
| CFEngine | |
| Bcfg2 | |
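As a concrete illustration of the boot-time contextualization hook described above, a cloud-init-style user-data file can carry the system-level instructions an instance executes at first boot. This is a hypothetical sketch of the general mechanism, not CloudMan's or CloudInitD's actual user-data format:

```yaml
#cloud-config
# Hypothetical user data passed to an instance at launch time. The IaaS
# middleware exposes it via the instance metadata service, and the
# contextualization hook (e.g., cloud-init) executes it at boot.

# Install packages needed by the deployment.
packages:
  - nginx

# Run arbitrary system-level commands once, at first boot.
runcmd:
  - echo "instance contextualized at boot" >> /var/log/contextualization.log
```

Because the instructions travel with the launch request rather than being baked into the machine image, the same image can be contextualized into different roles at runtime.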
Figure 1. A conceptual representation of CloudMan’s architectural components that facilitate customization and sharing of instances. Each instance is self-contained, keeping track of the configuration components that make up the deployment. As a result, it is possible to create custom versions of the default set of tools or indices. For example, instance A uses the default configuration (EBS snapshots colored in blue) while instance B has been customized (colored in yellow and orange). Once customizations are created, they can be shared with specific users (denoted with ‘S’) or made public (denoted with ‘P’). Instances are shared as point-in-time data and configuration. In the shown example, instance B has been shared at two time points. Any derived instance (instance C) will use the shared instance’s configuration settings upon startup. Each instance has its own user data; derived instances use the shared instance’s data and build on top of it.
Figure 2. Web interface used to share an instance and later create a derived instance. (A) A user can choose to make an instance public or share it with a specific user or group of users. (B) Once shared, the time-point share is assigned a share ID string. This string is given to other users, who simply (C) provide the share ID string at the time of new cluster instantiation to create a derived cluster.