Ian Stokes-Rees, Ian Levesque, Frank V Murphy, Wei Yang, Ashley Deacon, Piotr Sliz.
Abstract
Early-stage experimental data in structural biology is generally unmaintained and inaccessible to the public. It is increasingly believed that this data, which forms the basis for each macromolecular structure discovered by this field, must be archived and, in due course, published. Furthermore, the widespread use of shared scientific facilities such as synchrotron beamlines complicates the issue of data storage, access and movement, as does the growing number of remote users. This work describes a prototype system that adapts existing federated cyberinfrastructure technology and techniques to significantly improve the operational environment for users and administrators of synchrotron data collection facilities used in structural biology. This is achieved through software from the Virtual Data Toolkit and Globus, bringing together federated users and facilities from the Stanford Synchrotron Radiation Lightsource, the Advanced Photon Source, the Open Science Grid, the SBGrid Consortium and Harvard Medical School. The performance and experience with the prototype provide a model for data management at shared scientific facilities.
Year: 2012; PMID: 22514186; PMCID: PMC3329960; DOI: 10.1107/S0909049512009776
Source DB: PubMed; Journal: J Synchrotron Radiat; ISSN: 0909-0495; Impact factor: 2.616
Transfer rates between participating sites
| Transfer rate (MB s⁻¹) | To HMS | To SSRL | To NE-CAT | To NCSA | To FNAL |
|---|---|---|---|---|---|
| From HMS | | 47 | 29 | 53 | 45 |
| From SSRL | 23 | | N/A | 26 | 22 |
| From NE-CAT | 29 | N/A | | N/A | N/A |
| From NCSA | 19 | 20 | N/A | | N/A |
| From FNAL | 36 | 39 | N/A | N/A | |
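
As a rough illustration of what these rates mean in practice, the sketch below estimates how long a dataset would take to reach HMS from each of the other sites. Only the rates come from the table; the 10 GB dataset size, the function name and the decimal convention (1 GB = 1000 MB) are assumptions made for illustration, and sustained rates will vary with file sizes and network conditions.

```python
# Measured transfer rates into HMS, in MB/s, taken from the table above.
RATES_TO_HMS_MB_S = {"SSRL": 23, "NE-CAT": 29, "NCSA": 19, "FNAL": 36}


def transfer_seconds(size_mb: float, rate_mb_s: float) -> float:
    """Estimate transfer time in seconds, assuming a constant sustained rate."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_s


if __name__ == "__main__":
    dataset_mb = 10_000  # hypothetical 10 GB dataset (assumption, not from the source)
    for site, rate in RATES_TO_HMS_MB_S.items():
        minutes = transfer_seconds(dataset_mb, rate) / 60
        print(f"{site:>6} -> HMS: {minutes:.1f} min")
```

At these rates a dataset of that size would arrive in minutes rather than hours, but the estimate ignores per-file overhead and contention on shared links.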
Figure 1. Geographic distribution of five endpoints participating in the trial of the prototype system. The flags represent, from left to right, Stanford Synchrotron Radiation Lightsource (blue, Stanford Linear Accelerator Center, Palo Alto, CA, USA), Fermi National Accelerator Laboratory (white, Batavia, IL, USA), Northeast Collaborative Access Team (green, Advanced Photon Source, Argonne, IL, USA), National Center for Supercomputing Applications (red, University of Illinois Urbana-Champaign, Champaign, IL, USA) and Harvard Medical School (yellow, Boston, MA, USA).
Figure 2. Example usage scenario. User ‘Scott’ from the Sliz Laboratory at Harvard collects data at the NE-CAT beamline. The data is stored temporarily on a tier 1 staging server at sliz@necat.aps.argonne.gov:/stage/sliz before being archived to tier 2 storage at sliz@necat.aps.argonne.gov:/data/sliz. Scott can use Globus Online to access his data at NE-CAT and transfer it to his personal space on the laboratory file server, scott@sliz.harvard.edu:/home/scott, or to his own laptop. Other laboratory members can access his data at NE-CAT because all Sliz Laboratory members are mapped to the same system identity. The general public can access archived data from 2009 in the tier 3 archival storage through a web interface. More recent archived data from 2010 and 2011 is embargoed and not available to the public.
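
The access rules in this scenario can be sketched as a small policy check. This is a minimal illustration under stated assumptions, not the prototype's actual implementation: the `IDENTITY_MAP` table, the `Dataset` type, the `can_read` function and the user "alice" are all hypothetical; only the user "scott", the shared "sliz" identity and the embargo on 2010 and 2011 data come from the caption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping of individual users to a shared laboratory identity.
# "scott" and "sliz" appear in the caption; "alice" is invented for illustration.
IDENTITY_MAP = {"scott": "sliz", "alice": "sliz"}


@dataclass(frozen=True)
class Dataset:
    owner_identity: str  # system identity that owns the data at the facility
    year: int            # year the data was collected
    embargoed: bool      # True while the data is withheld from the public


def can_read(user: Optional[str], dataset: Dataset) -> bool:
    """Return True if `user` may read `dataset`; None means an anonymous member of the public."""
    # Laboratory members share one identity, so any of them can read the lab's data.
    if user is not None and IDENTITY_MAP.get(user) == dataset.owner_identity:
        return True
    # Everyone else sees only data whose embargo has been lifted.
    return not dataset.embargoed


if __name__ == "__main__":
    recent = Dataset(owner_identity="sliz", year=2011, embargoed=True)
    released = Dataset(owner_identity="sliz", year=2009, embargoed=False)
    print(can_read("alice", recent))   # lab member reading embargoed lab data
    print(can_read(None, recent))      # public reading embargoed data
    print(can_read(None, released))    # public reading released data
```

Mapping every laboratory member to one identity keeps the facility-side account list short, at the cost of not distinguishing individual members in the facility's own access records.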