Rick A Fasani, Carolina B Livi, Dipanwita R Choudhury, Andre Kleensang, Mounir Bouhifd, Salil N Pendse, Patrick D McMullen, Melvin E Andersen, Thomas Hartung, Michael Rosenberg.
Abstract
The Human Toxome Project is part of a long-term vision to modernize toxicity testing for the 21st century. In the initial phase of the project, a consortium of six academic, commercial, and government organizations has partnered to map pathways of toxicity, using endocrine disruption as a model hazard. Experimental data are generated at multiple sites and analyzed using a range of computational tools. While effectively gathering, managing, and analyzing the data for high-content experiments is a challenge in its own right, doing so for a growing number of -omics technologies, with larger data sets, across multiple institutions complicates the process. Interestingly, one of the most difficult, ongoing challenges has been the computational collaboration between the geographically separate institutions. Existing solutions cannot handle the growing heterogeneous data, provide a computational environment for consistent analysis, accommodate different workflows, or adapt to the constantly evolving methods and goals of a research project. To meet the needs of the project, we have created and managed The Human Toxome Collaboratorium, a shared computational environment hosted on third-party cloud services. The Collaboratorium provides a familiar virtual desktop, with a mix of commercial, open-source, and custom-built applications. It shares some of the challenges of traditional information technology, but with unique and unexpected constraints that emerge from the cloud. Here we describe the problems we faced, the current architecture of the solution, an example of its use, the major lessons we learned, and the future potential of the concept. In particular, the Collaboratorium represents a novel distribution method that could increase the reproducibility and reusability of results from similar large, multi-omic studies.
Keywords: big data; cloud computing; computational toxicology; systems toxicology; virtual machines; virtualization
Year: 2016 PMID: 26924983 PMCID: PMC4756169 DOI: 10.3389/fphar.2015.00322
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
Overlapping responsibilities within the Human Toxome Consortium.
| Results | Brown | JHU | Georgetown | CAAT | Hamner | Agilent |
|---|---|---|---|---|---|---|
| Cell culture | × | × | ||||
| RNA extract | × | × | × | |||
| DNA extract | × | |||||
| Metabolite extract | × | × | ||||
| qRT-PCR | × | × | × | |||
| GX microarray | × | |||||
| CGH microarray | × | |||||
| LC-MS | × | |||||
| Identified genes | × | × | × | |||
| Identified aberrations | × | × | ||||
| Identified metabolites | × | × | ||||
| Transcriptomic analysis | × | × | × | × | ||
| Genomic analysis | × | × | ||||
| Metabolomic analysis | × | × | × | |||
| Proteomic analysis | × | × | × | |||
| Integrated analysis | × | × | × | |||
List of Collaboratorium software.
| Application | Technology | Development |
|---|---|---|
| Agilent Pathway Architect 13.1.1 | Integrated Biology | Commercial |
| Agilent OpenLAB ELN 4.2.1.0 | Integrated Biology | Commercial |
| Hamner IDEA 1.0 | Integrated Biology | Internal |
| Agilent Feature Extraction 11.5.1.1 | Microarray | Commercial |
| Agilent QC Chart Tools 3.5.1.2 | Microarray | Commercial |
| Agilent CytoGenomics 2.9.2.4 | Microarray | Commercial |
| Agilent Genomic Workbench 7.0.4.0 | Microarray | Commercial |
| Agilent GeneSpring GX 13.1.1 | Microarray | Commercial |
| Agilent Mass Profiler Professional 13.1.1 | MS | Commercial |
| Agilent MassHunter Qualitative Analysis B.06.00 SP1 | MS | Commercial |
| Agilent MassHunter Quantitative Analysis B.06.00 SP1 | MS | Commercial |
| Agilent PCDL Manager B.04.00 SP1 | MS | Commercial |
| Agilent Pathways to PCDL B.05.00 | MS | Commercial |
| Agilent Molecular Structure Correlator B.05.00 | MS | Commercial |
| Agilent MassHunter Profinder B.06.00 SP1 | MS | Commercial |
| Strand NGS 2.1 | NGS | Commercial |
| Python 2.7.7 | Platform | Open Source |
| R 3.1.1 | Platform | Open Source |
| Oracle Java 7.0.650 | Platform | Open Source |
| Adobe Reader XI 11.0.09 | Utility | Commercial |
| Google Chrome 37 | Utility | Open Source |
| Libre Office 4.3.1.2 | Utility | Open Source |
| WinSCP 5.5.3 | Utility | Open Source |
| 7-Zip File Manager 9.20 | Utility | Open Source |
| Notepad++ 6.6.9 | Utility | Open Source |
| PuTTY 0.63 | Utility | Open Source |
Common estrogen responders.
| Gene Symbol | Regulation | FC (abs): Toxome | FC (abs): GSE36586 | FC (abs): GSE8597 | FC (abs): GSE24592 | FC (abs): GSE51403 | Description |
|---|---|---|---|---|---|---|---|
| ATP6V0A4 | Down | 1.54 | 3.38 | 1.22 | 19.15 | 3.72 | ATPase, H+ transporting, lysosomal V0 subunit a4 |
| ATP8A1 | Down | 1.43 | 3.02 | 1.79 | 22.1 | 2.4 | ATPase, aminophospholipid transporter (APLT), class I, type 8A, member 1 |
| CDC25A | Up | 1.97 | 2.03 | 2.93 | 2.67 | 2.73 | Cell division cycle 25A |
| CSTA | Down | 1.89 | 3.95 | 2.5 | 3.26 | 3.14 | Cystatin A (stefin A) |
| CTNNAL1 | Up | 1.38 | 1.71 | 1.72 | 1.6 | 2.16 | Catenin (cadherin-associated protein), alpha-like 1 |
| CTNND2 | Down | 1.42 | 1.75 | 1.79 | 4.88 | 1.73 | Catenin (cadherin-associated protein), delta 2 |
| CTSD | Up | 1.37 | 3.15 | 1.58 | 1.73 | 3.07 | Cathepsin D |
| CXCL12 | Up | 6.69 | 6.56 | 3.34 | 17.11 | 11.38 | Chemokine (C-X-C motif) ligand 12 |
| DSCC1 | Up | 1.89 | 2.22 | 4.07 | 2.49 | 2.89 | DNA replication and sister chromatid cohesion 1 |
| EGR3 | Up | 24.42 | 16.62 | 2.6 | 8.71 | 8.73 | Early growth response 3 |
| ELOVL2 | Up | 5.98 | 4.28 | 1.35 | 6.25 | 3.16 | ELOVL fatty acid elongase 2 |
| GREB1 | Up | 6.61 | 11.77 | 3.12 | 200.34 | 19.21 | Growth regulation by estrogen in breast cancer 1 |
| IGFBP3 | Down | 8.83 | 2.53 | 2.77 | 6.16 | 3.24 | Insulin-like growth factor binding protein 3 |
| IGSF1 | Up | 2.2 | 16.71 | 2.33 | 1.02 | 22.26 | Immunoglobulin superfamily, member 1 |
| MYB | Up | 4.98 | 3.4 | 2.35 | 2.19 | 3.61 | v-myb avian myeloblastosis viral oncogene homolog |
| OLFM1 | Up | 4.01 | 3.45 | 1.61 | 1.37 | 2.76 | Olfactomedin 1 |
| PGR | Up | 32.49 | 9.23 | 4.93 | 25.68 | 21.97 | Progesterone receptor |
| PMP22 | Down | 1.88 | 2.58 | 1.81 | 2.89 | 2.69 | Peripheral myelin protein 22 |
| PPIF | Up | 2.75 | 1.74 | 2.28 | 9.95 | 1.55 | Peptidylprolyl isomerase F |
| PRSS23 | Up | 3.43 | 5.86 | 1.57 | 14.76 | 5.53 | Protease, serine, 23 |
| RAB31 | Up | 9.07 | 2.84 | 2.84 | 7.79 | 3.18 | RAB31, member RAS oncogene family |
| SERPINA3 | Up | 7.11 | 4.4 | 1.89 | 1.88 | 1.79 | Serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3 |
| SGK1 | Up | 2.95 | 3.39 | 1.96 | 7.63 | 2.86 | Serum/glucocorticoid regulated kinase 1 |
| SLC35C1 | Down | 2.41 | 1.64 | 1.72 | 1.19 | 1.36 | Solute carrier family 35 (GDP-fucose transporter), member C1 |
| SLC7A5 | Up | 2.32 | 3.12 | 1.68 | 4.59 | 2.98 | Solute carrier family 7 (amino acid transporter light chain, L system), member 5 |
| TBC1D2 | Down | 1.51 | 1.99 | 1.29 | 3 | 2 | TBC1 domain family, member 2 |
| TMEM164 | Up | 3.83 | 3.43 | 2.07 | 1.59 | 2.29 | Transmembrane protein 164 |
| UPK1A | Down | 8.28 | 4.28 | 3.17 | 5.68 | 4.96 | Uroplakin 1A |
| XBP1 | Up | 4.2 | 1.79 | 2.35 | 1.41 | 1.98 | X-box binding protein 1 |
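
The table reports an absolute fold change together with a separate direction of regulation, which makes it awkward to compare up- and down-regulated genes on one scale. The sketch below shows one way to combine the two columns into a signed log2 fold change and to summarize each gene across the five data sets. It assumes that "FC (abs)" is a linear-scale ratio (the source does not state this explicitly), and the function names `signed_log2`, `geometric_mean`, and `summarize` are illustrative rather than part of any Collaboratorium tool. Only four rows are copied from the table to keep the example short.

```python
import math

# Column order of the five fold-change values in the table.
DATASETS = ["Toxome", "GSE36586", "GSE8597", "GSE24592", "GSE51403"]

# A few rows copied from the table: (gene symbol, regulation, FC (abs) values).
ROWS = [
    ("GREB1", "Up", [6.61, 11.77, 3.12, 200.34, 19.21]),
    ("PGR", "Up", [32.49, 9.23, 4.93, 25.68, 21.97]),
    ("IGFBP3", "Down", [8.83, 2.53, 2.77, 6.16, 3.24]),
    ("IGSF1", "Up", [2.2, 16.71, 2.33, 1.02, 22.26]),
]


def signed_log2(fc_abs, regulation):
    """Turn an absolute fold change plus a direction into a signed log2 value.

    Assumes fc_abs is a linear-scale ratio greater than zero.
    """
    if regulation not in ("Up", "Down"):
        raise ValueError("regulation must be 'Up' or 'Down'")
    sign = 1.0 if regulation == "Up" else -1.0
    return sign * math.log2(fc_abs)


def geometric_mean(values):
    """Geometric mean, the natural average for ratios such as fold changes."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(rows):
    """Return per-gene signed log2 values, geometric mean FC, and weakest FC."""
    summary = {}
    for gene, regulation, fcs in rows:
        summary[gene] = {
            "signed_log2": dict(
                zip(DATASETS, (signed_log2(fc, regulation) for fc in fcs))
            ),
            "geo_mean_fc": geometric_mean(fcs),
            "min_fc": min(fcs),
        }
    return summary


if __name__ == "__main__":
    for gene, stats in summarize(ROWS).items():
        print(gene, round(stats["geo_mean_fc"], 2), stats["min_fc"])
```

The minimum fold change is included because a gene can appear in a "common responders" list while barely changing in one data set; IGSF1, for example, has a value of 1.02 in GSE24592.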