Vahid Jalili, Enis Afgan, Qiang Gu, Dave Clements, Daniel Blankenberg, Jeremy Goecks, James Taylor, Anton Nekrutenko.
Abstract
Galaxy (https://galaxyproject.org) is a web-based computational workbench used by tens of thousands of scientists across the world to analyze large biomedical datasets. Since 2005, the Galaxy project has fostered a global community focused on achieving accessible, reproducible, and collaborative research. Together, this community develops the Galaxy software framework, integrates analysis tools and visualizations into the framework, runs public servers that make Galaxy available via a web browser, performs and publishes analyses using Galaxy, leads bioinformatics workshops that introduce and use Galaxy, and develops interactive training materials for Galaxy. Over the last two years, all aspects of the Galaxy project have grown: code contributions, tools integrated, users, and training materials. Key advances in Galaxy's user interface include enhancements for analyzing large dataset collections as well as interactive tools for exploratory data analysis. Extensions to Galaxy's framework include support for federated identity and access management and increased ability to distribute analysis jobs to remote resources. New community resources include large public servers in Europe and Australia, an increasing number of regional and local Galaxy communities, and substantial growth in the Galaxy Training Network.
Year: 2020 PMID: 32479607 PMCID: PMC7319590 DOI: 10.1093/nar/gkaa434
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Members of the four major regional Galaxy teams: Africa, Australia, Europe and North America
| Region | Members | Affiliation |
|---|---|---|
| Africa | Christopher Barnett, Tharindu Senapathi | Chemistry Department and Scientific Computing Research Unit at the University of Cape Town |
| | Thoba Lose, Ziphozakhe Mashologu, Peter van Heusden | South African National Bioinformatics Institute, University of the Western Cape, South Africa |
| Australia | Catherine Bromhead, Simon Gladman, Nuwan Goonasekera, Christina Hall, Andrew Lonie | Melbourne Bioinformatics, University of Melbourne, Melbourne, Victoria, Australia |
| | Maria Doyle | Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia |
| | Thom Cuddihy, Igor Makunin, Gareth Price, Nick Rhodes, Michael Thang | QFAB Bioinformatics, QCIF, Brisbane, Queensland, Australia |
| Europe | Loraine Brillet-Guéguen, Gildas Le Corguillé | ABiMS, Roscoff, France |
| | Christophe Antoniewski | ARTbio, CNRS and Sorbonne Université, Paris, France |
| | Léa Bellenger | ARTbio, INSERM and Sorbonne Université, Paris, France |
| | Naïra Naouar | ARTbio, Sorbonne Université, Paris, France |
| | Nadia Goué | AuBi, Mesocentre, Clermont Auvergne University, France |
| | Saskia Hiltemann, Youri Hoogstrate, Bas Horsman, Rick Jansen, Yunlei Li, Andrew Stubbs, David van Zessen | Bioinformatics, Erasmus MC Cancer Institute, Rotterdam, Netherlands |
| | Frederik Coppens, Bert Droesbeke, Ignacio Eguinoa, Michiel Van Bel | Center for Plant Systems Biology, Vlaams Instituut voor Biotechnologie, Ghent, Belgium |
| | Jean-François Dufayard, Maryline Summo | CIRAD, Montpellier, France |
| | Anshu Bhardwaj | CSIR-Institute of Microbial Technology, France |
| | Tomas Klingström | Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden |
| | Federico Zambelli | Department of Biosciences, University of Milan, Milano, Italy |
| | Rolf Backofen, Bérénice Batut, Simon Bray, Gianmauro Cuccuru, Anika Erxleben, Stephan Flemming, Björn Grüning, Alireza Khanteymoori, Anup Kumar, Jan Leendertse, Wolfgang Maier, Helena Rasche, Mehmet Tekman, Joachim Wolff, Oleg Zharkov | Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany |
| | Anne Fouilloux | Department of Geosciences, University of Oslo, Norway |
| | Florence Combes, Yves Vandenbrouck | Department of Health, CEA, Grenoble, France |
| | Nicola Soranzo | Earlham Institute, Norwich Research Park, Norwich, UK |
| | Lucille Lopez-Delisle | EPFL SV ISREC UPDUB, 1015 Lausanne, Switzerland |
| | Pablo Moreno | European Bioinformatics Institute (EMBL-EBI) |
| | Hans-Rudolf Hotz | Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland |
| | Sarah Maman | GenPhySE, Université de Toulouse, INRA, INPT, ENVT, Castanet Tolosan, France |
| | Matthias Bernt | Helmholtz Centre for Environmental Research, UFZ, Young Investigators Group Bioinformatics and Transcriptomics, Leipzig, Germany |
| | Anthony Bretaudeau | IGEPP, INRAE, Institut Agro, Univ Rennes, Rennes, France |
| | Timothy Dudgeon | Informatics Matters Ltd. |
| | Olivier Inizan, Valentin Loux | INRAE, Jouy-en-Josas, France |
| | Kenzo-Hugo Hillion, Valentin Marcon, Fabien Mareuil, Hervé Ménager, Rémi Planel | Institut Pasteur, Paris, France |
| | Marco Antonio Tangaro | Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy |
| | Alexis Dereeper | Institute of Research for Development, Marseille, France |
| | Melanie Föll | Institute of Surgical Pathology, Medical Center, Albert-Ludwigs-University Freiburg, Freiburg, Germany |
| | Peter Cock | James Hutton Institute, UK |
| | Peter Selten | KWS SAAT SE & Co. KGaA |
| | Ruben Vorderman | Leiden University Medical Center, Netherlands |
| | Alan Amossé, Yvan Le Bras, Coline Royaux | Museum National d’Histoire Naturelle, Paris, France |
| | Franck Giacomoni | PFEM, INRAE, Saint Genès Champanelle, France |
| | Thanh Le-Viet, Andrew Page | Quadram Institute Bioscience, Norwich Research Park, Norwich, UK |
| | Thomas Lawson | School of Biosciences, University of Birmingham, UK |
| | Olivier Sallou | Univ Rennes, Inria, CNRS, IRISA, Rennes, France |
| | Ralf Weber | University of Birmingham, UK |
| | Krzysztof Poterlowicz | University of Bradford, UK |
| | Ivan Kuzmin | University of Tartu, Estonia |
| North America | Dan Fornika | BC Centre for Disease Control, Canada |
| | Carrie Ganote | Bioinformatics Analyst at Indiana University, USA |
| | Dave Bouvier, Martin Čech, John Chilton, Nate Coraor, Assunta DeSanto, Jennifer Hillman-Jackson, Kaivan Kamali, Nick Keener, Delphine Lariviere, Anton Nekrutenko, Nick Stoler, Marius van den Beek | Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, USA |
| | Enis Afgan, Dannon Baker, Dave Clements, Sergey Golitsynskiy, Juleen Graham, Aysam Guerler, Mohammad Heydarian, Alexandru Mahmoud, Alex Ostrovsky, Nathan Roach, James Taylor, Jenn Vessio | Department of Biology, Johns Hopkins University, Baltimore, MD, USA |
| | Jeremy Goecks, Qiang Gu, Mason Houtz, Vahid Jalili, Luke Sargent | Department of Biomedical Engineering, School of Medicine, Oregon Health and Science University, Portland, OR, USA |
| | Michael Schatz | Dept. of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA |
| | Daniel Blankenberg, Jayadev Joshi, Vijay Nagampalli | Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA |
| | Greg Von Kuster | Huck Institutes of the Life Sciences, Penn State University, University Park, PA, USA |
| | Robert Leach, Lance Parsons | Lewis-Sigler Institute of Integrative Genomics, Princeton University, USA |
| | Brad Langhorst | New England Biolabs, USA |
| | Philip Mabon, Aaron Petkau, Jeffrey Thiessen | Public Health Agency of Canada, Canada |
| | Arthur Eschenlauer, Tim Griffin, Pratik Jagtap, James Johnson, Praveen Kumar, Subina Mehta | University of Minnesota, Minneapolis, MN, USA |
Figure 1. Examples of dataset collections. Panel A illustrates a simple list of n datasets encapsulated as a single collection item. Panel B illustrates n samples with a nested relation (datasets of forward and reverse reads) represented as a collection item. Panel C is a screenshot of collection and dataset items in Galaxy's history. Here, BWA-MEM (19) is generating 254 BAM datasets, which can be represented as a collection or as 254 individual dataset items (four shown here).
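The two collection shapes in panels A and B can be pictured as a small recursive data structure. The following is a minimal illustrative sketch, not Galaxy's implementation: the class names, the `flatten` helper, the type labels (`"list"`, `"paired"`, `"list:paired"`) and the file names are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Dataset:
    """A single dataset item (hypothetical model, identified by name only)."""
    name: str


@dataclass
class Collection:
    """A collection item holding datasets or nested collections."""
    collection_type: str  # assumed labels: "list", "paired", "list:paired"
    elements: Dict[str, Union[Dataset, "Collection"]] = field(default_factory=dict)

    def flatten(self) -> List[Dataset]:
        """Return every dataset in the collection, descending into nested levels."""
        found: List[Dataset] = []
        for element in self.elements.values():
            if isinstance(element, Collection):
                found.extend(element.flatten())
            else:
                found.append(element)
        return found


def simple_list(n: int) -> Collection:
    """Panel A: n datasets wrapped as one list collection."""
    return Collection(
        "list",
        {f"sample_{i}": Dataset(f"sample_{i}.bam") for i in range(1, n + 1)},
    )


def paired_list(n: int) -> Collection:
    """Panel B: n samples, each a forward/reverse pair, wrapped as one collection."""
    elements: Dict[str, Union[Dataset, Collection]] = {}
    for i in range(1, n + 1):
        elements[f"sample_{i}"] = Collection(
            "paired",
            {
                "forward": Dataset(f"sample_{i}_R1.fastq"),
                "reverse": Dataset(f"sample_{i}_R2.fastq"),
            },
        )
    return Collection("list:paired", elements)
```

Under this model, the 254 BAM datasets of panel C correspond to `simple_list(254)`: one history item that still expands to 254 datasets when flattened.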
Figure 2. A way to express hierarchical dataset relationships in Galaxy and use them in tools. Panel B shows four datasets representing two conditions (i.e. smoker and non-smoker), each with two replicates, organized under a [Non-]Smoker collection using Galaxy's Collection Builder (the rules used for building this collection are given in Supplementary Material). These datasets are assigned group tags (e.g. Smoker and Replicate 1 as shown in the figure), which can then be used to simplify their selection as inputs for tools. Panel A illustrates specifying these datasets as inputs for DESeq2 (2).
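Selecting tool inputs by group tag amounts to partitioning datasets on a tag value. The sketch below illustrates that idea only; the `group:<key>:<value>` tag spelling, the dataset names, and the `group_by_tag` helper are assumptions for this example and are not taken from the caption.

```python
from collections import defaultdict
from typing import Dict, List

# Four hypothetical datasets: two conditions, two replicates each.
DATASETS = [
    {"name": "smoker_rep1", "tags": ["group:condition:smoker", "group:replicate:1"]},
    {"name": "smoker_rep2", "tags": ["group:condition:smoker", "group:replicate:2"]},
    {"name": "nonsmoker_rep1", "tags": ["group:condition:non-smoker", "group:replicate:1"]},
    {"name": "nonsmoker_rep2", "tags": ["group:condition:non-smoker", "group:replicate:2"]},
]


def group_by_tag(datasets: List[dict], key: str) -> Dict[str, List[str]]:
    """Partition dataset names by the value of the group tag with the given key."""
    prefix = f"group:{key}:"
    groups: Dict[str, List[str]] = defaultdict(list)
    for dataset in datasets:
        for tag in dataset["tags"]:
            if tag.startswith(prefix):
                groups[tag[len(prefix):]].append(dataset["name"])
    return dict(groups)
```

Calling `group_by_tag(DATASETS, "condition")` yields one list of dataset names per condition, which is the kind of grouping a differential-expression tool needs as its factor levels.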
Figure 3. Example of an Interactive Galaxy Tool (IGT) for exploratory data analysis. The tool uses the same configuration file format as any other Galaxy tool (see Supplementary Figure S1). After processing single-cell RNA-seq data in Galaxy, users run the cell × gene tool through a standard Galaxy tool interface (panel A). Users can then use the tool to interactively explore their data and perform server-side analysis on the fly. For instance, panel B shows RStudio launched within the IGT framework from the usegalaxy.eu server.
Figure 4. Streamlined data flow for isolated user jobs. When a user submits a job, Galaxy schedules it as a self-contained, isolated unit, passing only metadata about the job inputs (which can be a collection of datasets). Each job communicates with the relevant data store to retrieve its input data, executes, and persists the resulting data. This method enables more secure and efficient job execution.
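The flow in Figure 4 can be sketched as a job that receives only references to its inputs and talks to the data store itself. This is a toy model under stated assumptions: `DataStore`, `JobMetadata`, `run_isolated_job`, and the `concatenate` tool are hypothetical names, and an in-memory dictionary stands in for a real object store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


class DataStore:
    """In-memory stand-in for a remote data store (assumption for this sketch)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data


@dataclass(frozen=True)
class JobMetadata:
    """What the scheduler hands to the job: references only, never the data."""
    job_id: str
    input_keys: List[str]
    output_key: str


def run_isolated_job(
    meta: JobMetadata,
    store: DataStore,
    tool: Callable[[List[bytes]], bytes],
) -> None:
    """Retrieve inputs from the store, run the tool, and persist the result."""
    inputs = [store.get(key) for key in meta.input_keys]  # 1. retrieve input data
    result = tool(inputs)                                 # 2. execute the job
    store.put(meta.output_key, result)                    # 3. persist the output


def concatenate(chunks: List[bytes]) -> bytes:
    """Trivial example tool: join all inputs into one output."""
    return b"".join(chunks)
```

In this arrangement the scheduler never touches dataset contents; it only constructs a `JobMetadata` record, which is what keeps each job self-contained.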