ggCyto: next generation open-source visualization software for cytometry.

Phu Van, Wenxin Jiang, Raphael Gottardo, Greg Finak.

Abstract

Motivation: Open source software for computational cytometry has gained in popularity over the past few years. Efforts such as FlowCAP, the Lyoplate and Euroflow projects have highlighted the importance of efforts to standardize both experimental and computational aspects of cytometry data analysis. The R/BioConductor platform hosts the largest collection of open source cytometry software covering all aspects of data analysis and providing infrastructure to represent and analyze cytometry data with all relevant experimental, gating and cell population annotations enabling fully reproducible data analysis. Data visualization frameworks to support this infrastructure have lagged behind.
Results: ggCyto is a new open-source BioConductor software package for cytometry data visualization built on ggplot2 that enables ggplot-like functionality with the core BioConductor flow cytometry data structures. Among its features are the ability to transform data and axes on-the-fly using cytometry-specific transformations, plot faceting by experimental metadata variables and partial matching of channel, marker and cell population names to the contents of the BioConductor cytometry data structures. We demonstrate the salient features of the package using publicly available cytometry data with complete reproducible examples in the Supplementary Material.
Availability and implementation: https://bioconductor.org/packages/devel/bioc/html/ggcyto.html.
Supplementary information: Supplementary data are available at Bioinformatics online.

Year:  2018        PMID: 29868771      PMCID: PMC6223365          DOI: 10.1093/bioinformatics/bty441

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Flow cytometry (FCM) is the primary assay for immune monitoring in clinical and research applications (Maecker et al., 2012). Pipelines must handle preprocessing, quality control, analysis (i.e. cell clustering or manual partitioning into homogeneous groups) (O'Neill et al., 2013; Saeys et al., 2016) and visualization. Proprietary platforms, including FlowJo (Ashland, OR), WinList, FCSExpress and DIVA, are the de facto standards for end-to-end FCM data analysis. Other programming frameworks like Matlab (Matlab 7.0.4, Natick, MA: MathWorks) and Mathematica (Mathematica 9.0, Champaign, IL: Wolfram Research) provide functionality for data import and exploration [indeed, SPADE (Qiu et al., 2011) was initially developed for MATLAB], but lack the general abstraction of cytometry-specific data structures helpful for data analysis. Open-source projects like R/BioConductor (R/BioC) (Gentleman et al., 2004; Ihaka and Gentleman, 1996) and Python provide FCM functionality through user-contributed packages (Frelinger et al.). Currently 47 open-source software packages in BioConductor are tagged for ‘FlowCytometry’ (http://bioconductor.org/packages/release/BiocViews.html), but only flowViz (Sarkar et al., 2008) is visualization-centric, and it doesn't support the core BioConductor data structures used to store analyzed, gated and annotated, single-cell FCM data (see Supplementary Material). Other packages focus on different aspects of automated analysis. We introduce ggcyto, a BioConductor package for building reproducible FCM visualizations programmatically. It is built on ggplot2 (Wickham, 2009) and supports the core BioConductor cytometry data structures, making it compatible with any package using those structures (see Supplementary Material).

2 ggCyto

2.1 Basic principles

To construct a plot with ggcyto, users specify a data source (Fig. 1) and, analogous to ggplot2, map plot elements to variables in the data source. With ggcyto, however, users map plot axes to flow parameters (e.g. channels or markers), specify the cell population to plot, specify cytometry-specific axis transformations and optionally specify gates (e.g. elements defining cell populations) to add to the plot. These elements are built up via layers and are referred to by name, mapping directly to quantities (i.e. data) in the data source. For ease of use, ggcyto supports partial string matches (Fig. 1 and Supplementary Material), which is particularly useful for identifying complex channel names or cell populations.
Fig. 1.

ggcyto is compatible with ungated and gated data sources represented by the core BioConductor FCM data structures (flowSet/flowFrame and GatingSet/GatingHierarchy). Plots can be constructed using the (1) autoplot or (2) ggcyto APIs, the latter giving users more control. Custom layers control cytometry-specific plot elements including (3) data transformation
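
The partial string matching described in Section 2.1 can be illustrated with a small, language-neutral sketch. The Python below is not ggcyto's implementation; the function name `match_name` and the channel labels are hypothetical, and the rule it applies (an exact match wins, otherwise a unique case-insensitive substring match) is an assumption chosen only to show why a short query such as "CD4" can stand in for a long channel name.

```python
def match_name(query, candidates):
    """Resolve a short query to exactly one full name.

    Assumed rule (illustrative only): prefer an exact, case-insensitive
    match; otherwise accept a substring match if it is unique.
    """
    q = query.lower()
    exact = [c for c in candidates if c.lower() == q]
    if exact:
        return exact[0]
    hits = [c for c in candidates if q in c.lower()]
    if len(hits) == 1:
        return hits[0]
    if not hits:
        raise KeyError(f"no name matches {query!r}")
    raise KeyError(f"{query!r} is ambiguous, matches {hits}")


# Hypothetical channel/marker labels of the kind found in FCM files.
channels = [
    "FSC-A",
    "SSC-A",
    "<B710-A> CD4 PcpCy55",
    "<R780-A> CD8 APCH7",
    "<V450-A> CD3 V450",
]

print(match_name("CD4", channels))
```

A query that matches several names (for example "CD" here) raises an error rather than picking one silently, which is one reasonable way to keep partial matching safe.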

2.2 Availability

ggcyto is open-source and available on GitHub and BioConductor (https://github.com/RGLab/ggcyto/releases/tag/v1.9.5 and https://bioconductor.org/packages/devel/bioc/html/ggcyto.html).

2.3 Quick plotting with the autoplot API

The autoplot API is a quick way to build plots. It makes most of the plot decisions for the user based on domain knowledge and information encoded in the data source (Fig. 1 and Supplementary Material). For example, passing a GatingHierarchy and a vector of cell population names (defined by gated cell populations in the GatingHierarchy) creates a faceted array (one panel for each sample) of two-dimensional density plots (using hexagonal binning) of the parent cell population projected onto the dimensions of any gates defining those cell subsets (Fig. 1, Supplementary Material). The ‘CD3’ and ‘CD19’ populations shown in Figure 1 are named cell populations defined by gates in the GatingHierarchy. They should not be confused with markers of the same name. Two-dimensional densities are chosen by autoplot because the gates defining the CD3 and CD19 cell populations are two-dimensional. In cases where the gates defining a cell population are one-dimensional, a one-dimensional density would be plotted. In this sense, autoplot is context-aware, selecting geoms appropriate for visualizing the desired cell population. Analogously, autoplot can be used to create plots from flowSet and flowFrame objects (for ungated data) or GatingHierarchy and GatingSet objects (for gated data, Fig. 1 and Supplementary Material). In the case of ungated data, the user specifies the channels/markers to visualize, rather than the cell population (since the latter is not defined).
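
The context-aware behaviour described above (a one-dimensional gate yields a one-dimensional density, a two-dimensional gate yields a hexbin density) can be sketched as a simple dispatch on gate dimensionality. This Python is an illustration of the idea only, not ggcyto code; the `Gate` class, the `choose_plot` function, the plot-kind labels and the example channel names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Gate:
    """Hypothetical stand-in for a gate: a population name and its channels."""
    population: str
    channels: Tuple[str, ...]


def choose_plot(gate):
    """Pick a plot description from the number of channels the gate spans."""
    n = len(gate.channels)
    if n == 1:
        # One gated dimension: a 1D density along that channel.
        return {"kind": "density_1d", "x": gate.channels[0]}
    if n == 2:
        # Two gated dimensions: a 2D binned density on both channels.
        return {"kind": "hexbin_2d", "x": gate.channels[0], "y": gate.channels[1]}
    raise ValueError(f"unsupported gate dimensionality: {n}")


# Example gates with made-up channel names.
cd3 = Gate("CD3", ("CD3", "SSC-A"))
live = Gate("Live", ("Viability",))

print(choose_plot(cd3))
print(choose_plot(live))
```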

2.4 Customizing plots with cytometry-specific layers

The ggcyto() API provides greater flexibility and customization than autoplot (Fig. 1). When using ggcyto, the layers and defaults that autoplot would select are instead left to the user. Leveraging ggcyto's cytometry-specific layers and geoms, the user builds the plot (Fig. 1 and Supplementary Material) to include the gates, overlays (e.g. backgating), data or axis transformations, cell subpopulations and cell subpopulation statistics of interest, and specifies the faceting of plots by metadata annotations (see Supplementary Material). The ggcyto API can be particularly useful to project cell populations onto other markers (i.e. not necessarily those on which the populations are defined). The support for data transformations in ggcyto is twofold: ggcyto can transform the underlying data (Fig. 1), or it can transform the axes using the transformation stored in the data source (Fig. 1). These approaches are demonstrated in the Supplementary Material.
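
The difference between the two kinds of transformation can be made concrete with a short sketch. It is not taken from ggcyto: it assumes an inverse hyperbolic sine (arcsinh) transform with a cofactor of 150 as a stand-in for cytometry-specific transforms, and the function names are hypothetical. The first helper changes the values that get plotted; the second leaves already-transformed data alone and only computes where raw-scale tick labels should sit on the transformed axis.

```python
import math

COFACTOR = 150.0  # assumed value, for illustration only


def forward(x, cofactor=COFACTOR):
    """Assumed display transform: arcsinh of the scaled raw intensity."""
    return math.asinh(x / cofactor)


def inverse(y, cofactor=COFACTOR):
    """Inverse of `forward`, mapping display units back to raw intensity."""
    return math.sinh(y) * cofactor


def transform_data(raw_values):
    """Data transformation: the plotted values themselves are changed."""
    return [forward(v) for v in raw_values]


def raw_scale_ticks(raw_breaks):
    """Axis transformation: data stay on the transformed scale; each tick is
    placed at the transformed position but labelled with its raw value."""
    return [(forward(b), str(b)) for b in raw_breaks]


print(transform_data([0.0, 150.0, 10000.0]))
print(raw_scale_ticks([0, 100, 1000, 10000]))
```

In the second case the stored transformation is needed in both directions, which is why keeping it alongside the data matters.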

3 Examples

The functionality of ggcyto is demonstrated using the Lyoplate dataset from FlowCAP 4 (Finak et al., 2016), available in the flowWorkspaceData R/BioConductor package and on the ImmuneSpace portal (Brusic et al., 2014) (see the Supplementary Material for a link to this data on ImmuneSpace), as well as the graft versus host disease (GvHD) data available in the flowCore R/BioConductor package. Reproducible examples with R code are in the Supplementary Material and available at http://rglab.org/ggcyto/. In the future, additional cytometry data may be available via the more modern AnnotationHub or ExperimentHub resources (Morgan et al.; Pasolli et al., 2017).

4 Conclusion

The ggcyto package provides a powerful and unified visualization interface to complex, ungated or gated, annotated cytometry data structures and provides a key component of a reproducible research workflow. Specifically, the package allows for easy visualization of specific cytometry cell populations and gates, on-the-fly data and axis transformation, back-gating visualization and easy faceting by study metadata in order to explore variability in an experiment. User-friendliness is made possible through fuzzy name matching, lazy data loading and context-sensitive behavior that aims to capture ‘what the user means to do’ most frequently. Areas for future development are highlighted in the Supplementary Material.
References

1.  Using flowViz to visualize flow cytometry data.

Authors:  D Sarkar; N Le Meur; R Gentleman
Journal:  Bioinformatics       Date:  2008-02-01       Impact factor: 6.937

2.  Accessible, curated metagenomic data through ExperimentHub.

Authors:  Edoardo Pasolli; Lucas Schiffer; Paolo Manghi; Audrey Renson; Valerie Obenchain; Duy Tin Truong; Francesco Beghini; Faizan Malik; Marcel Ramos; Jennifer B Dowd; Curtis Huttenhower; Martin Morgan; Nicola Segata; Levi Waldron
Journal:  Nat Methods       Date:  2017-10-31       Impact factor: 28.547

3.  Computational flow cytometry: helping to make sense of high-dimensional immunology data.

Authors:  Yvan Saeys; Sofie Van Gassen; Bart N Lambrecht
Journal:  Nat Rev Immunol       Date:  2016-06-20       Impact factor: 53.106

4.  Computational resources for high-dimensional immune analysis from the Human Immunology Project Consortium.

Authors:  Vladimir Brusic; Raphael Gottardo; Steven H Kleinstein; Mark M Davis
Journal:  Nat Biotechnol       Date:  2014-01-19       Impact factor: 54.908

5.  Standardizing immunophenotyping for the Human Immunology Project.

Authors:  Holden T Maecker; J Philip McCoy; Robert Nussenblatt
Journal:  Nat Rev Immunol       Date:  2012-02-17       Impact factor: 53.106

6.  Bioconductor: open software development for computational biology and bioinformatics.

Authors:  Robert C Gentleman; Vincent J Carey; Douglas M Bates; Ben Bolstad; Marcel Dettling; Sandrine Dudoit; Byron Ellis; Laurent Gautier; Yongchao Ge; Jeff Gentry; Kurt Hornik; Torsten Hothorn; Wolfgang Huber; Stefano Iacus; Rafael Irizarry; Friedrich Leisch; Cheng Li; Martin Maechler; Anthony J Rossini; Gunther Sawitzki; Colin Smith; Gordon Smyth; Luke Tierney; Jean Y H Yang; Jianhua Zhang
Journal:  Genome Biol       Date:  2004-09-15       Impact factor: 13.583

7.  Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE.

Authors:  Peng Qiu; Erin F Simonds; Sean C Bendall; Kenneth D Gibbs; Robert V Bruggner; Michael D Linderman; Karen Sachs; Garry P Nolan; Sylvia K Plevritis
Journal:  Nat Biotechnol       Date:  2011-10-02       Impact factor: 54.908

8.  Standardizing Flow Cytometry Immunophenotyping Analysis from the Human ImmunoPhenotyping Consortium.

Authors:  Greg Finak; Marc Langweiler; Maria Jaimes; Mehrnoush Malek; Jafar Taghiyar; Yael Korin; Khadir Raddassi; Lesley Devine; Gerlinde Obermoser; Marcin L Pekalski; Nikolas Pontikos; Alain Diaz; Susanne Heck; Federica Villanova; Nadia Terrazzini; Florian Kern; Yu Qian; Rick Stanton; Kui Wang; Aaron Brandes; John Ramey; Nima Aghaeepour; Tim Mosmann; Richard H Scheuermann; Elaine Reed; Karolina Palucka; Virginia Pascual; Bonnie B Blomberg; Frank Nestle; Robert B Nussenblatt; Ryan Remy Brinkman; Raphael Gottardo; Holden Maecker; J Philip McCoy
Journal:  Sci Rep       Date:  2016-02-10       Impact factor: 4.379

9.  Flow cytometry bioinformatics.

Authors:  Kieran O'Neill; Nima Aghaeepour; Josef Spidlen; Ryan Brinkman
Journal:  PLoS Comput Biol       Date:  2013-12-05       Impact factor: 4.475
