Pathview: an R/Bioconductor package for pathway-based data integration and visualization.

Weijun Luo, Cory Brouwer

Abstract

SUMMARY: Pathview is a novel tool set for pathway-based data integration and visualization. It maps and renders user data on relevant pathway graphs. Users only need to supply their data and specify the target pathway. Pathview automatically downloads the pathway graph data, parses the data file, maps and integrates user data onto the pathway and renders pathway graphs with the mapped data. Although built as a stand-alone program, Pathview may seamlessly integrate with pathway and functional analysis tools for large-scale and fully automated analysis pipelines.

AVAILABILITY: The package is freely available under the GPLv3 license through Bioconductor and R-Forge. It is available at http://bioconductor.org/packages/release/bioc/html/pathview.html and at http://Pathview.r-forge.r-project.org/.

CONTACT: luo_weijun@yahoo.com

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2013; PMID: 23740750; PMCID: PMC3702256; DOI: 10.1093/bioinformatics/btt285

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


1 INTRODUCTION

The pathway-based approach has been widely used in high-throughput data analysis (Emmert-Streib and Glazko, 2011; Kelder et al., 2010; Khatri et al., 2012). It has been successfully and routinely applied to gene expression (both microarray and RNA-Seq) (Luo et al., 2009), genetic and GWAS (Wang et al., 2010), proteomic and metabolomic data (Perroud et al., 2006; Xia and Wishart, 2010). Compared with the individual gene/molecule-based approach, pathway analysis is more sensitive, consistent and informative (Luo et al., 2009). R/Bioconductor has become a primary software environment for high-throughput data analysis and visualization (Gentleman et al., 2004). Numerous pathway analysis methods and data types are implemented in R/Bioconductor, yet there has not been a dedicated and established tool for pathway-based data integration and visualization.

In this article, we introduce a novel package called Pathview. A detailed comparison between Pathview and existing pathway tools in R/Bioconductor and other languages is given in Supplementary Table S2. Pathview provides three features that are rarely implemented well in other tools: (i) Fully accessible and functional pathway visualization. It adheres to human-readable pathway definitions and layouts like those of KEGG (Ogata et al., 1999). No previous KEGG-based tools provide full graphics, including node/edge attribute modifications, node/edge labels, legends and color keys. (ii) Strong data integration capacity. It integrates and works with data of different types (different omic levels, literature and so forth), IDs, formats, attributes, species and so forth. As far as we know, no other tool provides such extensive data mapping and integration support. (iii) Easy automation and integration with pathway analysis tools. Only a few tools can be directly automated and fully integrated into pathway analysis pipelines (Supplementary Table S2: automated analysis column). None of these features is brand new, but surprisingly few of the existing tools provide satisfactory functionality in these aspects.

2 MAIN FEATURES

2.1 Overall design

The Pathview package can be divided into four functional modules: Downloader, Parser, Mapper and Viewer, as depicted in Supplementary Figure S1. Most importantly, Pathview maps and renders user data on relevant pathway graphs.
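The four-module design can be illustrated with a minimal, language-agnostic sketch. It is written in Python rather than R and is not Pathview's code: the function names (download, parse, map_data, view, run) are hypothetical, the Downloader stand-in returns a small hand-written KGML-like fragment instead of fetching a file, and the Viewer stand-in produces text lines instead of an image.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified KGML-like fragment (assumption: stands in for a
# downloaded pathway file; real KGML files carry many more attributes).
TOY_KGML = """
<pathway name="path:hsa04110" title="Cell cycle">
  <entry id="1" name="hsa:1029" type="gene"><graphics name="CDKN2A"/></entry>
  <entry id="2" name="hsa:595" type="gene"><graphics name="CCND1"/></entry>
  <entry id="3" name="cpd:C00022" type="compound"><graphics name="C00022"/></entry>
</pathway>
"""

def download(pathway_id):
    """Downloader stand-in: returns the toy document instead of fetching it."""
    return TOY_KGML

def parse(xml_text):
    """Parser: turn the XML into a list of node dictionaries."""
    root = ET.fromstring(xml_text)
    nodes = []
    for entry in root.findall("entry"):
        nodes.append({
            "kegg_id": entry.get("name").split(":", 1)[1],  # drop "hsa:" / "cpd:"
            "type": entry.get("type"),
            "label": entry.find("graphics").get("name"),
        })
    return nodes

def map_data(nodes, gene_data):
    """Mapper: attach a user value to each gene node (None when unmapped)."""
    mapped = []
    for node in nodes:
        value = gene_data.get(node["kegg_id"]) if node["type"] == "gene" else None
        mapped.append(dict(node, value=value))
    return mapped

def view(nodes):
    """Viewer stand-in: one text line per node instead of a pathway graph."""
    return [
        f'{n["label"]}: {"no data" if n["value"] is None else n["value"]}'
        for n in nodes
    ]

def run(pathway_id, gene_data):
    """Chain the four stages: download, parse, map, view."""
    return view(map_data(parse(download(pathway_id)), gene_data))

lines = run("hsa04110", {"1029": -1.2, "595": 0.8})
```

Keeping each stage behind its own function mirrors the modular split described above: the user supplies only the data and the pathway ID, and each stage can be replaced independently.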

2.2 Data visualization

Pathview generates both a native KEGG view (Fig. 1b) and a Graphviz (Ellson et al.) view (Fig. 1a) for pathways. Both graph styles adhere to human-readable pathway definitions and layouts, yet still allow proper modification and customization of node and edge attributes. The KEGG view retains all pathway meta-data, i.e. spatial and temporal information, tissue/cell types, inputs, outputs and connections. This is important for readability and interpretation of pathway biology. The Graphviz view provides better control of node and edge attributes, a better view of pathway topology and a better understanding of the pathway analysis statistics. The different workflows for these two types of view are merged in Pathview (Supplementary Fig. S1). This keeps the user interface simple and consistent.
Fig. 1. Example Pathview graphs: (a) Graphviz view on a canonical signaling pathway (hsa04110 Cell cycle) with gene data only, (b) KEGG view on a metabolic pathway (hsa00640 Propanoate metabolism) with both discrete gene data and continuous metabolite data integrated. The same graphs at a higher resolution or with a different color scheme are shown in Supplementary Figures S3 and S4.
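Rendering mapped data as node colors can be sketched as follows. This is an illustrative Python sketch, not Pathview's implementation: the function names (lerp, value_to_color), the symmetric limit and the green/gray/red color scheme are assumptions chosen for the example.

```python
def lerp(a, b, t):
    """Linear interpolation between two color channel values, rounded to int."""
    return round(a + (b - a) * t)

def value_to_color(value, limit=1.0,
                   low=(0, 255, 0), mid=(190, 190, 190), high=(255, 0, 0)):
    """Map a numeric value to a hex color on a diverging low/mid/high scale.

    Values are clamped to [-limit, limit], so outliers saturate at the end
    colors instead of stretching the scale (all defaults are assumptions).
    """
    v = max(-limit, min(limit, value))
    t = abs(v) / limit                      # 0 at the midpoint, 1 at either end
    end = high if v >= 0 else low
    r, g, b = (lerp(m, e, t) for m, e in zip(mid, end))
    return f"#{r:02x}{g:02x}{b:02x}"

# Continuous data use the whole scale; discrete data (for example -1, 0, 1)
# simply land on the three anchor colors.
colors = [value_to_color(v) for v in (-5.0, 0.0, 2.0)]
```

A color key drawn from the same function would stay consistent with the node colors, which is one reason to keep the mapping in a single place.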

2.3 Data integration

Pathview provides strong support for data integration (Supplementary Table S1). It can be used to integrate, analyze and visualize a wide variety of biological data: gene expression, protein expression, metabolite level, genetic association, genomic variation, literature records and other data types mappable to pathways. Notably, it can be directly used for metagenomic data when the data are mapped to KEGG ortholog pathways. The integrated Mapper module maps 12 types of gene or protein IDs and 21 types of compound- or metabolite-related IDs to standard KEGG gene or compound IDs, and also maps between these external IDs. For other types of IDs not included in the common ID lists (for instance, Affymetrix microarray probe set IDs), Pathview's auxiliary functions will map user data to pathways when users provide the ID mapping data manually. Pathview applies to pathways for over 2000 species, and species can be specified in multiple formats: KEGG code, scientific name or common name. In addition, Pathview works with different data attributes and formats: continuous or discrete data (Fig. 1b and Supplementary Table S1), in matrix or vector format, with single or multiple samples/experiments and so forth.
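The idea of mapping user data through a user-supplied ID table can be sketched as follows. This Python sketch is illustrative only and does not reproduce Pathview's Mapper: the function name map_ids, the probe identifiers and the choice of summation to combine several source IDs that hit the same target are all assumptions.

```python
from collections import defaultdict

def map_ids(data, id_map, combine=sum):
    """Re-key `data` from external IDs to target IDs using `id_map`.

    Several external IDs may map to one target ID; their values are merged
    with `combine` (sum by default, an assumption made for this sketch).
    Returns the re-keyed data and the list of IDs that could not be mapped.
    """
    grouped = defaultdict(list)
    unmapped = []
    for ext_id, value in data.items():
        target = id_map.get(ext_id)
        if target is None:
            unmapped.append(ext_id)
        else:
            grouped[target].append(value)
    mapped = {target: combine(values) for target, values in grouped.items()}
    return mapped, unmapped

# Hypothetical probe-set IDs mapped to Entrez Gene IDs (1029 = CDKN2A,
# 595 = CCND1); the probe names and values are made up for illustration.
probe_to_gene = {"probe_a": "1029", "probe_b": "1029", "probe_c": "595"}
probe_data = {"probe_a": 1.0, "probe_b": 0.5, "probe_c": -2.0, "probe_d": 0.3}

gene_data, unmapped = map_ids(probe_data, probe_to_gene)
```

Returning the unmapped IDs separately, rather than dropping them silently, lets a pipeline report how much of the input actually reached the pathway.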

2.4 Automated and integrated analysis

Pathview is open source, fully automated and error-resistant. Therefore, it fits seamlessly into integrated pathway or gene set analysis workflows. Pathview can be easily integrated with a wide variety of existing tools in, or communicating with, R/Bioconductor for high-throughput data analysis and pathway analysis. In the package vignette, we show an integrated analysis using Pathview with another Bioconductor package, gage (Luo et al., 2009). In automated pathway analysis, we frequently use heat maps, scatter plots or stacked line plots to view the perturbation patterns. These plots are simple and can be generated quickly in batches. However, they contain little information beyond the numeric changes. With Pathview, we can view molecular perturbations in intuitive and informative pathway contexts. Importantly, such graphs can be generated as efficiently as the classical scatter or line plots. This will greatly improve the analysis and interpretation of high-throughput molecular data.

KEGG XML data files frequently contain minor deficiencies, such as inconsistent, incomplete or even erroneous records, because of manual curation. These deficiencies adversely affect the parsing, mapping and rendering processes and automation. Pathview accommodates these deficiencies, corrects them or skips the problematic pathway with a warning. For example, the Pathview Parser corrects for the improper KEGG definition of enzyme-compound interactions by merging and resolving the conflicting ECrel record and associated reaction records (Supplementary Fig. S2). In normal cases, Pathview uses the KEGGgraph package (Zhang and Wiemann, 2009) to parse KEGG XML data files.
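The "skip the problematic pathway with a warning" behavior is what makes unattended batch runs practical. A minimal Python sketch of that pattern follows; it is not Pathview's code, and render_pathway, render_batch and the `broken` argument that simulates a deficient data file are hypothetical.

```python
import warnings

def render_pathway(pathway_id, broken):
    """Stand-in for rendering one pathway; fails for simulated bad records."""
    if pathway_id in broken:
        raise ValueError(f"malformed record in {pathway_id}")
    return f"{pathway_id}.png"

def render_batch(pathway_ids, broken=frozenset()):
    """Render every pathway, warning about and skipping the ones that fail."""
    outputs, skipped = [], []
    for pid in pathway_ids:
        try:
            outputs.append(render_pathway(pid, broken))
        except ValueError as err:
            # One bad pathway must not abort the whole batch.
            warnings.warn(f"skipping {pid}: {err}")
            skipped.append(pid)
    return outputs, skipped

outputs, skipped = render_batch(
    ["hsa04110", "hsa00640", "hsa04210"], broken={"hsa00640"}
)
```

In a pipeline, the pathway IDs passed to such a loop would typically be the significant pathways selected by an upstream analysis step, and the skipped list would go into the run report.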

3 DISCUSSION AND CONCLUSION

Pathview maps and renders user data onto pathway graphs, which are intuitive, informative and well annotated. It integrates and works with a large variety of data types, IDs, formats and attributes. Pathview can be easily combined with other tools for automated and efficient pathway analysis pipelines. Currently, Pathview works with all types and species of KEGG pathways. In the future, we plan to support pathways from Reactome (Croft et al.), NCI Pathway Interaction and other databases as needs arise.

Funding: UNC general administration (to W.L. and C.B.).

Conflict of Interest: none declared.

REFERENCES

1. Kai Wang; Mingyao Li; Hakon Hakonarson. Analysing biological pathways in genome-wide association studies. Nat Rev Genet. 2010-12.

2. H Ogata; S Goto; K Sato; W Fujibuchi; H Bono; M Kanehisa. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 1999-01-01.

3. Thomas Kelder; Bruce R Conklin; Chris T Evelo; Alexander R Pico. Finding the right questions: exploratory pathway analysis to enhance biological discovery in large datasets. PLoS Biol. 2010-08-31.

4. David Croft; Gavin O'Kelly; Guanming Wu; Robin Haw; Marc Gillespie; Lisa Matthews; Michael Caudy; Phani Garapati; Gopal Gopinath; Bijay Jassal; Steven Jupe; Irina Kalatskaya; Shahana Mahajan; Bruce May; Nelson Ndegwa; Esther Schmidt; Veronica Shamovsky; Christina Yung; Ewan Birney; Henning Hermjakob; Peter D'Eustachio; Lincoln Stein. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 2010-11-09.

5. Robert C Gentleman; Vincent J Carey; Douglas M Bates; Ben Bolstad; Marcel Dettling; Sandrine Dudoit; Byron Ellis; Laurent Gautier; Yongchao Ge; Jeff Gentry; Kurt Hornik; Torsten Hothorn; Wolfgang Huber; Stefano Iacus; Rafael Irizarry; Friedrich Leisch; Cheng Li; Martin Maechler; Anthony J Rossini; Gunther Sawitzki; Colin Smith; Gordon Smyth; Luke Tierney; Jean Y H Yang; Jianhua Zhang. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004-09-15.

6. Frank Emmert-Streib; Galina V Glazko. Pathway analysis of expression data: deciphering functional building blocks of complex diseases. PLoS Comput Biol. 2011-05-26.

7. Purvesh Khatri; Marina Sirota; Atul J Butte. Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput Biol. 2012-02-23.

8. Bertrand Perroud; Jinoo Lee; Nelly Valkova; Amy Dhirapong; Pei-Yin Lin; Oliver Fiehn; Dietmar Kültz; Robert H Weiss. Pathway analysis of kidney cancer using proteomics and metabolic profiling. Mol Cancer. 2006-11-24.

9. Weijun Luo; Michael S Friedman; Kerby Shedden; Kurt D Hankenson; Peter J Woolf. GAGE: generally applicable gene set enrichment for pathway analysis. BMC Bioinformatics. 2009-05-27.

10. Jitao David Zhang; Stefan Wiemann. KEGGgraph: a graph approach to KEGG PATHWAY in R and bioconductor. Bioinformatics. 2009-03-23.
