Pathway Processor 2.0: a web resource for pathway-based analysis of high-throughput data.

Luca Beltrame, Luca Bianco, Paolo Fontana, Duccio Cavalieri.

Abstract

SUMMARY: Pathway Processor 2.0 is a web application designed to analyze high-throughput datasets, including but not limited to microarray and next-generation sequencing data, using a pathway-centric logic. In addition to well-established methods such as Fisher's test and impact analysis, Pathway Processor 2.0 offers innovative methods that convert gene expression into pathway expression, leading to the identification of differentially regulated pathways in a dataset of choice.
AVAILABILITY AND IMPLEMENTATION: Pathway Processor 2.0 is available as a web service at http://compbiotoolbox.fmach.it/pathwayProcessor/. Sample datasets to test the functionality can be used directly from the application.
CONTACT: duccio.cavalieri@fmach.it
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2013        PMID: 23740747      PMCID: PMC3702260          DOI: 10.1093/bioinformatics/btt292

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

The advance of high-throughput methods, particularly the advent of next-generation sequencing (NGS), provides an unprecedented amount of data from a single experiment. Although analysis methods for handling such data have improved considerably, comparing and especially integrating results between different platforms and analysis systems is still a challenge. For these reasons, there is a need for newer methods able to handle this type of data effectively and robustly. In recent years, pathway-based approaches have emerged as a way to compare, integrate and interpret results from different ‘omics’ experiments (Beltrame et al., 2009; Manoli et al., 2006). In particular, pathway-based approaches can be useful for investigating complex phenotypes (Rizzetto et al., 2010, 2012). The main advantage of these methods is a greater ease of interpretation and increased comparability among different experiments, methodologies and platforms.
Methods using pathways have evolved considerably as well, from the classical Fisher’s test (Grosu et al., 2002) to more complex systems such as Gene Set Enrichment Analysis (Subramanian et al., 2005) and impact analysis (Khatri et al., 2012). In addition, newer methods have been proposed that target RNA-seq datasets as well as microarrays: one of these is Gene Set Variation Analysis (GSVA) (Haenzelmann et al., 2013), which shifts the focus from gene expression to pathway expression through the generation of enrichment scores. Although methods like Fisher’s test are commonly available in bioinformatics software, more advanced algorithms are often just part of customized software pipelines, and as such are not readily accessible to biologists. Also, part of this software was developed with microarrays in mind, and although most algorithms are platform agnostic, adapting them to newer technologies often requires considerable effort.
Here, we describe Pathway Processor 2.0, a substantial upgrade over the original Pathway Processor introduced in 2002 (Grosu et al., 2002). Developed as web-based software, Pathway Processor 2.0 applies pathway-based approaches to omics data to extract meaningful and biologically sound information that supports the biological hypothesis being tested. To do so, it offers well-established statistical methods in addition to a new method to calculate differential pathway expression between two user-supplied phenotypes.

2 IMPLEMENTATION

Pathway Processor 2.0 was implemented as a web-based service, with a graphical interface written in PHP and an analysis back-end written in R and Python. The back-end carries out the pre-processing, analysis and generation of final results, whereas the front-end handles the selection of analysis type, the input parameters and the display of input and results. Special care was taken to make the analysis back-end independent of the input platform, so that Pathway Processor 2.0 supports any type of high-throughput data (microarray, RNA-seq and proteomics). Pathway Processor 2.0 can analyze high-throughput data using four different methods, divided by the type of the input data and the possibility of using custom pathways (Table 1; Supplementary Data).
The Fisher’s test is implemented as in the original Pathway Processor, but with some important improvements. In particular, multiple species are supported (currently human, mouse, rat, yeast and fruit fly) as long as the supplied identifiers for Differentially Expressed Genes (DEGs) and the ‘gene universe’ (the whole list of genes on the microarray chip or the complete list of genes of the organism under investigation) are in the correct format (Entrez Gene ID, RefSeq or gene symbols). Furthermore, in addition to the built-in KEGG pathway set, any custom gene set can be analyzed by uploading an archive file containing the pathways (or gene sets) of interest. Visualization of genes over significant KEGG pathways is also possible (Supplementary Data). Users should be aware that the statistical significance of the Fisher’s Exact Test is affected by the size and the connectivity of the gene set tested. Therefore, the results from this test should be regarded as a rapid and user-friendly way to discover biological processes to be further investigated and verified experimentally.
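As an illustration of the kind of test described above, the following minimal sketch computes a one-sided Fisher’s exact test (over-representation) for a single pathway from a DEG list and a gene universe. It is not the code used by Pathway Processor 2.0: the function name `fisher_enrichment_p`, the use of a one-sided alternative and the toy gene identifiers are assumptions made for this example only.

```python
from math import comb


def fisher_enrichment_p(universe, degs, pathway):
    """One-sided Fisher's exact test for over-representation.

    Hypothetical helper for illustration; returns (overlap, p_value).
    """
    universe = set(universe)
    degs = set(degs) & universe        # ignore DEGs outside the universe
    pathway = set(pathway) & universe  # ignore pathway genes outside it

    n_universe = len(universe)
    n_pathway = len(pathway)
    n_degs = len(degs)
    overlap = len(degs & pathway)

    # Upper tail of the hypergeometric distribution: probability of
    # drawing at least `overlap` pathway genes among the DEGs by chance.
    total = comb(n_universe, n_degs)
    tail = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_degs - i)
        for i in range(overlap, min(n_pathway, n_degs) + 1)
    )
    return overlap, tail / total


# Toy data (assumed): 10 genes in the universe, 4 in the pathway, 4 DEGs.
universe = [f"g{i}" for i in range(1, 11)]
pathway = ["g1", "g2", "g3", "g4"]
degs = ["g1", "g2", "g3", "g9"]
overlap, p_value = fisher_enrichment_p(universe, degs, pathway)
print(overlap, round(p_value, 4))
```

Because the p-value depends on the size of the gene set and of the universe, the same overlap can be significant for a small pathway and unremarkable for a large one, which is the caveat raised in the text.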

Table 1. Analysis methods available for Pathway Processor 2.0

Method name       Input type             Custom gene sets
Fisher’s test     DEGs + gene universe   Yes
Impact analysis   DEGs + gene universe   No
Gene ontology     DEGs + gene universe   No
GSVA              Normalized data        Yes

The impact analysis method (Draghici et al., 2007; Tarca et al.), which determines the activation or inhibition of pathways from the alteration of the genes involved and the topology of the pathways themselves, is implemented in Pathway Processor 2.0 using the improved version present in the ‘graphite’ R package (Sales et al., 2012), which provides a fully updated pathway model for KEGG, Reactome and the Pathway Interaction Database. Currently, impact analysis is limited to data from Homo sapiens. Gene Ontology analysis is implemented through the R ‘topGO’ package, using a weighted algorithm over the whole Gene Ontology tree to select the significantly affected nodes (Alexa et al., 2006). In all cases, analysis parameters can be adjusted to fine-tune the results (Supplementary Data), and all data files produced can be downloaded from the web server for further analysis. Multiple test correction procedures are used to control the rate of false positives (Supplementary Data).
The fourth analysis offered by Pathway Processor 2.0 is the application of the recently developed GSVA, coupled with linear models for differential expression analysis. Given a set of pathways and normalized gene expression data, this method transforms the data into pathway enrichment scores (a measure of the state of each pathway), generating a pathway expression matrix. This matrix is then used to compare two user-supplied phenotypes of interest using moderated t-tests as implemented in the R package ‘limma’ (Smyth, 2005). The final result is a list of Differentially Regulated Pathways (DRPs; Supplementary Data) that can be used to interpret data with a pathway-based view, providing more information for elucidating complex phenotypes. The GSVA matrix can also be downloaded, enabling its use in other downstream applications.
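The idea of turning a gene expression matrix into a pathway expression matrix and then comparing two phenotypes can be sketched as follows. This is a deliberately simplified stand-in, not GSVA and not limma: the pathway score here is the mean of gene-wise z-scores, and the comparison uses an ordinary Welch t statistic rather than a moderated t-test. All function names and the toy data are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, stdev, variance


def zscore_rows(expr):
    """Standardize each gene across samples (expr: gene -> list of values)."""
    scaled = {}
    for gene, values in expr.items():
        m, s = mean(values), stdev(values)
        # Genes with no variation carry no information; score them as 0.
        scaled[gene] = [(v - m) / s if s > 0 else 0.0 for v in values]
    return scaled


def pathway_scores(expr, gene_sets):
    """Build a pathway x sample matrix as the mean z-score of member genes."""
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in z]
        if not present:
            continue  # skip gene sets with no measured genes
        scores[name] = [mean(z[g][j] for g in present) for j in range(n_samples)]
    return scores


def welch_t(a, b):
    """Welch t statistic between two groups of pathway scores."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        raise ValueError("both groups have zero variance")
    return (mean(a) - mean(b)) / se


# Toy normalized data (assumed): 3 genes, 6 samples, two phenotypes.
expr = {
    "g1": [1.0, 2.0, 1.5, 5.0, 6.0, 5.5],
    "g2": [2.0, 1.0, 2.5, 6.0, 5.0, 6.5],
    "g3": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
}
gene_sets = {"P1": ["g1", "g2"], "P2": ["g3", "gX"]}
labels = ["A", "A", "A", "B", "B", "B"]

scores = pathway_scores(expr, gene_sets)
group_a = [s for s, lab in zip(scores["P1"], labels) if lab == "A"]
group_b = [s for s, lab in zip(scores["P1"], labels) if lab == "B"]
print("P1 t statistic:", welch_t(group_a, group_b))
```

The point of the sketch is the data flow: the statistical test operates on one row per pathway instead of one row per gene, so the output is a ranking of pathways rather than of genes.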
All the analyses can be run on the web server after uploading the required files, without the need for any local installation of additional software or analysis tools. The software’s performance is in line with existing solutions (Supplementary Data).
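The text does not state which multiple test correction procedures are offered (the details are in the Supplementary Data). As one common possibility, the sketch below applies the Benjamini-Hochberg false discovery rate adjustment to a list of pathway p-values; the choice of this procedure and the function name `benjamini_hochberg` are assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values for four pathways (assumed values).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Pathways whose adjusted value falls below a chosen threshold would then be reported as significant.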

3 CONCLUSIONS

Pathway Processor 2.0 is a useful tool to analyze ‘omics’ datasets regardless of the platform that produced them, and is usable with both microarray and next-generation sequencing data. The web-based interface provides a one-stop shop for well-tested bioinformatic algorithms, and the new methods included in this software enable interpretation of the data with a true pathway-based view, allowing for deeper insight into complex biological problems.

REFERENCES

1.  Pathway Processor: a tool for integrating whole-genome expression results into metabolic networks.

Authors:  Paul Grosu; Jeffrey P Townsend; Daniel L Hartl; Duccio Cavalieri
Journal:  Genome Res       Date:  2002-07       Impact factor: 9.043

2.  Improved scoring of functional groups from gene expression data by decorrelating GO graph structure.

Authors:  Adrian Alexa; Jörg Rahnenführer; Thomas Lengauer
Journal:  Bioinformatics       Date:  2006-04-10       Impact factor: 6.937

3.  A systems biology approach for pathway level analysis.

Authors:  Sorin Draghici; Purvesh Khatri; Adi Laurentiu Tarca; Kashyap Amin; Arina Done; Calin Voichita; Constantin Georgescu; Roberto Romero
Journal:  Genome Res       Date:  2007-09-04       Impact factor: 9.043

4.  Group testing for pathway analysis improves comparability of different microarray datasets.

Authors:  Theodora Manoli; Norbert Gretz; Hermann-Josef Gröne; Marc Kenzelmann; Roland Eils; Benedikt Brors
Journal:  Bioinformatics       Date:  2006-08-07       Impact factor: 6.937

5.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

6.  Differential IL-17 production and mannan recognition contribute to fungal pathogenicity and commensalism.

Authors:  Lisa Rizzetto; Mirela Kuka; Carlotta De Filippo; Alessandra Cambi; Mihai G Netea; Luca Beltrame; Giorgio Napolitani; Maria Gabriella Torcia; Ugo D'Oro; Duccio Cavalieri
Journal:  J Immunol       Date:  2010-03-12       Impact factor: 5.422

7.  graphite - a Bioconductor package to convert pathway topology to gene network.

Authors:  Gabriele Sales; Enrica Calura; Duccio Cavalieri; Chiara Romualdi
Journal:  BMC Bioinformatics       Date:  2012-01-31       Impact factor: 3.169

8.  Ten years of pathway analysis: current approaches and outstanding challenges.

Authors:  Purvesh Khatri; Marina Sirota; Atul J Butte
Journal:  PLoS Comput Biol       Date:  2012-02-23       Impact factor: 4.475

9.  GSVA: gene set variation analysis for microarray and RNA-seq data.

Authors:  Sonja Hänzelmann; Robert Castelo; Justin Guinney
Journal:  BMC Bioinformatics       Date:  2013-01-16       Impact factor: 3.169

10.  Using pathway signatures as means of identifying similarities among microarray experiments.

Authors:  Luca Beltrame; Lisa Rizzetto; Raffaele Paola; Philippe Rocca-Serra; Luca Gambineri; Cristina Battaglia; Duccio Cavalieri
Journal:  PLoS One       Date:  2009-01-06       Impact factor: 3.240

