HTSanalyzeR: an R/Bioconductor package for integrated network analysis of high-throughput screens.

Xin Wang, Camille Terfve, John C Rose, Florian Markowetz.

Abstract

MOTIVATION: High-throughput screens (HTS) by RNAi or small molecules are among the most promising tools in functional genomics. They enable researchers to observe detailed reactions to experimental perturbations on a genome-wide scale. While there is a core set of computational approaches used in many publications to analyze these data, specialized software that combines them and makes them easily accessible has so far been missing.
RESULTS: Here we describe HTSanalyzeR, flexible software for building integrated analysis pipelines for HTS data. It provides over-representation analysis, gene set enrichment analysis, comparative gene set analysis and rich sub-network identification. HTSanalyzeR interfaces with commonly used pre-processing packages for HTS data and presents its results as HTML pages and network plots.
AVAILABILITY: Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.

Year:  2011        PMID: 21258062      PMCID: PMC3051329          DOI: 10.1093/bioinformatics/btr028

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

In recent years several technological advances have pushed gene perturbation screens to the forefront of functional genomics. Combining high-throughput screening (HTS) techniques with rich phenotypes enables researchers to observe detailed reactions to experimental perturbations on a genome-wide scale. This makes HTS one of the most promising tools in functional genomics. Although the phenotypes in HTS data mostly correspond to single genes, it is becoming increasingly important to analyze them in the context of cellular pathways and networks to understand how genes work together. Network analysis of HTS data depends on the dimensionality of the phenotypic readout (Markowetz, 2010). While specialized analysis approaches exist for high-dimensional phenotyping (e.g. Fröhlich et al., 2008), analysis approaches for low-dimensional screens have so far been spread out over diverse software packages and online tools like DAVID (Huang et al.) or gene set enrichment analysis (GSEA; Subramanian et al., 2005).

Here we provide software to build integrated analysis pipelines for HTS data that contain gene set and network analysis approaches commonly used in many papers (as reviewed by Markowetz, 2010). HTSanalyzeR is implemented with S4 classes in R (R Development Core Team, 2009) and is freely available via the Bioconductor project (Gentleman et al., 2004). The example pipeline provided by HTSanalyzeR interfaces directly with existing HTS pre-processing packages like cellHTS2 (Boutros et al., 2006) or RNAither (Rieber et al.). Additionally, our software will be fully integrated in a web interface for the analysis of HTS data (Pelz et al., 2010) and thus be easily accessible to non-programmers.

2 AN INTEGRATED ANALYSIS PIPELINE FOR HIGH-THROUGHPUT SCREENING DATA

HTSanalyzeR takes as input HTS data that has already undergone pre-processing and quality control (e.g. by using cellHTS2). It then functionally annotates the hits by gene set enrichment and network analysis approaches (see Fig. 1 for an overview).

Fig. 1. HTSanalyzeR takes as input HTS data that has already been pre-processed, normalized and quality checked, e.g. by cellHTS2. HTSanalyzeR then combines the HTS data with gene sets and networks from freely available sources and performs four types of analysis: (i) hypergeometric tests for overlap between hits and gene sets; (ii) gene set enrichment analysis (GSEA) for concordant trends of a gene set in one phenotype; (iii) differential GSEA to identify gene sets with opposite trends in two phenotypes; and (iv) identification of subnetworks enriched for hits. The results are provided to the user as figures and HTML tables linked to external databases for annotation.

Gene set analysis: HTSanalyzeR implements two approaches: (i) hypergeometric tests for surprising overlap between hits and gene sets, and (ii) gene set enrichment analysis to measure whether a gene set shows a concordant trend toward stronger phenotypes. HTSanalyzeR uses gene sets from MSigDB (Subramanian et al., 2005), Gene Ontology (Ashburner et al., 2000), KEGG (Kanehisa et al.) and others. The accompanying vignette explains how user-defined gene sets can easily be included. Results are visualized as an enrichment map (Merico et al., 2010).

Network analysis: In a complementary approach, strong hits are mapped to a network and enriched subnetworks are identified. Networks can come from different sources; protein interaction networks in particular are often used. In HTSanalyzeR we use networks defined in the BioGRID database (Stark et al., 2006), but other user-defined networks can easily be included in the analysis. To identify rich subnetworks, we use the BioNet package (Beisser et al., 2010), which in its heuristic version is fast and produces close-to-optimal results.

Comparing phenotypes: A goal we expect to become more and more important in the future is to compare phenotypes for the same genes in different cellular conditions. HTSanalyzeR supports comparative analyses for gene sets and networks. Differentially enriched gene sets are computed by comparing GSEA enrichment scores or alternatively by a Wilcoxon test statistic. Subnetworks rich for more than one phenotype can be found with BioNet (Beisser et al., 2010).
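
The two gene set tests described in this section can be illustrated with a minimal, language-independent sketch, written here in Python using only the standard library. It is not the package's R implementation: the function and parameter names are hypothetical, the enrichment score follows the weighted running sum described by Subramanian et al. (2005), and the permutation step normally used to assess significance is omitted.

```python
from math import comb


def hypergeom_pvalue(universe_size, set_size, n_hits, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    X is the number of gene-set members among n_hits genes drawn without
    replacement from a universe that contains set_size gene-set members.
    """
    max_overlap = min(set_size, n_hits)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_hits - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / comb(universe_size, n_hits)


def enrichment_score(ranked_genes, scores, gene_set, exponent=1.0):
    """Weighted running-sum enrichment score for one gene set.

    ranked_genes and scores are parallel lists, already sorted by score.
    The running sum rises at genes in the set (in proportion to
    |score| ** exponent) and falls by a constant step at all other genes.
    The returned value is the largest deviation from zero, with its sign.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_total = sum(abs(s) ** exponent for s, hit in zip(scores, in_set) if hit)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    extreme = 0.0
    for score, hit in zip(scores, in_set):
        if hit:
            running += abs(score) ** exponent / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


if __name__ == "__main__":
    # Toy universe of 10 genes, a gene set of 4, and 3 hits of which 2 are in the set.
    print(hypergeom_pvalue(universe_size=10, set_size=4, n_hits=3, overlap=2))
    # Toy ranking in which both set members sit at the top of the list.
    print(enrichment_score(["a", "b", "c", "d"], [4, 3, 2, 1], {"a", "b"}))
```

A positive score indicates that the set members concentrate at the top of the ranking, and a negative score that they concentrate at the bottom. Differential analysis between two phenotypes, as described above, would compare such scores across two rankings of the same genes.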

3 CORE CLASSES AND METHODS

The two core S4 classes in HTSanalyzeR are ‘GSCA’ (Gene Set Collection Analysis) and ‘NWA’ (NetWork Analysis). S4 methods for both classes cover the following functions:

Preprocessing: The S4 methods ‘preprocess’ reformat the input data, e.g. by removing duplicated genes and converting annotations to Entrez identifiers. This step makes the objects of class ‘GSCA’ and ‘NWA’ ready for the following analyses.

Analyses: The S4 methods ‘analyze’ are provided for gene set and network analyses. Each method depends on several input parameters which can be defined by the user. HTSanalyzeR also implements a standard analysis option using default parameters that we have found to work well in many applications.

Visualization: GSEA random walks, enrichment maps and rich subnetworks can be viewed by the S4 methods ‘viewGSEA’, ‘viewEnrichMap’ and ‘viewSubNet’, respectively.

Reporting: The analysis results of class ‘GSCA’ and ‘NWA’ can be reported separately or together to HTML files using the S4 methods ‘report’ and ‘reportAll’, respectively. The output format was inspired by cellHTS2 and contains network figures as well as tables linked to external databases.
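
The duplicate-removal part of the preprocessing step can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the package's ‘preprocess’ method: the function name, the choice of collapsing strategies (maximum, minimum or average of repeated measurements) and the descending sort are all hypothetical, since the text above only states that duplicated genes are removed.

```python
from collections import defaultdict
from statistics import mean


def collapse_duplicates(measurements, method="max"):
    """Collapse repeated gene identifiers to a single phenotype value.

    measurements: iterable of (gene_id, value) pairs, possibly with
    the same gene_id appearing more than once.
    method: "max", "min" or "average" (assumed strategies, see lead-in).
    Returns a dict ordered from the largest to the smallest value.
    """
    reducers = {"max": max, "min": min, "average": mean}
    if method not in reducers:
        raise ValueError(f"unknown method: {method!r}")

    # Group all values measured for the same gene.
    grouped = defaultdict(list)
    for gene_id, value in measurements:
        grouped[gene_id].append(value)

    # Reduce each group to one value, then order genes by that value.
    collapsed = {gene: reducers[method](values) for gene, values in grouped.items()}
    return dict(sorted(collapsed.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    raw = [("g1", 1.0), ("g2", -2.0), ("g1", 3.0)]
    print(collapse_duplicates(raw, method="max"))
```

The result is a duplicate-free, ordered gene list of the kind that ranking-based gene set tests expect as input.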
REFERENCES

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Analyzing gene perturbation screens with nested effects models in R and bioconductor.

Authors:  Holger Fröhlich; Tim Beissbarth; Achim Tresch; Dennis Kostka; Juby Jacob; Rainer Spang; F Markowetz
Journal:  Bioinformatics       Date:  2008-08-21       Impact factor: 6.937

3.  BioNet: an R-Package for the functional analysis of biological networks.

Authors:  Daniela Beisser; Gunnar W Klau; Thomas Dandekar; Tobias Müller; Marcus T Dittrich
Journal:  Bioinformatics       Date:  2010-02-25       Impact factor: 6.937

4.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

5.  How to understand the cell by breaking it: network analysis of gene perturbation screens.

Authors:  Florian Markowetz
Journal:  PLoS Comput Biol       Date:  2010-02-26       Impact factor: 4.475

6.  Enrichment map: a network-based method for gene-set enrichment visualization and interpretation.

Authors:  Daniele Merico; Ruth Isserlin; Oliver Stueker; Andrew Emili; Gary D Bader
Journal:  PLoS One       Date:  2010-11-15       Impact factor: 3.240

7.  Bioconductor: open software development for computational biology and bioinformatics.

Authors:  Robert C Gentleman; Vincent J Carey; Douglas M Bates; Ben Bolstad; Marcel Dettling; Sandrine Dudoit; Byron Ellis; Laurent Gautier; Yongchao Ge; Jeff Gentry; Kurt Hornik; Torsten Hothorn; Wolfgang Huber; Stefano Iacus; Rafael Irizarry; Friedrich Leisch; Cheng Li; Martin Maechler; Anthony J Rossini; Gunther Sawitzki; Colin Smith; Gordon Smyth; Luke Tierney; Jean Y H Yang; Jianhua Zhang
Journal:  Genome Biol       Date:  2004-09-15       Impact factor: 13.583

8.  web cellHTS2: a web-application for the analysis of high-throughput screening data.

Authors:  Oliver Pelz; Moritz Gilsdorf; Michael Boutros
Journal:  BMC Bioinformatics       Date:  2010-04-12       Impact factor: 3.169

9.  BioGRID: a general repository for interaction datasets.

Authors:  Chris Stark; Bobby-Joe Breitkreutz; Teresa Reguly; Lorrie Boucher; Ashton Breitkreutz; Mike Tyers
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

10.  Analysis of cell-based RNAi screens.

Authors:  Michael Boutros; Lígia P Brás; Wolfgang Huber
Journal:  Genome Biol       Date:  2006       Impact factor: 13.583
