
GenomeRNAi: a database for cell-based RNAi phenotypes. 2009 update.

Moritz Gilsdorf, Thomas Horn, Zeynep Arziman, Oliver Pelz, Evgeny Kiner, Michael Boutros.

Abstract

The GenomeRNAi database (http://www.genomernai.org/) contains phenotypes from published cell-based RNA interference (RNAi) screens in Drosophila and Homo sapiens. The database connects observed phenotypes with annotations of targeted genes and information about the RNAi reagent used for the perturbation experiment. The availability of phenotypes from Drosophila and human screens also allows for phenotype searches across species. Besides reporting quantitative data from genome-scale screens, the new release of GenomeRNAi also enables reporting of data from microscopy experiments and curated phenotypes from published screens. In addition, the database provides an updated resource of RNAi reagents and their predicted quality that are available for the Drosophila and the human genome. The new version also facilitates the integration with other genomic data sets and contains expression profiling (RNA-Seq) data for several cell lines commonly used in RNAi experiments.

Year:  2009        PMID: 19910367      PMCID: PMC2808900          DOI: 10.1093/nar/gkp1038

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

RNA interference (RNAi) is a post-transcriptional gene silencing mechanism conserved from plants to humans. Experimentally, it relies on the delivery of exogenous short double-stranded (ds) RNAs that trigger the degradation of homologous mRNA in cells (1,2). RNAi is now widely used as an experimental tool to silence the expression of genes in a broad spectrum of organisms (3). The availability of RNAi libraries targeting almost every transcript in an organism’s genome has enabled researchers to query genomes for a broad spectrum of loss-of-function phenotypes in vitro and in vivo. Such RNAi screens play an increasingly important role in the identification and characterization of gene function. A crucial task is the integration and comparison of different RNAi screening datasets. For the 2007 version of GenomeRNAi (4), we collected and analyzed more than 91 000 long dsRNAs from different RNAi libraries targeting Drosophila transcripts and about 6100 published phenotype records from 29 large-scale studies in Drosophila cells. Here we present an updated version of the GenomeRNAi database that has been significantly extended by adding new Drosophila RNAi screens and reagents curated from the literature. In addition, we have incorporated RNAi reagents and phenotypic screens performed in human cells and RNA-Seq data from selected cell lines. The database now contains more than 99 700 phenotypic classifications for Drosophila genes and about 97 600 for human genes, including information about the RNAi libraries used in these studies, such as sequence information, predicted specificity (5,6) and predicted efficiency (7). The availability of data from RNAi screens in human cells also allows for the evaluation of RNAi phenotypes across species. The new version of GenomeRNAi can additionally present phenotypes from image-based screens. The database and user interface were completely re-implemented, yielding significant improvements in performance and query handling.
The GenomeRNAi database can be accessed at http://www.genomernai.org/.

DATABASE CONTENT

The updated GenomeRNAi database integrates information about RNAi reagents, their annotated targets and phenotypic information based on large-scale RNAi screens in Drosophila and human tissue culture (Figure 1). It contains 118 443 RNAi reagents from seven libraries available for Drosophila [DRSC library from Boston (8), HFA (9) and BKN libraries from Heidelberg, libraries from Ambion, OpenBiosystems (10) and the MRC, in vivo library from the VDRC (11)] and 302 786 RNAi reagents from four human siRNA- or shRNA-based libraries (Ambion Silencer Select, Dharmacon/ThermoFisher siGENOME, Sigma-Aldrich TRC and Qiagen druggable/whole genome supplement). Reagents were computationally mapped onto the latest genomic sequence using BLAT (12) and Bowtie (13). Annotations for targeted genes and transcripts were derived through the mapping on genome and transcriptome databases. In addition, predicted specificities (5,6) and efficiencies (7) as well as potential regions of low complexity, such as simple nucleotide repeats and tandem repeats of the trinucleotide CAN (N indicates any base) (5,6), were calculated for each reagent. All calculations were performed using a new design/evaluation pipeline, NEXT-RNAi (T. Horn, manuscript in preparation). NEXT-RNAi also generates output files for the Generic Genome Browser (GBrowse) (14,15) that are used for the visualization of the mapped reagents in their genomic context (as GBrowse ‘tracks’).
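To make the low-complexity check more concrete, the following minimal Python sketch flags tandem repeats of the trinucleotide CAN in a reagent sequence. It is not the NEXT-RNAi implementation, which is not described here; the function name `find_can_repeats` and the `min_repeats` threshold are assumptions chosen purely for illustration.

```python
import re


def find_can_repeats(seq, min_repeats=6):
    """Return (start, end, n_repeats) for each run of CAN trinucleotides.

    `min_repeats` is a hypothetical threshold, not a value from the paper.
    Coordinates are 0-based, end-exclusive.
    """
    # CA followed by any base, repeated at least `min_repeats` times in tandem.
    pattern = re.compile(r"(?:CA[ACGT]){%d,}" % min_repeats)
    hits = []
    for match in pattern.finditer(seq.upper()):
        length = match.end() - match.start()
        hits.append((match.start(), match.end(), length // 3))
    return hits


if __name__ == "__main__":
    # Made-up sequence: 8 CAN triplets flanked by unrelated bases.
    example = "GGG" + "CAGCAACATCAC" * 2 + "TTT"
    print(find_can_repeats(example))
```

A reagent with one or more hits could then be marked as containing a low-complexity region; how such hits are weighted in the actual quality prediction is not specified in the text.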
Figure 1.

Overview of GenomeRNAi database content. Phenotypes (from literature, quantitative data and high-content data), RNAi reagents, expression data as well as gene annotations were collected in a database (upper panel). GenomeRNAi allows queries for reagents, genes and phenotypes and provides corresponding outputs (lower panel).

The database contains RNAi phenotypes from tissue culture screens in Drosophila and human cells. Phenotypes were manually curated from published supplemental material. For some Drosophila screens the data were downloaded from FlyRNAi (16). For each entry we tried to assign the RNAi reagent used, the targeted gene, scoring methods and thresholds, the final score and the observed phenotype. In addition, other data were extracted from the publications, including cell type, readout type, assay, assay length, reagent type and reagent amount. We uploaded phenotypic data from all screens published to date and also included large-scale datasets generated in our lab. The new version of GenomeRNAi now contains data from 97 genome-scale screens performed in Drosophila (including nine in vivo screens) and 48 genome-scale screens in human cells. In total, more than 197 000 phenotypic classifications are currently stored in GenomeRNAi (99 700 Drosophila, 97 600 human). A list of all currently available screens can be accessed through the ‘List all Screens’ link on the GenomeRNAi webpage.
A new feature of GenomeRNAi is the presentation of data and phenotypes from image-based, high-content screens. To date, the database hosts images for a genome-wide morphology screen performed in human HeLa cells (F. Fuchs, manuscript in preparation) and images of knock downs of all Drosophila kinases and phosphatases in Drosophila S2 cells (T. Horn, unpublished data). The Drosophila set will be expanded to the full genome in the near future.
Another new feature of the database is the cross-evaluation of RNAi phenotypes between Drosophila and Homo sapiens. Homology mappings were obtained from NCBI Homologene (17). This offers the opportunity to check whether a phenotype is conserved between Drosophila and human. As more comparable datasets become available (such as Wnt signaling pathway screens done in Drosophila and human cells), the value of interspecies comparisons is expected to increase further.
GBrowse offers a versatile tool to visualize gene models and mappings of RNAi reagents to the genome. We integrated RNA-Seq data (T. Sandmann, unpublished data) from Drosophila S2 cells and from human HEK293T cells as wiggle-plots in GBrowse. Absence of detectable expression of the targeted gene may indicate that observed phenotypes resulted from cross silencing (‘off-target’ silencing) of other genes. The new version of GenomeRNAi also contains GBrowse ‘tracks’ for predicted specificities and efficiencies of the complete Drosophila and human genomes.
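The cross-species comparison and the expression check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the database's actual logic: the function names, the dictionary layout, the phenotype terms and the `min_level` expression threshold are all assumptions.

```python
def conserved_phenotypes(pheno_a, pheno_b, homologs):
    """Find phenotype terms shared between homologous genes of two species.

    pheno_a, pheno_b: dict mapping gene symbol -> set of phenotype terms.
    homologs: dict mapping a gene of species A -> list of genes of species B
              (for example, derived from a homology table).
    """
    shared = {}
    for gene_a, genes_b in homologs.items():
        terms_a = pheno_a.get(gene_a, set())
        for gene_b in genes_b:
            common = terms_a & pheno_b.get(gene_b, set())
            if common:
                shared[(gene_a, gene_b)] = common
    return shared


def flag_unexpressed(genes, expression, min_level=1.0):
    """Return genes whose measured expression is below `min_level`.

    Genes absent from `expression` are skipped, since no statement can be
    made about them. `min_level` is a hypothetical cutoff.
    """
    return [g for g in genes if g in expression and expression[g] < min_level]


if __name__ == "__main__":
    # Illustrative records only; the phenotype terms are placeholders.
    human = {"COPB2": {"decreased viability"}}
    fly = {"beta'Cop": {"decreased viability", "cell cycle defect"}}
    homology = {"COPB2": ["beta'Cop"]}
    print(conserved_phenotypes(human, fly, homology))
    print(flag_unexpressed(["geneX", "geneY"], {"geneX": 0.0, "geneY": 12.5}))
```

A gene returned by `flag_unexpressed` that nevertheless shows a phenotype in the same cell line would be a candidate for closer inspection of possible off-target effects.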

DATA QUERY

The database can be queried by providing gene identifiers (NCBI, Entrez, Ensembl, FlyBase), RNAi reagent identifiers or phenotypes (Figure 2a). A list of all screens can be displayed via a direct link on the entry page, which also gives access to all phenotypes reported for a particular screen. Genes, RNAi reagents and phenotypes are linked to each other, so each kind of query gives access to the other information available. Example queries are also provided via links on the entry page. Advice on how to query the database and further help are available via the ‘Help’ link.
Figure 2.

Example of a database search for human COPB2. (a) The entry page allows for gene, reagent and phenotype queries or to ‘List all Screens’. Here COPB2 was queried. (b) The ‘Gene Info’ tab provides detailed information about the queried gene and linkouts to other sources. One Drosophila homolog, beta’Cop, was found (f). (c) List of all RNAi reagents available that target COPB2. The ‘Library’ link leads to more information about the RNAi library; the ‘Reagent Id’ link provides more information about the RNAi reagent. (d) Detailed information for the Qiagen siRNA pool SP00002881 containing four siRNA sequences. Sequence information, information about ‘On-’ and ‘Off-target’ hits defining the predicted specificity as well as the predicted efficiency are presented. All transcripts of the targeted gene covered by the siRNAs are listed, with the number of siRNA hits in braces. (e) All reported phenotypes are listed in the ‘Phenotype’ tab. Data from three quantitative viability assays are available for COPB2. ‘Score’ (z-score) and ‘Activity’ (activity normalized to negative controls) columns provide a measure for the phenotype strength and reproducibility (given by the standard deviation). In addition, images (raw, segmented and phenotypically classified) from one high-content screen performed in HeLa cells are available. Clicking on the thumbnail enlarges the images. COPB2 was also found to cause phenotypes in four published screens. The column ‘Experiment’ assigns a short name to the screen. The link can be followed to obtain detailed information about the experiment. The other columns show the reagent used, the ‘Score type’ and ‘Score Cutoff’ used for the analysis, the actual ‘Score’ and ‘Phenotype’. The column ‘Validated’ states whether a phenotype was retested (e.g. by a second RNAi reagent or by secondary assays). (f) Phenotypes for the Drosophila homolog beta’Cop. Data from two quantitative viability assays are presented. In addition, data from 12 published screens are available.


DATA OUTPUT

For gene queries, GenomeRNAi first provides detailed annotation information about the gene (from NCBI) including homology information (NCBI Homologene) in the ‘Gene Info’ tab (Figure 2b). To obtain more information about the gene, linkouts to other data sources were implemented (including Entrez, NCBI RefSeq, HPRD, FlyBase). GenomeRNAi also summarizes all RNAi reagents available targeting the queried gene (tab ‘Reagents’, Figure 2c). The reagent links lead to more detailed information about the reagents (Figure 2d). The tab ‘Phenotypes’ (Figure 2e,f) lists three types of phenotypes: quantitative phenotypes, imaging phenotypes and literature-curated phenotypes. The ‘GBrowse’ tab (Supplementary Figure S1) shows the visualization of the gene model, the mapping of available RNAi reagents, plots for predicted specificities and efficiencies as well as available RNA-Seq data.
An example of a database session is shown in Figure 2. Here, we searched for RNAi reagents and annotated phenotypes available for the human gene COPB2, a factor required for vesicle trafficking (18). Figure 2b shows the results screen with detailed gene information, including a link to the Drosophila homolog (beta’Cop, CG6699). The ‘Reagent’ tab (Figure 2c) lists RNAi reagents from three different siRNA libraries available to target this gene. Following the Qiagen ‘siRNA_Pool’ (SP00002881) link provides detailed information such as sequences, predicted specificity (‘On-target’ versus ‘Off-target’), predicted efficiency and targeted transcripts [‘Transcripts (Hits)’] (Figure 2d). The tab ‘Phenotypes’ (Figure 2e) shows viability data from three different cell lines (HeLa, HEK293T and HepG2). The ‘screen’ links provide more detailed information about the experiments. Knock down of COPB2 results in a viability defect in all three screens. The phenotype is quite severe in HeLa and HEK293T cells, where the ‘Activity’ (here viability) is below 10%. In HepG2 cells the phenotype is weaker, with a remaining viability of about 70%. The images from the cell morphology screen in HeLa cells (‘Imaging Phenotypes’) support the phenotype, as most cells died or show morphological defects. Additional support is provided by published experiments, e.g. COPB2 was found to cause cell death in a ‘Genome stability’ assay. Going back to the ‘Gene Info’ tab and following the homology link to beta’Cop, similar phenotypes were found in Drosophila screens (Figure 2f). Both screens (in S2 cells from DGRC and S2-His2B-GFP cells) show a decreased viability after knock down of beta’Cop with a remaining viability of about 70% in both cell lines. In addition, beta’Cop knock-down was found to cause phenotypes in 12 other screens, such as screens for viability and cell cycle as well as several ‘infection’ screens. The same database output can be obtained by direct searches for reagents or phenotypes.
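The ‘Score’ (z-score) and ‘Activity’ (normalized to negative controls) values discussed above can be illustrated with a short Python sketch. The exact formulas used by GenomeRNAi are not given in the text, so this assumes the textbook definitions: a z-score computed against the mean and sample standard deviation of all measurements, and activity expressed as a percentage of the mean negative-control signal. The function names and input values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standard z-score of each value relative to the whole set (assumed definition)."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate a standard deviation")
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation
    if sd == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return [(v - mu) / sd for v in values]


def activity_percent(value, negative_controls):
    """Signal as a percentage of the mean negative-control signal (assumed definition)."""
    baseline = mean(negative_controls)
    if baseline == 0:
        raise ValueError("negative-control mean is zero")
    return 100.0 * value / baseline


if __name__ == "__main__":
    # Made-up raw viability readouts, not data from the database.
    controls = [90.0, 110.0]
    print(activity_percent(70.0, controls))  # a mild viability defect
    print(activity_percent(7.0, controls))   # a severe viability defect
    print(z_scores([1.0, 2.0, 3.0]))
```

Under these assumptions, an activity below 10% corresponds to a well whose signal is less than a tenth of the negative-control average, which matches the way the HeLa and HEK293T results are described.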

CONCLUSION AND OUTLOOK

The GenomeRNAi database hosts RNAi phenotype information from large-scale studies in Drosophila and human cells connected with the underlying perturbation reagents and annotated target genes. Since genome annotations are in flux and RNAi reagents could exert unspecific effects, it is important to provide a regularly updated match of RNAi reagents to intended target genes. GenomeRNAi facilitates the evaluation of phenotypes at multiple levels. It provides the latest reagent annotations and internally calculates quality information, such as predicted specificity and efficiency. The information whether an observed phenotype was ‘validated’ (by retests with independent RNAi designs or other secondary assays) also contributes to the phenotype assessment. Furthermore, the visualization of RNAi reagents in GBrowse uncovers design limitations such as incomplete coverage of annotated splice variants or design biases (e.g. towards UTR regions) and provides the user with information about the expression of the targeted genes in several cell lines. The large number of screens and phenotypes hosted by the database reveals possible pleiotropic phenotypes of a candidate. Finally, the availability of data from two organisms enables cross-species comparisons of phenotypes, which will be extended to other organisms when screening data become available. The focus of GenomeRNAi is to present phenotypes in the context of genomic information. The database is not limited to a single organism and reports quantitative, literature-curated and imaging data. Together with the underlying pipeline for mapping of RNAi reagents from different libraries and phenotypes, it is unique compared to other RNAi databases (16,19). The integration and validation of data from primary publications is still a major issue due to the lack of standards on the minimal information that needs to be provided from large-scale screening approaches.
There is also no general ontology to uniquely describe cellular phenotypes, and literature reports vary in how phenotypes are described and what level of detail is provided. With GenomeRNAi we try to provide common terms for the ‘type’ of screen and the applied ‘assay’, which helps to identify and compare similar screens. To facilitate the cross-correlation of numerical phenotypic data and to document the screen-analysis-route, we implemented the upload of data analyzed using the R/Bioconductor package cellHTS2 (20) for internal screens and plan to make this upload function available to users in the future. A screening dataset could then be analyzed and documented online using the implementation of cellHTS2 as a web tool (web-cellHTS; http://web-cellhts2.dkfz.de) before uploading the analyzed data to GenomeRNAi. The availability of in vivo RNAi libraries for Drosophila now enables genome-scale studies in the whole fly. Although the database already contains Drosophila in vivo phenotypes, more datasets will be added when available. In addition, the integration of RNAi phenotypes with other genomic data sets, such as RNA-Seq data, will be implemented in the future.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Fellowship of the German National Academic Foundation (to T.H.); fellowship by the MISTI’s MIT Germany Program (to E.K.); grants from the Helmholtz Alliance for Systems Biology and the Emmy Noether Program of the German Research Council (partial). Funding for open access charge: German Cancer Research Center.

Conflict of interest statement. None declared.

REFERENCES (10 of 20 shown)

1.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

2.  Genome-wide RNAi analysis of growth and viability in Drosophila cells.

Authors:  Michael Boutros; Amy A Kiger; Susan Armknecht; Kim Kerr; Marc Hild; Britta Koch; Stefan A Haas; Renato Paro; Norbert Perrimon
Journal:  Science       Date:  2004-02-06       Impact factor: 47.728

3.  The generic genome browser: a building block for a model organism system database.

Authors:  Lincoln D Stein; Christopher Mungall; ShengQiang Shu; Michael Caudy; Marco Mangone; Allen Day; Elizabeth Nickerson; Jason E Stajich; Todd W Harris; Adrian Arva; Suzanna Lewis
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

4.  The Bioperl toolkit: Perl modules for the life sciences.

Authors:  Jason E Stajich; David Block; Kris Boulez; Steven E Brenner; Stephen A Chervitz; Chris Dagdigian; Georg Fuellen; James G R Gilbert; Ian Korf; Hilmar Lapp; Heikki Lehväslaiho; Chad Matsalla; Chris J Mungall; Brian I Osborne; Matthew R Pocock; Peter Schattner; Martin Senger; Lincoln D Stein; Elia Stupka; Mark D Wilkinson; Ewan Birney
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

5.  Differential requirements for COPI transport during vertebrate early development.

Authors:  Pedro Coutinho; Michael J Parsons; Kevin A Thomas; Elizabeth M A Hirst; Leonor Saúde; Isabel Campos; P Huw Williams; Derek L Stemple
Journal:  Dev Cell       Date:  2004-10       Impact factor: 12.270

6.  Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans.

Authors:  A Fire; S Xu; M K Montgomery; S A Kostas; S E Driver; C C Mello
Journal:  Nature       Date:  1998-02-19       Impact factor: 49.962

7.  Evidence of off-target effects associated with long dsRNAs in Drosophila melanogaster cell-based assays.

Authors:  Meghana M Kulkarni; Matthew Booker; Serena J Silver; Adam Friedman; Pengyu Hong; Norbert Perrimon; Bernard Mathey-Prevot
Journal:  Nat Methods       Date:  2006-10       Impact factor: 28.547

8.  FLIGHT: database and tools for the integration and cross-correlation of large-scale RNAi phenotypic datasets.

Authors:  David Sims; Borisas Bursteinas; Qiong Gao; Marketa Zvelebil; Buzz Baum
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

9.  FlyRNAi: the Drosophila RNAi screening center database.

Authors:  Ian Flockhart; Matthew Booker; Amy Kiger; Michael Boutros; Susan Armknecht; Nadire Ramadan; Kris Richardson; Andrew Xu; Norbert Perrimon; Bernard Mathey-Prevot
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

10.  Functional dissection of an innate immune response by a genome-wide RNAi screen.

Authors:  Edan Foley; Patrick H O'Farrell
Journal:  PLoS Biol       Date:  2004-06-22       Impact factor: 8.029
