siRNAdb: a database of siRNA sequences.

Alistair M Chalk, Richard E Warfinge, Patrick Georgii-Hemming, Erik L L Sonnhammer.

Abstract

Short interfering RNAs (siRNAs) are a popular method for gene-knockdown, acting by degrading the target mRNA. Before performing experiments it is invaluable to locate and evaluate previous knockdown experiments for the gene of interest. The siRNA database provides a gene-centric view of siRNA experimental data, including siRNAs of known efficacy and siRNAs predicted to be of high efficacy by a combination of methods. Linked to these sequences is information such as siRNA thermodynamic properties and the potential for sequence-specific off-target effects. The database enables the user to evaluate an siRNA's potential for inhibition and non-specific effects. The database is available at http://siRNA.cgb.ki.se.

Year:  2005        PMID: 15608162      PMCID: PMC540090          DOI: 10.1093/nar/gki136

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Short interfering RNAs (siRNAs) enable the inhibition of single genes at the nucleotide level. They are duplexes of two RNA molecules, typically 21mers with a 2 nt 3′ overhang (1). A particular strength of this method of knockdown is that an siRNA can be designed to inhibit the expression of any mRNA, and thus the protein it encodes. The knockdown approach, unlike a knockout, allows detailed study of the effects of reducing a gene's expression to none for a period of time, and then allowing its expression to return to normal. This effect can be demonstrated without affecting related proteins, making it an invaluable tool for functional genomics. siRNAs have been found to be effective in Arabidopsis thaliana, Drosophila melanogaster, Caenorhabditis elegans and mammals (2). For an in-depth review of the subject see e.g. (2,3).
With the increased utilization of siRNAs it is essential to keep track of siRNAs that have been published. Experimentalists want an easy way to find siRNAs that have already been verified for their target gene. If such siRNAs exist, the researcher will be interested in the conditions under which the siRNA was tested, and in the reference to the article where the siRNA was published, for further investigation. If no such siRNAs exist, the researcher will probably want to choose an siRNA designed using one of a number of recently published methods (4–8). In both cases it is important to identify potential sequence-specific off-target effects of the siRNA.
An additional user group for an siRNA database consists of bioinformaticians. In this case the primary interest is the underlying data, which can be downloaded for subsequent analysis and the building of predictive models.

THE DATABASE

Contents

The database contains information about siRNA molecules from two sources: (i) siRNAs collected from the literature that have experimentally verified efficacy and (ii) siRNAs selected computationally to target the REFSEQ (9) curated human gene set (20 410 ‘NM’ sequences and 6767 ‘XM’ sequences). The database holds experimental information gathered from the literature for siRNAs in set (i). This includes efficacy, cell type, efficacy assay and information about the target gene. When exact figures for efficacy are unavailable we approximate the value; efficacy values are marked with a type (validated, predicted, approximated or generalized) to indicate the method used to determine them. Detailed descriptions of these types are available online.
For genes with no experimentally verified siRNA in the database, we provide a set of predictions based on a combination of prediction methods. siRNAs were selected only if the siSearch (6) score exceeded 5, the Reynolds (10) score exceeded 5, and the Ui-Tei (11) score was Ia or Ib. This set of predicted siRNAs was then subjected to a BLAST specificity search against the REFSEQ database, and siRNAs were retained only if they had no matches (16 or more consecutive bases) to other genes. A link to PubMed with a pre-formulated siRNA query is also provided so that the user can easily check for new siRNA articles relating to the gene of interest.
In release 1.0 there are 500 experimentally verified siRNAs targeting 115 genes. These data were gathered from 55 articles. Since siRNAs are continuously being added to the database, we recommend checking the server for the latest release information. The distribution of siRNAs per gene is not flat; some genes have a large number of data points, while many have only a few. The distribution of the efficacies of the siRNAs is shown in Figure 1. Of the 500 siRNAs, 12.8% give knockdown efficacy >90% while 55.8% give efficacy >50%.
The experiments use either nucleotide or protein expression levels to measure efficacy; 297 use nucleotide and 198 use protein levels. A number of different transfection reagents were used, with Lipofectamine 2000 being the most common (70% of cases). A total of 109 001 siRNAs (from 21 075 genes) matched the prediction criteria specified above. Of those, 42 155 siRNAs from 12 888 genes were also found to be specific using the BLAST criteria; 14 189 genes have no siRNAs matching these criteria, owing to the strict requirements for the automated predictions. If siRNAdb lacks predictions for a gene, we recommend that the user search manually for siRNA sequences using one, or a combination, of the siRNA prediction servers.
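The selection rule for predicted siRNAs (siSearch score above 5, Reynolds score above 5, Ui-Tei class Ia or Ib, and no BLAST match of 16 or more consecutive bases to another gene) can be expressed as a simple filter. The following is a minimal sketch, assuming each candidate already carries its three scores and the outcome of the specificity search; the `Candidate` class and its field names are hypothetical and are not taken from siRNAdb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one computationally generated siRNA candidate."""
    sequence: str            # target-site sequence
    sisearch_score: float    # score from siSearch
    reynolds_score: float    # score from the Reynolds rules
    uitei_class: str         # Ui-Tei class label, e.g. "Ia", "Ib", "II", "III"
    has_offtarget_hit: bool  # True if a match of >= 16 consecutive bases to another gene was found


def passes_prediction_criteria(c: Candidate) -> bool:
    """All three efficacy predictors must agree ("exceeded 5" is read as strictly greater)."""
    return (
        c.sisearch_score > 5
        and c.reynolds_score > 5
        and c.uitei_class in {"Ia", "Ib"}
    )


def select_predicted(candidates):
    """Keep candidates that pass the efficacy criteria and have no off-target hit."""
    return [
        c for c in candidates
        if passes_prediction_criteria(c) and not c.has_offtarget_hit
    ]
```

Requiring agreement between all three predictors, followed by the specificity check, is what makes the automated set strict, which is consistent with many genes ending up without any predicted siRNA.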
Figure 1. Efficacy level distribution in siRNAdb.

The database interface

The database interface was designed with the experimental user group as the primary target audience. The user interface is gene-centric, allowing the user to search by nucleotide accession number, free text or sequence, or to browse the list of genes with verified siRNAs. The layout of the database is straightforward and requires only a brief description. For each gene a summary of the siRNAs is shown with links to more detailed information (see Figure 2). The following list illustrates the organization of the data as viewed within siRNAdb; each item has a corresponding online help page with more detailed documentation than is required here.
Figure 2. View of Human AKT1 siRNA data. (a) Gene view and (b) energy profiles for a single siRNA. The first curve shows di-nucleotide binding energy values, as calculated using the method of Mathews et al. (14). The straight horizontal lines represent the binding energy at each end as calculated by Schwarz et al. (15). The second set of curves represents free energy profiles calculated using the method of Khvorova et al. (16). The black curve is that of the current siRNA. The additional green and red curves are the averaged reference values for best and worst siRNAs, respectively. All curves are calculated from antisense 5′ → 3′, which is right → left in this display.

Searching the database
A gene record
Individual siRNA records
Data specific to verified siRNA
Data specific to predicted siRNA
Comparison of search methods (AOsearch, BLAST)
Definitions
Submit siRNA data to our database
Downloading the database

siRNA calculations

A multitude of factors have been identified as being important for governing siRNA efficacy. We calculate and display these factors. Summary statistics are self-evident, and energy profiles are described in (6). The energy data displayed includes the ‘start’ and ‘end’ energies, representing the strength of binding at each end of the siRNA.
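To make the idea of an energy profile with ‘start’ and ‘end’ energies concrete, the following is a minimal sketch rather than the calculation used by siRNAdb. It assumes a caller-supplied table of stacking energies per dinucleotide step and a hypothetical window size for each end. The `toy_stack_table` values are placeholders chosen only so that G/C-rich steps bind more strongly; they are not published thermodynamic parameters, and real use would substitute a published nearest-neighbour table.

```python
def dinucleotide_profile(seq, stack_energy):
    """Return one energy value per adjacent dinucleotide step along seq.

    seq is read in the orientation chosen by the caller; DNA-style 'T' is
    treated as 'U'. stack_energy maps a two-letter step to an energy value.
    """
    rna = seq.upper().replace("T", "U")
    return [stack_energy[rna[i:i + 2]] for i in range(len(rna) - 1)]


def end_energies(profile, window=4):
    """Sum the first and the last `window` steps of a profile.

    The window size is an assumption made for illustration.
    """
    return sum(profile[:window]), sum(profile[-window:])


def toy_stack_table():
    """Placeholder energies: -1.0 per step, minus 1.0 for every G or C in it.

    These numbers are NOT real thermodynamic parameters.
    """
    bases = "ACGU"
    return {
        a + b: -1.0 - sum(x in "GC" for x in (a, b))
        for a in bases
        for b in bases
    }


if __name__ == "__main__":
    profile = dinucleotide_profile("GGCAUUAA", toy_stack_table())
    start, end = end_energies(profile, window=3)
    print(profile, start, end)
```

In this toy example the G/C-rich end yields the more negative sum, that is, the more strongly bound end, which is the kind of asymmetry between the two ends that the ‘start’ and ‘end’ energies are meant to expose.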

Implementation

The database is implemented in MySQL version 4. The central table, siRNA, contains information about each siRNA, such as sense and antisense sequences, overhangs and target sequence. The second most important table is the Experiment table, which lists all experiments performed on the siRNAs together with references to PubMed. Efficacies for an experiment are stored either as ‘validated’ or ‘predicted’ to distinguish these types. Each siRNA can have multiple experiments attached to it, as several experiments can be performed using the same siRNA sequence. Queries against the underlying SQL database are implemented as Java servlets running on an Apache Tomcat server.
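The one-to-many relationship between the siRNA and Experiment tables can be illustrated with a small schema. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module in place of MySQL, and every column name is hypothetical, since the actual siRNAdb schema is not given here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with an illustrative two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE siRNA (
            id              INTEGER PRIMARY KEY,
            sense           TEXT NOT NULL,
            antisense       TEXT NOT NULL,
            overhang        TEXT,
            target_sequence TEXT
        );
        CREATE TABLE Experiment (
            id            INTEGER PRIMARY KEY,
            sirna_id      INTEGER NOT NULL REFERENCES siRNA(id),
            efficacy      REAL,
            efficacy_type TEXT CHECK (efficacy_type IN ('validated', 'predicted')),
            cell_type     TEXT,
            pubmed_id     INTEGER
        );
        """
    )
    return conn


def experiments_for(conn, sirna_id):
    """Return all experiments attached to one siRNA, oldest row first."""
    return conn.execute(
        "SELECT efficacy, efficacy_type, cell_type, pubmed_id "
        "FROM Experiment WHERE sirna_id = ? ORDER BY id",
        (sirna_id,),
    ).fetchall()
```

Storing experiments in their own table, keyed by the siRNA they used, lets the same sequence accumulate results from several publications without duplicating the sequence data.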

Sequence-specific off-target effects

Non-specific off-target effects caused by siRNAs matching genes other than their intended target gene render experimental results hard to interpret or useless. It is essential that siRNAs are designed to take this problem into account. We use two methods for calculating potential sequence-specific off-target effects. For experimentally verified siRNAs we search using AOsearch (http://aosearch.cgb.ki.se) to look for hits with 0–2 mismatches, combined with BLAST. AOsearch uses inexact pattern matching with AGREP (12), which is more sensitive than BLAST for short sequence searches with mismatches. For predicted siRNAs, searching with AOsearch is too computationally expensive, hence we used BLAST to identify matches with 16 bp in common. We hope, however, to have AOsearch results incorporated into the server soon. For a comparison of exact matching methods versus BLAST, see (13).
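The two criteria mentioned above, hits with 0–2 mismatches and matches of 16 consecutive bases, can be illustrated with a brute-force scan. The sketch below is neither AOsearch nor BLAST: it counts substitutions only (no insertions or deletions), ignores strand orientation, and is far too slow for a transcriptome-wide search. All function names are hypothetical.

```python
def mismatch_hits(query, transcript, max_mismatches=2):
    """Return (position, mismatches) for each window of transcript that
    differs from query by at most max_mismatches substitutions."""
    n = len(query)
    hits = []
    for i in range(len(transcript) - n + 1):
        window = transcript[i:i + n]
        mismatches = sum(1 for a, b in zip(query, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


def longest_shared_run(query, transcript):
    """Length of the longest stretch of consecutive bases common to both
    sequences (dynamic programming over all suffix pairs)."""
    best = 0
    prev = [0] * (len(transcript) + 1)
    for a in query:
        cur = [0] * (len(transcript) + 1)
        for j, b in enumerate(transcript, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_specific(query, other_transcripts, min_run=16):
    """True if no other transcript shares min_run or more consecutive bases."""
    return all(longest_shared_run(query, t) < min_run for t in other_transcripts)
```

A mismatch-tolerant scan of this kind finds near-identical sites that a seed-based search may miss for short queries, which is the sensitivity advantage the text attributes to inexact pattern matching; the cost is run time, which is why the cheaper consecutive-match criterion is attractive for large predicted sets.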

Quality control

We only collect data from other sources and we do not attempt to evaluate the entries ourselves. It is assumed that the experimental verification claimed by the authors of siRNA experiments is correct, and that suitable controls were used to ensure this. By displaying limited experimental information and providing a link to the source article, we provide the user with resources to evaluate the quality of the siRNA.

Submission

In order to maintain an up-to-date resource we encourage experimentalists to submit their siRNA data directly to us as soon as their paper is published. We accept user submissions of siRNA sequences or publications via email. Such submissions are manually checked before addition to the database.

CONCLUSIONS/FUTURE PERSPECTIVES

The database is a collection of siRNA experiments. It was designed to assist experimentalists in determining which siRNA to use to inhibit their gene of interest. As more siRNAs are verified, this database will become increasingly useful for developing siRNA design tools. One future plan is to complete a genome-wide siRNA set for the mouse; where human–mouse orthologs are identical, the same siRNA may be used to target both genes. The database was designed to hold results from a number of different prediction methods, and we invite siRNA prediction groups to submit their predictions to the database.

AVAILABILITY

The database implementation and verified siRNAs are available at http://siRNA.cgb.ki.se for non-commercial use. The experimentally verified section of the database is available for download. For-profit organizations are requested to contact the corresponding author.
  15 in total

1.  RNA interference is mediated by 21- and 22-nucleotide RNAs.

Authors:  S M Elbashir; W Lendeckel; T Tuschl
Journal:  Genes Dev       Date:  2001-01-15       Impact factor: 11.361

2.  Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure.

Authors:  D H Mathews; J Sabina; M Zuker; D H Turner
Journal:  J Mol Biol       Date:  1999-05-21       Impact factor: 5.469

3.  Asymmetry in the assembly of the RNAi enzyme complex.

Authors:  Dianne S Schwarz; György Hutvágner; Tingting Du; Zuoshang Xu; Neil Aronin; Phillip D Zamore
Journal:  Cell       Date:  2003-10-17       Impact factor: 41.582

4.  Functional siRNAs and miRNAs exhibit strand bias.

Authors:  Anastasia Khvorova; Angela Reynolds; Sumedha D Jayasena
Journal:  Cell       Date:  2003-10-17       Impact factor: 41.582

5.  Rational siRNA design for RNA interference.

Authors:  Angela Reynolds; Devin Leake; Queta Boese; Stephen Scaringe; William S Marshall; Anastasia Khvorova
Journal:  Nat Biotechnol       Date:  2004-02-01       Impact factor: 54.908

6.  Guidelines for the selection of highly effective siRNA sequences for mammalian and chick RNA interference.

Authors:  Kumiko Ui-Tei; Yuki Naito; Fumitaka Takahashi; Takeshi Haraguchi; Hiroko Ohki-Hamazaki; Aya Juni; Ryu Ueda; Kaoru Saigo
Journal:  Nucleic Acids Res       Date:  2004-02-09       Impact factor: 16.971

7.  NCBI Reference Sequence project: update and current status.

Authors:  Kim D Pruitt; Tatiana Tatusova; Donna R Maglott
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

Review 8.  Gene silencing in mammals by small interfering RNAs.

Authors:  Michael T McManus; Phillip A Sharp
Journal:  Nat Rev Genet       Date:  2002-10       Impact factor: 53.242

9.  Many commonly used siRNAs risk off-target activity.

Authors:  Ola Snøve; Torgeir Holen
Journal:  Biochem Biophys Res Commun       Date:  2004-06-18       Impact factor: 3.575

10.  An algorithm for selection of functional siRNA sequences.

Authors:  Mohammed Amarzguioui; Hans Prydz
Journal:  Biochem Biophys Res Commun       Date:  2004-04-16       Impact factor: 3.575
