Thomas Horn, Michael Boutros.
Abstract
The design of RNA interference (RNAi) reagents is an essential step for performing loss-of-function studies in many experimental systems. The availability of sequenced and annotated genomes greatly facilitates RNAi experiments in an increasing number of organisms that were previously not genetically tractable. The E-RNAi web-service, accessible at http://www.e-rnai.org/, provides a computational resource for the optimized design and evaluation of RNAi reagents. The 2010 update of E-RNAi now covers 12 genomes, including Drosophila, Caenorhabditis elegans, human, emerging model organisms such as Schmidtea mediterranea and Acyrthosiphon pisum, as well as the medically relevant vectors Anopheles gambiae and Aedes aegypti. The web service calculates RNAi reagents based on the input of target sequences, sequence identifiers or by visual selection of target regions through a genome browser interface. It identifies optimized RNAi target-sites by ranking sequences according to their predicted specificity, efficiency and complexity. E-RNAi also facilitates the design of secondary RNAi reagents for validation experiments, evaluation of pooled siRNA reagents and batch design. Results are presented online, as a downloadable HTML report and as tab-delimited files.
Year: 2010 PMID: 20444868 PMCID: PMC2896145 DOI: 10.1093/nar/gkq317
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Table 1. Organisms available in E-RNAi version 3
| Organism | Database | Available query identifiers |
|---|---|---|
| Drosophila melanogaster | FlyBase | FBgn, CG identifiers and gene symbols (e.g. CG11992, FBgn0014018 or rel) |
| Acyrthosiphon pisum | AphidBase | ACYPI, XM and LOC identifiers (from ‘official gene consensus set’, e.g. ACYPI006699, XM_001944230 or LOC100165774) |
| Homo sapiens | NCBI RefSeq | Gene names, Entrez gene identifiers and RefSeq identifiers (e.g. RPS11, 6205 or NM_001015.3) |
| Schmidtea mediterranea | SmedGD | MAKER identifiers (mk4 from SmedGD, e.g. mk4.023206.00) |
| Saccharomyces cerevisiae | EnsemblFungi | Ensembl gene identifiers (e.g. YHR055C) |
| Schizosaccharomyces pombe | EnsemblFungi | Ensembl gene identifiers (e.g. SPMIT.05) |
| Caenorhabditis elegans | WormBase | WBGene identifiers, sequence names and gene names (e.g. WBGene00006763, JC8.10 or unc-26) |
| Anopheles gambiae | VectorBase | AGAP gene identifiers (e.g. AGAP000009) |
| Aedes aegypti | VectorBase | AAEL gene identifiers (e.g. AAEL000068) |
| Mus musculus | NCBI RefSeq | Gene names, Entrez gene identifiers and RefSeq identifiers (e.g. Axin1, 12005 or NM_009733.2) |
| Apis mellifera | BeeBase | GB gene identifiers (e.g. GB15421) |
| Tribolium castaneum | BeetleBase | TC and GLEAN-prediction gene identifiers (e.g. TC004684 or GLEAN_04684) |
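
The identifier formats listed in the table are regular enough that a batch of queries could be pre-sorted by source database before submission. The following is a minimal sketch under that assumption: the patterns are inferred only from the example identifiers shown above, `guess_database` is a hypothetical helper name, and none of this reflects how E-RNAi itself validates its input. Gene symbols (e.g. rel, unc-26) and bare Entrez numbers are ambiguous across organisms, so the sketch deliberately returns `None` for them.

```python
import re
from typing import Optional

# Hypothetical patterns inferred from the example identifiers in the table;
# they are NOT E-RNAi's actual validation rules.
ID_PATTERNS = [
    ("FlyBase", re.compile(r"^(FBgn\d+|CG\d+)$")),
    ("AphidBase", re.compile(r"^ACYPI\d+$")),
    ("SmedGD", re.compile(r"^mk4\.\d+\.\d+$")),
    ("WormBase", re.compile(r"^WBGene\d+$")),
    ("VectorBase", re.compile(r"^(AGAP|AAEL)\d+$")),
    ("BeeBase", re.compile(r"^GB\d+$")),
    ("BeetleBase", re.compile(r"^(TC\d+|GLEAN_\d+)$")),
    ("NCBI RefSeq", re.compile(r"^NM_\d+(\.\d+)?$")),
]


def guess_database(identifier: str) -> Optional[str]:
    """Return the first database whose pattern matches, or None if ambiguous/unknown."""
    for database, pattern in ID_PATTERNS:
        if pattern.match(identifier.strip()):
            return database
    return None


if __name__ == "__main__":
    for query in ["FBgn0014018", "mk4.023206.00", "GLEAN_04684", "rel"]:
        print(query, "->", guess_database(query))
```

A lookup of this kind only narrows down the database; the organism still has to be chosen explicitly where one database serves several species (VectorBase, EnsemblFungi, NCBI RefSeq).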
Figure 1. Structure of the E-RNAi web service and database back end. The left side (grey panels) shows the sequences and mapping information required from public databases to populate the E-RNAi sequence- and GBrowse-databases and to generate the indices needed for the Bowtie, Blat and Blast alignment tools. The right side shows the workflow of the E-RNAi web server, from the selection of sequences and adjustment of options (red and yellow panels), through the processing steps performed during the design of reagents, to the generation of an HTML report (green panel). Further details are explained in the text.
Figure 2. Sequence selection and design options. (a) After querying E-RNAi with gene identifiers, gene sequences, transcript sequences or exon sequences can be selected for design. (b) Individual sequences can be selected as design templates visually from GBrowse (upper panel). After sequence submission to E-RNAi, the type of reagent (long dsRNA or siRNA) can be selected (lower panel). (c) Options available for the design of long dsRNAs include settings for the prediction of specificity, efficiency and low-complexity regions (left panel) as well as primer design settings (upper right panel). In addition, different output options, such as the number of designs per query sequence and different report formats, can be selected. The sequence of a previous design can also be uploaded in FASTA format to exclude it from a new (independent) design (lower right panel). Details for certain options are provided in the text and listed in Table 2.
Table 2. Long dsRNA / siRNA design options in E-RNAi version 3
| Parameter | Option | Reagent | Description | Default |
|---|---|---|---|---|
| siRNA length for specificity prediction | Design / evaluation | Long dsRNA / siRNA | siRNA length for | 19 bp |
| Exclude regions of low sequence complexity | Design / evaluation | Long dsRNA / siRNA | Filter (‘mdust’) to avoid regions of low complexity, such as simple nucleotide repeats | Yes |
| Exclude >5× CA[ACGT] repeats | Design / evaluation | Long dsRNA / siRNA | Filter to avoid longer stretches of CAN repeats | Yes |
| Efficiency scoring method | Design / evaluation | Long dsRNA / siRNA | Method ‘weighted’ [Shah | Weighted |
| Minimal siRNA efficiency score | Design / evaluation | Long dsRNA / siRNA | Efficiency score filter for ‘diced’ siRNAs, values between 0 and 100 | Long dsRNAs: 20, siRNAs: 70 |
| siRNA seed-match analysis | Design / evaluation | Long dsRNA / siRNA | siRNA seed-matches to a user-defined sequence database (FASTA) are calculated | Disabled |
| Maximum overlap with introns allowed | Design | Long dsRNA | Adjust the maximal allowed overlap of designs to intron-regions (in percent of length of design) | 25% |
| Allow relaxation of parameters if required | Design / evaluation | Long dsRNA / siRNA | Allows reagents to be designed even if the input sequence does not meet the specificity, efficiency and low-complexity criteria | Yes |
| Map designs to the genome | Design / evaluation | Long dsRNA / siRNA | Mapping of calculated designs to the genome (required for GBrowse visualization) | Yes |
| Calculate homology of designs to transcripts | Design / evaluation | Long dsRNA / siRNA | Mapping of calculated designs to the transcriptome using Blast for homology evaluation (below the defined | Yes (long dsRNAs: 1 |
| Reagent source type | Evaluation | Long dsRNA / siRNA | Source (‘genomic’ or ‘CDS’) the reagent was designed and synthesized with | Genomic |
| Primer design options | Design / evaluation | Long dsRNAs | Define optimal/minimal/maximal primer size, amplicon size range, primer pair penalty cutoff, number of primer designs per ‘favorable’ region | Opt: 20 bp, Min: 18 bp, Max: 27 bp, Range: 150–500 bp, Penalty: 4, Designs: 50 |
| Add promoter sequence or tag | Design | Long dsRNAs | Promoter sequence (for | None |
| Number of designs reported per query sequence | Design | Long dsRNA / siRNA | Number of best long dsRNAs / siRNAs for each input sequence reported in the output | 1 |
| GFF file output | Design / evaluation | Long dsRNA / siRNA | Output of generic feature file for GBrowse database upload and documentation of design location | No |
| AFF file output | Design / evaluation | Long dsRNA / siRNA | Output of annotation file format for direct upload and visualization of designs in any GBrowse installation | No |
| Design of independent RNAi reagents | Design | Long dsRNA / siRNA | Sequences of previous designs uploaded as FASTA file will be excluded from new designs, can also be used to exclude other sequences (e.g. miRNA sites) | No |
| Analysis of siRNA pools | Evaluation | siRNA | Summary report for siRNA pools, if a tab-delimited file with ‘siRNAID’ and ‘POOLID’ columns was provided | No |
Detailed descriptions including examples are available in the E-RNAi Wiki at: http://e-rnai.org/wiki.
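
Three of the options in the table lend themselves to a short illustration: the siRNA length used for specificity prediction (default 19 bp), the CA[ACGT] repeat filter, and the maximum allowed intron overlap (default 25%). The sketch below is not E-RNAi's implementation. It assumes that a long dsRNA is "diced" into every overlapping window of the chosen length, that ">5×" means six or more consecutive CA[ACGT] triplets, and that coordinates are half-open intervals with non-overlapping introns; all function names are hypothetical.

```python
import re
from typing import List, Tuple


def dice(dsrna: str, k: int = 19) -> List[str]:
    """Return every overlapping k-mer of the sequence (assumed sliding window, step 1)."""
    seq = dsrna.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Assumption: ">5x CA[ACGT]" is read as six or more consecutive CAN triplets.
CAN_REPEAT = re.compile(r"(?:CA[ACGT]){6,}")


def has_can_repeat(seq: str) -> bool:
    """True if the sequence contains a run of more than five CA[ACGT] triplets."""
    return CAN_REPEAT.search(seq.upper()) is not None


def intron_overlap_percent(design: Tuple[int, int],
                           introns: List[Tuple[int, int]]) -> float:
    """Percent of the design length covered by introns.

    Coordinates are half-open (start, end); introns are assumed not to
    overlap one another.
    """
    start, end = design
    covered = sum(
        max(0, min(end, i_end) - max(start, i_start))
        for i_start, i_end in introns
    )
    return 100.0 * covered / (end - start)


if __name__ == "__main__":
    print(len(dice("ACGT" * 50)))          # number of 19-mers in a 200 nt sequence
    print(has_can_repeat("CAG" * 6))       # six CAG triplets in a row
    print(intron_overlap_percent((100, 300), [(250, 400)]))
```

Under these assumptions, a design spanning positions 100 to 300 with an intron starting at 250 overlaps intron sequence for 25% of its length, which sits exactly at the default limit.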
Figure 3. Report for the design of a long dsRNA against the gene twi. ‘dsRNA information’ summarizes the properties of the primers required to amplify the reagent from genomic or cDNA sources and also shows the full sequence of the construct, its length and location in the genome. ‘Target information’ lists all ‘intended’ and ‘other’ target genes found as well as all transcripts belonging to them (including the number of siRNA ‘hits’ to each transcript). ‘Reagent quality’ summarizes the analyzed quality parameters and shows the number of specific (‘on-target’) siRNAs contained within the dsRNA, the number of ‘off-target’ and ‘no-target’ siRNAs as well as the number of ‘efficient siRNAs’, the average siRNA efficiency and low-complexity information. In addition, all targets with significant ‘sequence homology’ to the long dsRNA are listed. GBrowse visualizes the location of the designed reagent in the gene model.
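
The ‘Reagent quality’ counts described in the caption can be illustrated with a small tally. This is a sketch under an assumed definition, not the one documented for E-RNAi: an siRNA with no hits is counted as ‘no-target’, one whose hits all fall within the intended gene(s) as ‘on-target’, and any other as ‘off-target’. The function names and the input layout (a mapping from siRNA identifier to the set of genes it hits) are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def classify(hit_genes: Set[str], intended: Set[str]) -> str:
    """Label one siRNA by the genes it hits (assumed definitions, see lead-in)."""
    if not hit_genes:
        return "no-target"
    if hit_genes <= intended:
        return "on-target"
    return "off-target"


def summarize(hits_per_sirna: Dict[str, Set[str]], intended: Set[str]) -> Counter:
    """Count on-target, off-target and no-target siRNAs for one reagent."""
    return Counter(classify(genes, intended) for genes in hits_per_sirna.values())


if __name__ == "__main__":
    # Invented example data for a reagent intended to target the gene twi.
    hits = {
        "siRNA_1": {"twi"},
        "siRNA_2": {"twi", "sna"},
        "siRNA_3": set(),
    }
    print(summarize(hits, intended={"twi"}))
```

Treating any hit outside the intended gene set as ‘off-target’ is the strictest reading; a report that also distinguishes multiple transcripts of the same gene would need a gene-to-transcript mapping on top of this.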