Mohamed Mysara, Jonathan M Garibaldi, Mahmoud Elhefnawi.
Abstract
The design of small interfering RNA (siRNA) is a multifactorial problem that has gained the attention of many researchers in the area of therapeutic and functional genomics. The MysiRNA score was previously introduced to improve the correlation of siRNA activity prediction relative to state-of-the-art algorithms. In this paper, a new program, MysiRNA-Designer, is described which integrates several factors in an automated workflow that considers mRNA transcript variations, siRNA and mRNA target accessibility, and both near-perfect and partial off-target matches. It also features the MysiRNA score, a highly correlated siRNA efficacy prediction score, for ranking the designed siRNAs, in addition to the top-scoring models Biopredsi, DSIR, Thermocomposition21 and i-Score, and integrates them in a unique siRNA score-filtration technique. This multi-score filtration layer retains siRNAs that pass the 90% thresholds calculated from experimental dataset features. MysiRNA-Designer takes an accession, finds conserved regions among its transcript space, finds accessible regions within the mRNA, designs all possible siRNAs for these regions, filters them based on multi-score thresholds, and then performs SNP and off-target filtration. These strict selection criteria were tested against human genes, and at least one active siRNA was designed for 95.7% of the total genes. In addition, when tested against an experimental dataset, MysiRNA-Designer was found capable of rejecting 98% of the false positive siRNAs, showing superiority over three state-of-the-art siRNA design programs. MysiRNA is a freely accessible (Microsoft Windows based) desktop application that can be used to design siRNA with high accuracy and specificity. We believe that MysiRNA-Designer has the potential to play an important role in this area.
Year: 2011 PMID: 22046244 PMCID: PMC3202522 DOI: 10.1371/journal.pone.0025642
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Different phases for designing siRNA with high efficiency and sensitivity.
There are seven distinct phases of siRNA design. First, the gene targeted for silencing is chosen. Second, the proper target sequence space is identified: one that represents all of the gene's transcripts and contains no SNPs. Third, all possible siRNAs of nineteen nucleotides in length are designed, with both sense and antisense strands. Fourth, these potential siRNAs are scored and evaluated according to several scoring mechanisms and criteria, and then filtered according to the resulting scores. Fifth, siRNAs are filtered according to target accessibility. Sixth, off-target filtration of the remaining siRNAs is performed, excluding those with unwanted off-target effects. Seventh, the best designed siRNAs are selected: those that pass all the previous filtration phases and achieve the highest predicted efficiency.
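The third phase (enumerating every 19-nucleotide candidate with its sense and antisense strand) can be sketched as below. This is a minimal illustration, not MysiRNA-Designer's own code: the function names and the example sequence are hypothetical, and the sketch assumes the antisense strand is simply the reverse complement of the target window, ignoring overhangs.

```python
# Hypothetical helper names; not taken from MysiRNA-Designer itself.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def enumerate_sirnas(target: str, length: int = 19):
    """Yield (position, sense, antisense) for every window of `length` nt.

    `position` is the 1-based start of the window on the target sequence.
    Assumption: the sense strand equals the target window and the
    antisense strand is its reverse complement (no overhangs modelled).
    """
    target = target.upper().replace("T", "U")
    for start in range(len(target) - length + 1):
        sense = target[start:start + length]
        yield start + 1, sense, reverse_complement(sense)


# Made-up 28-nt sequence, used only to exercise the function.
example_target = "AUGGCUAGCUAGGAUCCGAUCGAUUAGC"
candidates = list(enumerate_sirnas(example_target))
```

A sequence of length n yields n - 18 candidates, so the later scoring and filtration phases are what reduce the list to a usable size.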
Figure 2. Different preprocessing steps used to identify the representative sequence space within the mRNA.
The sequence space should be free from unstable regions (black) and SNPs (green), and should be conserved among the different gene transcripts (red); the resulting regions are later used as a template for siRNA design.
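As a small illustration of the SNP part of this preprocessing, the sketch below lists the 19-nt windows that do not overlap any known SNP position. The function names and the 1-based coordinate convention are assumptions made for the example; how MysiRNA-Designer represents SNP positions internally is not stated here.

```python
def overlaps_snp(start, length, snp_positions):
    """True if the window [start, start + length - 1] covers any SNP.

    Assumption: `start` and all SNP positions are 1-based coordinates
    on the same reference sequence.
    """
    return any(start <= pos < start + length for pos in snp_positions)


def snp_free_windows(seq_length, snp_positions, length=19):
    """Return the 1-based start positions of all SNP-free windows."""
    last_start = seq_length - length + 1
    return [
        start
        for start in range(1, last_start + 1)
        if not overlaps_snp(start, length, snp_positions)
    ]
```

For a 25-nt sequence with a single SNP at position 22, only the windows starting at positions 1 to 3 survive, since every later window extends over position 22.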
Figure 3. Decision flow of the targeted gene's multi-transcript filtration phase implemented in MysiRNA-Designer.
MysiRNA-Designer first checks whether the entered mRNA has any other transcripts. If so, it retrieves the other transcripts using NCBI BLAST and performs multiple sequence alignment on these sequences. The un-gapped consensus is then calculated in order to design siRNAs targeting the desired sequence space.
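One way to picture the un-gapped consensus step is to scan an existing multiple sequence alignment for stretches where every transcript has the same nucleotide and no gap. The sketch below assumes exactly that interpretation (strict identity in every column), which may be stricter than the consensus rule the program actually applies; the function name and input format are hypothetical.

```python
def conserved_regions(aligned, min_length=19):
    """Return (start, sequence) for ungapped stretches identical in all rows.

    `aligned` is a list of equal-length strings from a multiple sequence
    alignment, with '-' marking gaps. `start` is the 0-based alignment
    column. Assumption: a column counts as conserved only if all rows
    carry the same nucleotide.
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned sequences must have the same length")

    regions, current, current_start = [], [], 0
    for col in range(width):
        column = {row[col] for row in aligned}
        if len(column) == 1 and "-" not in column:
            if not current:
                current_start = col
            current.append(aligned[0][col])
        else:
            # A gap or a mismatch closes the current conserved stretch.
            if len(current) >= min_length:
                regions.append((current_start, "".join(current)))
            current = []
    if len(current) >= min_length:
        regions.append((current_start, "".join(current)))
    return regions
```

Setting `min_length` to the siRNA length keeps only regions long enough to host at least one complete candidate.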
Threshold scores assigned using the Huesken dataset: each scoring tool was analysed to derive two thresholds for filtering siRNAs with an expected inhibition efficiency above 90%.
| Min score | Min Threshold | Mean | Max Threshold | Max score | Standard Deviation |
| 0 | 1.9 | 5.52 | 9.15 | 10 | 1.81 |
| III | III | Ib | Ia | Ia | 0.84 |
| −2 | −1.21 | 2.04 | 5.3 | 5 | 1.62 |
| 31 | 42.03 | 69.52 | 97.01 | 103.9 | 13.7 |
| −2 | −1.11 | 1 | 3.11 | 4 | 1.05 |
| −11 | −10.22 | 1.92 | 14.06 | 20.2 | 6.07 |
First, siRNAs with inhibition efficiency above 90% are isolated from the dataset. Then, for each scoring tool, the mean and standard deviation of their scores are calculated, and the minimum and maximum thresholds are set at two standard deviations below and above the mean, respectively.
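The threshold rule described above (mean ± 2 standard deviations over the highly active siRNAs) can be sketched as follows. The function names and the record format are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def thresholds_for_tool(records, min_efficiency=90.0, n_sd=2.0):
    """Derive (min_threshold, max_threshold) for one scoring tool.

    `records` is an iterable of (inhibition_efficiency_percent, score)
    pairs. Only siRNAs with efficiency above `min_efficiency` are used.
    """
    active = [score for efficiency, score in records if efficiency > min_efficiency]
    if len(active) < 2:
        raise ValueError("need at least two highly active siRNAs")
    centre = mean(active)
    spread = stdev(active)  # sample SD: an assumption, not stated in the source
    return centre - n_sd * spread, centre + n_sd * spread


def thresholds_from_summary(centre, spread, n_sd=2.0):
    """Same rule, starting from an already known mean and SD."""
    return centre - n_sd * spread, centre + n_sd * spread


# Mean and SD taken from the first data row of the table above.
low, high = thresholds_from_summary(5.52, 1.81)
```

For that row the rule gives 1.90 and 9.14, which agrees with the tabulated 1.9 and 9.15 up to rounding of the published mean and standard deviation.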
Figure 4. Off-target filtration workflow describing the decision-making process for siRNA off-target filtration.
Initially, MysiRNA checks each siRNA for off-targets using mRNA reference sequences. If an off-target is found, it checks whether the match is a complete homology (with one or two mismatches), in which case the siRNA is rejected. If there is no complete homology, it checks for seed homology, where the siRNA seed region (2nd to 3rd nts) matches the 3′UTR of the off-targeted mRNA. An siRNA free from both complete homology and seed homology is considered off-target free and passes this filtration step.
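The decision flow in this caption can be sketched with plain string matching. This is only an illustration under several assumptions: the function names are hypothetical, the program itself searches reference sequences rather than scanning strings, and the seed bounds are left as parameters whose default (positions 2 to 8 of the antisense strand) is a commonly used convention rather than a value taken from the caption.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def near_complete_match(sense, transcript, max_mismatches=2):
    """True if any window of `transcript` matches `sense` with few mismatches."""
    n = len(sense)
    return any(
        sum(a != b for a, b in zip(sense, transcript[i:i + n])) <= max_mismatches
        for i in range(len(transcript) - n + 1)
    )


def seed_match(antisense, utr3, seed_start=2, seed_end=8):
    """True if the antisense seed can pair with a site in the 3'UTR.

    Assumption: seed positions are 1-based on the antisense strand and
    pairing is perfect Watson-Crick complementarity.
    """
    seed = antisense[seed_start - 1:seed_end]
    return reverse_complement(seed) in utr3


def passes_off_target_filter(sense, off_targets, seed_start=2, seed_end=8):
    """`off_targets` is an iterable of (mrna, utr3) pairs for other genes."""
    antisense = reverse_complement(sense)
    for mrna, utr3 in off_targets:
        if near_complete_match(sense, mrna):
            return False  # complete homology, allowing one or two mismatches
        if seed_match(antisense, utr3, seed_start, seed_end):
            return False  # seed homology within the 3'UTR
    return True
```

The two checks are ordered as in the caption: complete homology is tested first, and seed homology is only consulted for siRNAs that survive it.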
Figure 5. Flow chart of the “MysiRNA-Designer” program.
MysiRNA-Designer takes an accession and retrieves the mRNA sequence from NCBI GenBank. It determines whether this mRNA has other transcript(s), performs multiple sequence alignment with the transcripts, if any, and takes the un-gapped consensus sequence. It then designs all possible siRNAs in the available target sequence space and performs target accessibility evaluation, selecting siRNAs with energetically and structurally favored siRNA-mRNA binding. It predicts siRNA efficiency using the implemented multi-score filtration, selects the candidates that pass the threshold assigned for each of the ten tools used, and eliminates siRNAs that target SNP regions or off-target mRNAs, through either complete or seed homology. Finally, MysiRNA-Designer shows the accepted candidates with the silencing efficiency predicted by MysiRNA-Model, and filters candidates above the assigned threshold, as the user requires.
Evaluation of the specificity and sensitivity of different models compared to the two modes of MysiRNA-Designer (intersection of different scoring models, and MysiRNA-Model with a 93% threshold) on the Fellmann experimental dataset.
| | Ui-Tei | Amar | Hsieh | Taka | Biopredsi | i-Score | Rey | Katoh | DSIR | Thermo21 | Multi-Scores | MysiRNA-Model 93% |
| Sensitivity | 0.99 | 0.97 | 1.00 | 1.00 | 0.73 | 0.32 | 1.00 | 0.68 | 0.85 | 0.84 | 0.30 | 0.22 |
| Specificity | 0.13 | 0.13 | 0.01 | 0.01 | 0.69 | 0.92 | 0.01 | 0.68 | 0.54 | 0.55 | 0.93 | 0.97 |
| TP | 236 | 232 | 238 | 237 | 173 | 72 | 119 | 161 | 203 | 199 | 69 | 24 |
| FN | 2 | 6 | 0 | 1 | 65 | 166 | 119 | 77 | 35 | 39 | 169 | 214 |
| TN | 2476 | 2421 | 268 | 218 | 12605 | 17203 | 15213 | 12506 | 9955 | 10008 | 17315 | 17820 |
| FP | 15879 | 15934 | 18087 | 18137 | 5750 | 1152 | 3142 | 5849 | 8400 | 8347 | 1040 | 535 |
In our multi-score filtration stage, combining multiple scoring tools rather than relying on a single one gave enhanced performance when compared against experimental data [39]. This study involved the following tools, in the order shown in the table: Ui-Tei [6], Amarzguioui [3], Hsieh [7], Takasaki [4], Biopredsi [9], i-Score [8], Reynolds [2], Katoh [5], DSIR [11] and ThermoComposition21 [10]. As our aim was to reject as many false positives (FP) as possible, the intersection between tools provided more reliable results, with specificity up to 93%. In addition, we used MysiRNA-Model, an artificial neural network model for siRNA scoring and efficiency prediction, with a threshold of 93%, above which siRNA candidates were considered accepted. This modification was integrated with our multi-score filtration algorithm and boosted the specificity up to 97% [see supplementary data].
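The intersection idea can be sketched as a filter that keeps a candidate only if every tool's score falls inside that tool's threshold window, optionally followed by a cutoff on the model's predicted efficiency. The data layout, function names, tool names and numbers below are hypothetical and chosen only for illustration; they are not the program's actual thresholds.

```python
def passes_all(scores, thresholds):
    """True if every tool's score lies within its (low, high) window."""
    return all(
        low <= scores[tool] <= high
        for tool, (low, high) in thresholds.items()
    )


def multi_score_filter(candidates, thresholds, model_cutoff=None):
    """Return the ids of candidates accepted by the intersection of tools.

    Each candidate is a dict with 'id', 'scores' (tool name -> score) and,
    when `model_cutoff` is given, 'model' (predicted efficiency in percent).
    Assumption: a candidate is accepted by the model only if its predicted
    efficiency is strictly above the cutoff.
    """
    kept = []
    for candidate in candidates:
        if not passes_all(candidate["scores"], thresholds):
            continue
        if model_cutoff is not None and candidate["model"] <= model_cutoff:
            continue
        kept.append(candidate["id"])
    return kept


# Illustrative values only.
example_thresholds = {"tool_a": (1.0, 9.0), "tool_b": (40.0, 97.0)}
example_candidates = [
    {"id": "si1", "scores": {"tool_a": 5.0, "tool_b": 70.0}, "model": 95.0},
    {"id": "si2", "scores": {"tool_a": 0.5, "tool_b": 70.0}, "model": 99.0},
    {"id": "si3", "scores": {"tool_a": 5.0, "tool_b": 60.0}, "model": 80.0},
]
accepted = multi_score_filter(example_candidates, example_thresholds, model_cutoff=93.0)
```

Because a candidate must satisfy every tool at once, each added tool can only shrink the accepted set, which is why the intersection trades sensitivity for specificity.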
TP = true positives, FN = false negatives, TN = true negatives, FP = false positives.
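Sensitivity and specificity follow from these four counts by their standard definitions. The short sketch below applies them to the Ui-Tei and Biopredsi columns of the table above, reading the four count rows in the order TP, FN, TN, FP; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of truly active siRNAs that the tool accepts."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly inactive siRNAs that the tool rejects."""
    return tn / (tn + fp)


# Counts read from the Ui-Tei column of the table above.
ui_tei_sensitivity = sensitivity(tp=236, fn=2)
ui_tei_specificity = specificity(tn=2476, fp=15879)

# Counts read from the Biopredsi column.
biopredsi_sensitivity = sensitivity(tp=173, fn=65)
biopredsi_specificity = specificity(tn=12605, fp=5750)
```

Rounded to two decimals these give 0.99 and 0.13 for Ui-Tei and 0.73 and 0.69 for Biopredsi, matching the first two rows of the table for those columns.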
Comparison between MysiRNA-Designer and several programs used for fully automated siRNA design.
This comparison covers each tool's ability to align different transcripts and take conserved regions into consideration, together with siRNA candidate evaluation using several algorithms and target accessibility, and siRNA filtration by the presence of single nucleotide polymorphisms and off-targets (both full homology and seed regions).
*1 http://www.dharmacon.com/designcenter/DesignCenterPage.aspx.
*2 http://sysbio.kribb.re.kr:8080/AsiDesigner/menuDesigner.jsf.
*3 http://rna.tbi.univie.ac.at/cgi-bin/RNAxs.
Comparative analysis of MysiRNA-Designer, AsiDesigner, siDesign and RNAxs against an experimentally verified dataset.
| Sensitivity | 0.13 | 0.18 | 0.50 | 0.19 | 0.14 |
| Specificity | 0.95 | 0.94 | 0.76 | 0.96 | 0.98 |
| TP | 31 | 42 | 117 | 44 | 33 |
| FN | 201 | 190 | 115 | 188 | 199 |
| TN | 17657 | 17409 | 14068 | 17843 | 18090 |
| FP | 813 | 1061 | 4402 | 627 | 380 |
Using the experimentally verified dataset published in [39], a comparative analysis was carried out involving MysiRNA-Designer and three of the top siRNA design programs that perform the whole design process automatically. We used both MysiRNA-Designer options, with and without the MysiRNA-Model threshold. The results of this study demonstrate the superiority of MysiRNA-Designer, with either option, in rejecting as many false positives as possible, reflecting the high specificity desired.
TP = true positives, FN = false negatives, TN = true negatives, FP = false positives.