Firoz Ahmed, Gajendra P S Raghava.
Abstract
In the past, numerous methods have been developed for predicting the efficacy of short interfering RNA (siRNA). However, these methods predict the efficacy of siRNAs that are fully complementary to a gene. To the best of the authors' knowledge, no method has been developed for predicting the efficacy of mismatch siRNAs against a gene. In this study, a systematic attempt has been made to identify highly effective complementary as well as mismatch siRNAs for silencing a gene. Support vector machine (SVM) based models have been developed for predicting the efficacy of siRNAs using composition, binary, and hybrid patterns of siRNAs. We achieved a maximum correlation of 0.67 between predicted and actual efficacy of siRNAs using the hybrid model. All models were trained and tested on a dataset of 2182 siRNAs, and performance was evaluated using five-fold cross-validation. The performance of our method, desiRm, is comparable to that of other well-known methods. In this study, an attempt has been made for the first time to design mutant siRNAs (mismatch siRNAs). In this approach, we mutated a given siRNA at all possible positions with all possible nucleotides. The efficacy of each mutated siRNA is predicted using desiRm. It is well known from the literature that mismatches between siRNA and target affect silencing efficacy. We have therefore incorporated rules derived from experimental base-mismatch data to estimate the overall efficacy of mutated or mismatch siRNAs. Finally, we developed a web server, desiRm (http://www.imtech.res.in/raghava/desirm/), for designing highly effective siRNAs for silencing a gene. This tool will be helpful for designing siRNAs that degrade the disease isoform of a gene with a heterozygous single nucleotide polymorphism without depleting the wild-type protein.
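The mutation step described in the abstract can be illustrated with a minimal sketch. It enumerates only single-base substitutions and does not score them; the function name `single_mutants` and the choice to write the substituted base in lower case are assumptions made for illustration, and the example sequence is one of the siRNA antisense strands listed in this record.

```python
NUCLEOTIDES = "ACGU"


def single_mutants(sirna):
    """Yield (position, mutated_sequence) for every single-base substitution.

    Positions are 1-based; the substituted base is written in lower case.
    This is an illustrative helper, not the desiRm implementation.
    """
    sirna = sirna.upper()
    for index, original in enumerate(sirna):
        for base in NUCLEOTIDES:
            if base != original:
                yield index + 1, sirna[:index] + base.lower() + sirna[index + 1:]


mutants = list(single_mutants("UCCUCACCAUCCGUCCAGU"))
print(len(mutants))  # 19 positions x 3 alternative bases = 57
```

Each candidate would then be passed to an efficacy predictor; double or triple mutants could be generated by applying the same substitution to the output again.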
Year: 2011 PMID: 21853133 PMCID: PMC3154470 DOI: 10.1371/journal.pone.0023443
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Performance of SVM-based models for siRNA efficacy prediction developed using composition-based features.
| Composition features | Vector | R | R2 | MAE | RMSE | g | c | j |
| Mono | 4 | 0.316 | 0.095 | 0.152 | 0.190 | 0.001 | 1 | 1 |
| Di | 16 | 0.450 | 0.145 | 0.145 | 0.185 | 0.001 | 3 | 2 |
| Tri | 64 | 0.515 | 0.248 | 0.138 | 0.173 | 0.001 | 1 | 2 |
| Tetra | 256 | 0.574 | 0.312 | 0.131 | 0.166 | 0.0001 | 10 | 2 |
| | | | | | | | | |
| Mono | 8 | 0.355 | -0.03 | 0.161 | 0.203 | 0.001 | 1 | 3 |
| Di | 32 | 0.453 | 0.203 | 0.143 | 0.178 | 0.001 | 1 | 3 |
| Tri | 128 | 0.508 | 0.243 | 0.137 | 0.174 | 0.0001 | 2 | 2 |
| | | | | | | | | |
| 2nd order Di | 16 | 0.420 | 0.115 | 0.149 | 0.188 | 0.001 | 1 | 2 |
| 3rd order Di | 16 | 0.467 | 0.207 | 0.143 | 0.178 | 0.001 | 1 | 1 |
| 4th order Di | 16 | 0.461 | 0.150 | 0.146 | 0.184 | 0.001 | 1 | 2 |
| | | | | | | | | |
| 3rd order Tri | 64 | 0.483 | 0.218 | 0.141 | 0.177 | 0.001 | 1 | 1 |
| 2nd order Tetra | 256 | 0.502 | 0.222 | 0.139 | 0.176 | 0.0001 | 10 | 2 |
Mono: mononucleotide; Di: dinucleotide; Tri: trinucleotide; Tetra: tetranucleotide; R: correlation coefficient; R2: coefficient of determination; MAE: mean absolute error; RMSE: root mean square error; g, c, and j are SVM parameters.
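The vector sizes 4, 16, 64, and 256 in the table above match the number of possible k-mers over four nucleotides (4^k for k = 1 to 4). A minimal sketch of such a composition feature follows; normalising counts to fractions and the function name `kmer_composition` are assumptions, since this record does not state the exact encoding used.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def kmer_composition(sequence, k):
    """Return the fractional frequency of every possible k-mer in a fixed order.

    Illustrative only: the normalisation (count / number of windows) is an
    assumption, not taken from the paper.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(sequence) - k + 1
    for start in range(windows):
        word = sequence[start:start + k]
        if word in counts:  # ignore windows containing non-ACGU symbols
            counts[word] += 1
    return [counts[kmer] / windows for kmer in kmers]


sirna = "UCCUCACCAUCCGUCCAGU"
for k in (1, 2, 3, 4):
    print(k, len(kmer_composition(sirna, k)))  # vector lengths 4, 16, 64, 256
```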
Performance of SVM-based models for siRNA efficacy prediction developed using position-specific features and our method desiRm.
| Features | Vector | R | R2 | MAE | RMSE | g | c | j |
| Binary pattern | 84 | 0.637 | 0.406 | 0.122 | 0.154 | 0.01 | 1 | 1 |
| Binary of di | 320 | 0.563 | 0.272 | 0.135 | 0.170 | 0.001 | 6 | 2 |
| Binary of Condense | 40 | 0.449 | 0.200 | 0.142 | 0.179 | 0.001 | 10 | 1 |
| AU, GC | 42 | 0.362 | 0.130 | 0.149 | 0.186 | 0.001 | 1 | 1 |
| Hydrogen bond | 21 | 0.579 | 0.335 | 0.130 | 0.163 | 0.01 | 2 | 1 |
| Thermodynamics | 19 | 0.577 | 0.332 | 0.129 | 0.163 | 0.001 | 10 | 1 |
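The binary pattern features can be read as a position-specific one-hot encoding. The sketch below is written under that assumption: a vector size of 84 would correspond to 21 positions times 4 nucleotides, and 320 to 20 overlapping dinucleotides times 16 possible pairs. The function names and the 21-nt example sequence (a 19-mer from this record with a hypothetical `UU` overhang appended) are illustrative, not taken from the paper.

```python
NUCLEOTIDES = "ACGU"
DINUCLEOTIDES = [a + b for a in NUCLEOTIDES for b in NUCLEOTIDES]


def binary_pattern(sequence):
    """One-hot encode each position: 4 bits per nucleotide."""
    vector = []
    for base in sequence.upper():
        vector.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vector


def binary_dinucleotide_pattern(sequence):
    """One-hot encode each overlapping dinucleotide: 16 bits per pair."""
    sequence = sequence.upper()
    vector = []
    for start in range(len(sequence) - 1):
        pair = sequence[start:start + 2]
        vector.extend(1 if pair == d else 0 for d in DINUCLEOTIDES)
    return vector


sirna_21 = "UCCUCACCAUCCGUCCAGU" + "UU"  # hypothetical 21-nt input
print(len(binary_pattern(sirna_21)))               # 21 * 4 = 84
print(len(binary_dinucleotide_pattern(sirna_21)))  # 20 * 16 = 320
```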
Performance of desiRm21 and four other algorithms on a test dataset containing 419 siRNAs.
| Methods | R | R2 | MAE | RMSE |
| i-Score | 0.557 | 0.217 | 0.243 | 0.284 |
| s-Biopredsi | 0.546 | 0.296 | 0.218 | 0.270 |
| Thermocomposition21 | 0.577 | 0.200 | 0.221 | 0.288 |
| DSIR | 0.555 | 0.158 | 0.222 | 0.295 |
| desiRm21 | 0.558 | 0.164 | 0.222 | 0.294 |
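The four metrics reported in these tables can be computed as in the following sketch. It assumes R is the Pearson correlation and R2 is the coefficient of determination computed as 1 - SS_res/SS_tot; that reading is consistent with R2 not equalling the square of R in the tables and with a negative R2 value appearing there, but the exact formulas are not given in this record. The toy data are invented for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return Pearson R, coefficient of determination, MAE and RMSE."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return {
        "R": cov / math.sqrt(ss_tot * ss_pred),
        "R2": 1 - ss_res / ss_tot,  # assumed definition, see lead-in
        "MAE": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }


# Invented toy data: predictions are perfectly correlated with the actual
# values but systematically shifted, so R is high while R2 is negative.
actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.6, 0.7, 0.8, 0.9]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

With these definitions a method can rank siRNAs well (high R) and still show a low or negative R2 if its predictions are offset or compressed relative to the measured efficacies.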
Figure 1. Schematic diagram of efficacy of complementary and mismatch siRNAs against a target site.
The fully complementary siRNA has actual and predicted efficacies of 0.479 and 0.588, respectively. A single mutation at the 1st position of the siRNA gives a predicted efficacy of 0.776, but the overall efficacy after accounting for the single mismatch is 0.710 (0.776 - 0.066). A further mutation at the 15th position of the siRNA gives a predicted mismatch efficacy of 0.929. Base pairing is denoted by “ | ”, a mismatch by “ : ”, and a mutant base by lower case.
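The arithmetic in the Figure 1 caption can be written out as a small sketch. Only the position-1 penalty of 0.066 comes from the caption; the penalty table, the function name `overall_efficacy`, and the idea of summing penalties over several mismatches are assumptions, since the full rule set is not reproduced in this record.

```python
# Hypothetical penalty table: only the position-1 value (0.066) is taken from
# the Figure 1 caption; other positions are deliberately left undefined.
MISMATCH_PENALTY = {1: 0.066}


def overall_efficacy(predicted, mismatch_positions, penalties=MISMATCH_PENALTY):
    """Subtract the penalty of each siRNA:target mismatch from the prediction.

    Raises KeyError for positions without a known penalty rather than guessing.
    """
    return predicted - sum(penalties[pos] for pos in mismatch_positions)


print(round(overall_efficacy(0.776, [1]), 3))  # 0.71
```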
Comparative study of the increase/decrease in efficacy of siRNAs using our method, desiRm.
| siRNA antisense | Target access | Actual Efficacy | desiRm Efficacy (original) | Mutated siRNA antisense | desiRm Efficacy (mutated) | Position of Mutation |
| UCCUCACCAUCCGUCCAGU | 0.003895 | 0.465 | 0.577 | UCCUCACCcUCCGUCCAGg | 0.771 | 9, 19 |
| CUAAUAUGUUAAUUGAUUU | 0.054683 | 0.462 | 0.647 | CUAAUAUGUUAAUUGAUUg | 0.813 | 19 |
| | | | | CUAAUAUcUUAAUUGAUUg | 0.855 | 8, 19 |
| | | | | uUAAUAUGUUAAUUGAUUg | 0.909 | 1, 19 |
| CAGAUUCCACACCAUGUGG | 0.000327 | 0.402 | 0.732 | uAGAUUCCACACCAUGUGG | 0.864 | 1 |
| | | | | aAGAUUCCACACCAUGUGG | 0.923 | 1 |
| | | | | uAGAUUCCACACCAaGUGG | 1.033 | 1, 15 |
| | | | | uAGAUUCCACACCAcGUGG | 0.148 | 1, 15 |
| | | | | uAGAUcCCACACCAUGUGG | 0.061 | 1, 6 |
| GGUCCACAUUCUAUUUUAA | 0.007570 | 0.388 | 0.397 | aGUCCACAUUCUAUUUUAA | 0.628 | 1 |
| | | | | uGUCCACAUUCUAUUUUAg | 0.798 | 1, 19 |
| | | | | uGUCCACAUUCUAUUUUcg | 0.757 | 1, 18, 19 |
| CCUCACCAUCCGUCCAGUA | 0.002853 | 0.326 | 0.473 | aCUCACCAUCCGUCCAGUA | 0.653 | 1 |
| | | | | uCUCACCAUCCGUCCAGUg | 0.760 | 1, 19 |
| UGUCUACAAUCCACUGUGU | 0.008437 | 0.993 | 0.878 | UGUCUACAAaCCACUGUGU | 0.188 | 10 |
| | | | | UGUCUACAuUCCACUGUGU | 0.038 | 9 |
| AACUUCUUGGCUUUGUACU | 0.023926 | 0.995 | 0.895 | AACUUCUUGuCUUUGUACU | 0.228 | 10 |
| AACAGCUCCGGAUUCUGUG | 0.000321 | 0.978 | 0.926 | AACAGCUCCGGAUaCUGUG | 0.273 | 14 |
| | | | | AACAGCUCCcGAUUCUGUG | 0.260 | 10 |
| | | | | AACAGCUCCGGAUUaUGUG | 0.189 | 15 |
| UAGAAAUGCACACAUCACC | 0.001601 | 0.947 | 1.019 | UAGAAAUGCACAaAUCACC | 0.343 | 13 |
| AAAACUUCACUACAAAUUC | 0.008497 | 0.967 | 0.914 | AAAACUUCuCUACAAAUUC | 0.083 | 9 |
| | | | | AAAACUUCAaUACAAAUUC | 0.027 | 10 |
Sequences were taken from the Huesken data; mutated nucleotides are denoted in lower case. Target access: probability of the target site being unpaired, calculated by RNAplfold.
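The "Position of Mutation" column can be recovered by comparing each mutated antisense strand with its parent sequence. The helper below is a hypothetical illustration (the name `mutation_positions` is not from the paper); it uses 1-based positions and ignores letter case, so the lower-case marking of mutated bases is not required.

```python
def mutation_positions(original, mutated):
    """Return 1-based positions where two equal-length sequences differ."""
    if len(original) != len(mutated):
        raise ValueError("sequences must have the same length")
    return [
        index + 1
        for index, (o, m) in enumerate(zip(original, mutated))
        if o.upper() != m.upper()
    ]


# First row of the table above.
print(mutation_positions("UCCUCACCAUCCGUCCAGU", "UCCUCACCcUCCGUCCAGg"))  # [9, 19]
```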
Assessment of desiRm on experimentally verified mismatched siRNAs of siPrnp102(T9).
| Name of siRNA | siRNA sequence (antisense) Mutated sequence | # Mismatch (mRNA) | Target sequence | siRNA:Target (base mismatch position on siRNA) | Actual Efficacy | Predicted efficacy |
| siPrnp102(T9) | UGGCUUACUCAGCUUGUUC | 0 (mutant) | GAACAAGCUGAGUAAGCCA | 0 | 0.972 | 0.942 |
| siPrnp102(T9)-5U | UGGCUUACUCAGCU | 1(mutant) | GAAC | A:A(15) | 0.953 | 0.199 |
| siPrnp102(T9)-6U | UGGCUUACUCAGC | 1(mutant) | GAACA | A:A(14) | 0.864 | 0.267 |
| siPrnp102(T9)-7C | UGGCUUACUCAG | 1(mutant) | GAACAA | G:G(13) | 0.867 | 0.531 |
| siPrnp102(T9)-12C | UGGCUUA | 1(mutant) | GAACAAGCUGA | G:G(8) | 0.931 | 0.645 |
| siPrnp102(T9)-13A | UGGCUU | 1(mutant) | GAACAAGCUGAG | U:U(7) | 0.951 | 0.821 |
| siPrnp102(T9)-14U | UGGCU | 1(mutant) | GAACAAGCUGAGU | A:A(6) | 0.949 | 0.571 |
| siPrnp102(T9)-15U | UGGC | 1(mutant) | GAACAAGCUGAGUA | A:A(5) | 0.964 | 0.720 |
| siPrnp102(T9)-16C | UGG | 1(mutant) | GAACAAGCUGAGUAA | G:G(4) | 0.850 | 0.664 |
| siPrnp102(T9)-17G | UG | 1(mutant) | GAACAAGCUGAGUAAG | C:C(3) | 0.941 | 0.782 |
| siPrnp102(T9) | UGGCUUACUC | 1 (wt) | GAACAAGC | A:G (11) | 0.763 | 0.450 |
| siPrnp102(T9)-5U | UGGCUUACUC | 2 (wt) | GAAC | A:A(15)/A:G (11) | 0.513 | 0.150 |
| siPrnp102(T9)-6U | UGGCUUACUC | 2 (wt) | GAACA | A:A(14)/A:G (11) | 0.403 | 0.134 |
| siPrnp102(T9)-7C | UGGCUUACUC | 2 (wt) | GAACAA | G:G(13)/A:G (11) | 0.400 | 0.033 |
| siPrnp102(T9)-12C | UGGCUUA | 2 (wt) | GAACAAGC | G:G(8)/A:G (11) | -0.041 | 0.143 |
| siPrnp102(T9)-13A | UGGCUU | 2 (wt) | GAACAAGC | U:U(7)/A:G (11) | 0.183 | 0.286 |
| siPrnp102(T9)-14U | UGGCU | 2 (wt) | GAACAAGC | A:A(6)/A:G (11) | -0.135 | 0.176 |
| siPrnp102(T9)-15U | UGGC | 2 (wt) | GAACAAGC | A:A(5)/A:G (11) | 0.388 | 0.217 |
| siPrnp102(T9)-16C | UGG | 2 (wt) | GAACAAGC | G:G(4)/A:G (11) | 0.126 | 0.265 |
| siPrnp102(T9)-17G | UG | 2 (wt) | GAACAAGC | C:C(3)/A:G (11) | -0.063 | 0.178 |
siPrnp102(T9) and its various mutant siRNAs were targeted against the prion protein gene (PRNP) and its mutant allele (PRNP-P102L). Mutated bases in the siRNA are denoted by lower-case letters, while mismatched bases between siRNA and target are denoted by bold letters. Actual efficacy data for the siRNAs were taken from the experimental work reported by Ohnishi et al. [36]. Predicted efficacy denotes the efficacy predicted by desiRm. All sequences are in the 5′ to 3′ direction. The correlation coefficient between actual and predicted efficacy is R = 0.725.