Wei-Jie Pan, Chi-Wei Chen, Yen-Wei Chu.
Abstract
Small interfering RNA (siRNA) has been used widely to induce gene silencing in cells. To predict the efficacy of an siRNA with respect to inhibition of its target mRNA, we developed a two-layer system, siPRED, which is based on various characteristic methods in the first layer and fusion mechanisms in the second layer. Characteristic methods were constructed by support vector regression from three categories of characteristics, namely sequence, features, and rules. Fusion mechanisms considered combinations of characteristic methods in different categories and were implemented by support vector regression and neural networks to yield integrated methods. In siPRED, the prediction of siRNA efficacy through integrated methods was better than through any single characteristic method. Moreover, the weighting of each characteristic method within an integrated method was established by genetic algorithms so that the effect of each characteristic method could be revealed. Using a validation dataset, siPRED performed better than other predictive systems that used the scoring method, neural networks, or linear regression. Finally, siPRED can be improved to achieve a correlation coefficient of 0.777 when the threshold of the whole stacking energy is ≥−34.6 kcal/mol. siPRED is freely available on the web at http://predictor.nchu.edu.tw/siPRED.
Year: 2011 PMID: 22102913 PMCID: PMC3213166 DOI: 10.1371/journal.pone.0027602
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The thresholds of each feature characteristic method.
| Method | Ssingle | Sn-gram | Number of features |
| F162 | unrestricted | unrestricted | 162 |
| F85 | | | 85 |
| F65 | | | 65 |
| F47 | | | 47 |
a |r|: The absolute value of the correlation coefficient between a feature element and siRNA efficacy. All feature elements in Sthermodynamic are considered.
Pearson correlation coefficient of each characteristic method trained with dataset A.
| Method | r |
| Numeric | 0.514 |
| Binary | 0.613 |
| Hybrid | 0.612 |
| F162 | 0.602 |
| F85 | 0.634 |
| F65 | 0.627 |
| F47 | 0.615 |
| R12 | 0.569 |
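The values in this table are Pearson correlation coefficients between observed and predicted siRNA efficacy. As a minimal sketch of how such a value can be computed, the function below uses only the Python standard library. The function name `pearson_r` and the sample values are hypothetical and are not taken from the paper's datasets.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (the common 1/n factor cancels in the ratio).
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)

# Hypothetical inhibition fractions, for illustration only.
observed = [0.91, 0.35, 0.72, 0.10, 0.64]
predicted = [0.80, 0.42, 0.66, 0.25, 0.58]
print(round(pearson_r(observed, predicted), 3))
```

A value near 1 means the predicted efficacies rank and scale with the observed ones; a value near 0 means no linear relationship.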
Pearson correlation coefficient of each integrated characteristic method.
| Mechanism | Integrated method | r A | r B |
| SVR | Binary + F162 | 0.756 | 0.534 |
| SVR | Binary + F85 | 0.696 | 0.563 |
| SVR | Binary + F65 | 0.688 | 0.564 |
| SVR | Binary + F47 | 0.679 | 0.543 |
| SVR | Hybrid + F162 | 0.773 | 0.534 |
| SVR | Hybrid + F85 | 0.686 | 0.577 |
| SVR | Hybrid + F65 | 0.678 | 0.588 |
| SVR | Hybrid + F47 | 0.670 | 0.541 |
| Neural network | Binary + F162 | 0.783 | 0.566 |
| Neural network | Binary + F85 | 0.691 | 0.579 |
| Neural network | Binary + F65 | 0.686 | 0.585 |
| Neural network | Binary + F47 | 0.680 | 0.580 |
| Neural network | Hybrid + F162 | 0.784 | 0.562 |
| Neural network | Hybrid + F85 | 0.685 | 0.579 |
| Neural network | Hybrid + F65 | 0.678 | 0.586 |
| Neural network | Hybrid + F47 | 0.670 | 0.580 |
Pearson correlation coefficient of integrated methods trained with dataset A.
Pearson correlation coefficient of integrated methods validated with dataset B.
The neural network has an input layer of two nodes, a hidden layer of six nodes, and an output layer of one node.
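To make the 2-6-1 layout concrete, here is a rough sketch of a forward pass through such a network, where the two inputs stand for the predictions of two first-layer characteristic methods. The sigmoid hidden activation, the linear output, the random untrained weights, and all names are assumptions for illustration; the source does not state these details.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 2-6-1 feed-forward network.

    inputs   : two first-layer predictions, e.g. [hybrid_score, f65_score]
    w_hidden : 6 rows of 2 weights (input -> hidden)
    b_hidden : 6 hidden biases
    w_out    : 6 weights (hidden -> output)
    b_out    : output bias
    """
    # Assumed sigmoid activation in the hidden layer.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Assumed linear output node (regression).
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Untrained, randomly initialised parameters (illustration only).
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(6)]
b_hidden = [rng.uniform(-1, 1) for _ in range(6)]
w_out = [rng.uniform(-1, 1) for _ in range(6)]
b_out = 0.0

fused = forward([0.62, 0.58], w_hidden, b_hidden, w_out, b_out)
print(fused)
```

In practice the weights would be fitted on the training data; the sketch only shows how two first-layer scores flow through six hidden nodes into one fused prediction.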
Determining the weights in the integrated method of Hybrid+F65 by genetic algorithms.
| Generation | W (Hybrid) | W (F65) | r A | r B |
| 100 | 0.179 | 0.814 | 0.670 | 0.569 |
| 500 | 0.515 | 0.480 | 0.669 | 0.567 |
| 1000 | 0.365 | 0.631 | 0.671 | 0.570 |
| 2000 | 0.359 | 0.637 | 0.671 | 0.570 |
The two weights W belong to the characteristic methods Hybrid and F65, respectively, in the integrated method. r A is the correlation coefficient for Hybrid+F65 trained with dataset A, and r B is the correlation coefficient when validated with dataset B. For the genetic algorithms, the population size was 100 and the rates of one-point crossover and mutation were 0.7 and 0.001, respectively.
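The following is a minimal sketch of a genetic algorithm that searches for two fusion weights, using the population size (100), one-point crossover rate (0.7), and mutation rate (0.001) reported above. Everything else is an assumption made for illustration: the 10-bit binary encoding per weight, binary tournament selection, elitism, the toy data, and the objective. Mean squared error is used as the objective because Pearson r is unchanged when both weights are rescaled by the same positive factor, so r alone would not pin down a unique pair; the source does not say which fitness function was used.

```python
import random

POP_SIZE = 100         # population size reported in the text
CROSSOVER_RATE = 0.7   # one-point crossover rate reported in the text
MUTATION_RATE = 0.001  # per-bit mutation rate reported in the text
BITS = 10              # bits per weight (assumption)

def decode(chrom):
    """Map a 2*BITS bit list to two weights in [0, 1]."""
    scale = 2 ** BITS - 1
    w1 = int("".join(map(str, chrom[:BITS])), 2) / scale
    w2 = int("".join(map(str, chrom[BITS:])), 2) / scale
    return w1, w2

def mse(weights, p1, p2, target):
    """Mean squared error of the weighted sum w1*p1 + w2*p2 against target."""
    w1, w2 = weights
    return sum((w1 * a + w2 * b - t) ** 2
               for a, b, t in zip(p1, p2, target)) / len(target)

def evolve(p1, p2, target, generations, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(2 * BITS)] for _ in range(POP_SIZE)]

    def score(chrom):
        # Higher is better: negative error (assumed objective).
        return -mse(decode(chrom), p1, p2, target)

    for _ in range(generations):
        nxt = [max(pop, key=score)[:]]          # elitism (assumption)
        while len(nxt) < POP_SIZE:
            # Binary tournament selection (assumption).
            a = max(rng.sample(pop, 2), key=score)
            b = max(rng.sample(pop, 2), key=score)
            child = a[:]
            if rng.random() < CROSSOVER_RATE:
                cut = rng.randint(1, 2 * BITS - 1)   # one-point crossover
                child = a[:cut] + b[cut:]
            # Flip each bit independently with the mutation rate.
            child = [bit ^ 1 if rng.random() < MUTATION_RATE else bit
                     for bit in child]
            nxt.append(child)
        pop = nxt
    return decode(max(pop, key=score))

# Toy first-layer predictions (hypothetical), with a target built from
# known weights 0.4 and 0.6 so the search has something to recover.
p1 = [0.82, 0.40, 0.65, 0.20, 0.71, 0.55]
p2 = [0.78, 0.35, 0.70, 0.30, 0.60, 0.50]
target = [0.4 * a + 0.6 * b for a, b in zip(p1, p2)]
print(evolve(p1, p2, target, generations=50))
```

Because the best individual is copied unchanged into each new generation, the error of the returned weights cannot get worse as the number of generations grows.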
Figure 1. The distribution of observed versus predicted siRNA efficacy using dataset B by (A) siPRED, (B) ThermoComposition21, (C) DSIR, (D) s-Biopredsi, and (E) i-Score. ‘r’ represents the Pearson correlation coefficient.
Performance of each system for predicting siRNA efficacy (i.e., ≥70% inhibition of the target mRNA).
| Predictive system | Acc (%) | Sn (%) | Sp (%) | MCC |
| | 75.66 | 83.10 | 67.96 | 0.517 |
| | 55.61 | 15.49 | 97.09 | 0.216 |
| | 67.30 | 55.40 | 79.61 | 0.360 |
| | 70.41 | 72.77 | 67.96 | 0.407 |
| | 70.64 | 74.18 | 66.99 | 0.412 |
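The columns are accuracy (Acc), sensitivity (Sn), specificity (Sp), and the Matthews correlation coefficient (MCC). The sketch below shows how the four measures follow from a confusion matrix. The counts are not given in the source: they are back-calculated on the assumption that dataset B holds 419 siRNAs (101 + 318, the subset sizes given with the ΔG results), of which 213 are effective and 206 ineffective, and they should be read as an illustration. With those assumed counts the function reproduces the first row of the table (75.66, 83.10, 67.96, 0.517).

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity, MCC) from confusion counts."""
    total = tp + fn + tn + fp
    acc = (tp + tn) / total
    sn = tp / (tp + fn)            # fraction of effective siRNAs found
    sp = tn / (tn + fp)            # fraction of ineffective siRNAs rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, sn, sp, mcc

# Assumed counts, back-calculated from the reported percentages (see above).
acc, sn, sp, mcc = classification_metrics(tp=177, fn=36, tn=140, fp=66)
print(f"Acc={acc:.2%} Sn={sn:.2%} Sp={sp:.2%} MCC={mcc:.3f}")
```

MCC is the most informative single column here: the second row has very high specificity but an MCC of only 0.216, because its sensitivity is very low.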
Effect of the whole ΔG for each system on prediction accuracy.
| Predictive system | r (ΔG ≥ −34.6 kcal/mol) | r (ΔG < −34.6 kcal/mol) |
| siPRED | 0.777 | 0.538 |
| | 0.723 | 0.514 |
| | 0.724 | 0.498 |
| | 0.677 | 0.551 |
| | 0.733 | 0.499 |
Pearson correlation coefficient of siRNAs (total of 101) having a ΔG threshold ≥−34.6 kcal/mol with dataset B.
Pearson correlation coefficient of siRNAs (total of 318) having a ΔG threshold <−34.6 kcal/mol with dataset B.