Liangjiang Wang, Caiyan Huang, Jack Y Yang.
Abstract
BACKGROUND: Short interfering RNAs (siRNAs) can be used to knock down gene expression in functional genomics. For a target gene of interest, many siRNA molecules may be designed, but their efficiency of expression inhibition often varies.
Year: 2010 PMID: 21143784 PMCID: PMC2999347 DOI: 10.1186/1471-2164-11-S3-S2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Potential sequence features for siRNA classification.
| Feature group | Number of features |
|---|---|
| siRNA nucleotide sequence | 19 |
| Single-nucleotide frequencies | 4 |
| Dinucleotide frequencies | 16 |
| Trinucleotide frequencies | 64 |
| Global and local G/C contents | 16 |
| Secondary structure stability | 1 |
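The three composition groups in the table (4 + 16 + 64 = 84 features) can be illustrated with a short sketch. This is not the authors' code: it assumes each frequency is the percentage of overlapping k-mer windows in a 19-nt siRNA that match a given k-mer, which the excerpt does not specify. The function names and the example sequence are hypothetical.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is mapped to "U" below


def kmer_frequencies(seq, k):
    """Percentage of overlapping k-mer windows matching each possible k-mer.

    Assumption: normalization by the number of windows (len(seq) - k + 1);
    the source does not state how the frequencies were normalized.
    """
    seq = seq.upper().replace("T", "U")
    windows = len(seq) - k + 1
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:  # silently skip windows with ambiguous bases
            counts[kmer] += 1
    return {kmer: 100.0 * n / windows for kmer, n in counts.items()}


def composition_features(seq):
    """Single-, di-, and trinucleotide frequencies, keyed like 'UCC%'."""
    features = {}
    for k in (1, 2, 3):
        for kmer, pct in kmer_frequencies(seq, k).items():
            features[kmer + "%"] = pct
    return features


# Hypothetical 19-nt sequence, used only to exercise the functions.
example = "UCCGAUAACGUUCAGGAGA"
features = composition_features(example)
print(len(features), "composition features")
```

The keys follow the `UCC%`, `UC%`, `G%` naming used for features elsewhere on this page, so a vector built this way could be lined up against those names. The positional sequence, G/C content, and secondary structure features are not covered here.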
Important features selected by random forests.
| Feature | #RFs | Raw score | Z-score | Correlation |
|---|---|---|---|---|
| UCC% | 10 | 2.461 | 14.909 | 0.294 |
| CAG% | 10 | 1.514 | 11.849 | -0.289 |
| GAG% | 10 | 1.652 | 11.674 | -0.305 |
| UC% | 10 | 1.988 | 11.255 | 0.281 |
| GCA% | 10 | 1.140 | 11.191 | -0.265 |
| G% | 10 | 1.672 | 9.483 | -0.266 |
| CG% | 10 | 1.235 | 8.460 | 0.133 |
| AUA% | 10 | 0.524 | 8.148 | -0.166 |
| AAG% | 9 | 0.848 | 7.851 | 0.102 |
| CUG% | 10 | 0.918 | 7.240 | -0.173 |
| U% | 9 | 1.201 | 7.170 | 0.127 |
| G/C% (first 5 bases) | 10 | 1.075 | 7.116 | -0.256 |
| AUC% | 8 | 0.632 | 6.565 | 0.201 |
| AG% | 8 | 0.910 | 6.557 | -0.277 |
| GG% | 9 | 0.831 | 6.478 | -0.190 |
| GCG% | 6 | 0.554 | 6.422 | 0.059 |
| G/C% (overall) | 5 | 0.959 | 6.414 | -0.147 |
| GGA% | 7 | 0.717 | 6.326 | -0.218 |
| AAC% | 10 | 0.409 | 6.326 | 0.162 |
| UUU% | 9 | 0.714 | 6.317 | 0.108 |
| GGC% | 9 | 0.595 | 6.304 | -0.134 |
| NT3 (C) | 5 | 0.901 | 6.258 | -0.199 |
| ACA% | 8 | 0.473 | 6.218 | 0.092 |
| UUC% | 7 | 0.542 | 5.897 | 0.125 |
| CC% | 7 | 0.704 | 5.807 | 0.004 |
| CAA% | 6 | 0.432 | 5.602 | 0.129 |
Performance of support vector machine classifiers.
| Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC | AUC |
|---|---|---|---|---|---|
| RF_Features | 70.71 | 73.94 | 66.09 | 0.3983 | 0.7529 |
| All_Features | 68.93 | 76.97 | 57.39 | 0.3499 | 0.7372 |
| Seq_Features | 65.36 | 68.48 | 60.87 | 0.2912 | 0.6624 |
Figure 1. Classifier performance evaluation using ROC curves.
Figure 2. Correlation of SVM output with siRNA efficacy. The true positive (TP) and true negative (TN) predictions are shown as red circles, whereas the false positive (FP) and false negative (FN) predictions are shown as green triangles.