Usha K Muppirala, Vasant G Honavar, Drena Dobbs.
Abstract
BACKGROUND: RNA-protein interactions (RPIs) play important roles in a wide variety of cellular processes, ranging from transcriptional and post-transcriptional regulation of gene expression to host defense against pathogens. High-throughput experiments to identify RNA-protein interactions are beginning to provide valuable information about the complexity of RNA-protein interaction networks, but they are expensive and time-consuming. Hence, there is a need for reliable computational methods for predicting RNA-protein interactions.
Year: 2011 PMID: 22192482 PMCID: PMC3322362 DOI: 10.1186/1471-2105-12-489
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Performance evaluation of RPISeq
| Dataset | Classifier | Accuracy % | Precision | Recall | F-measure |
|---|---|---|---|---|---|
| RPI2241 | Random Forest | 89.6 | 0.89 | 0.90 | 0.90 |
| RPI2241 | SVM | 87.1 | 0.87 | 0.88 | 0.87 |
| RPI369 | Random Forest | 76.2 | 0.75 | 0.78 | 0.77 |
| RPI369 | SVM | 72.8 | 0.73 | 0.73 | 0.73 |
Results of 10-fold cross-validation experiments using RPI2241 and RPI369 datasets.
See Methods for definitions of performance measures.
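The Methods section is not reproduced here, so the following is only a minimal sketch of the standard definitions of the four measures reported in the table, computed from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical, and the paper's own definitions may differ in detail.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy (%), precision, recall and F-measure.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives. Standard textbook definitions
    are assumed; they are not taken from the paper's Methods section.
    """
    total = tp + fp + tn + fn
    accuracy = 100.0 * (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2.0 * precision * recall / denom if denom else 0.0
    return {
        "accuracy_pct": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only (not data from the paper).
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
```

Under these definitions, accuracy is reported on a 0-100 scale and the other three measures on a 0-1 scale, matching the layout of the table.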
Figure 1. Performance of RPISeq. Receiver operating characteristic (ROC) curves for RPI predictions, illustrating the trade-off between true positive rate and false positive rate for the RPISeq-RF (random forest) and RPISeq-SVM (support vector machine) classifiers on two datasets, RPI2241 and RPI369. The area under the curve (AUC) of each ROC curve is shown next to the curve. The AUC is 1 for a perfect classifier and 0.5 for a random classifier.
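To make the ROC and AUC description concrete, here is a small sketch that builds ROC points by sweeping a decision threshold over classifier scores and integrates them with the trapezoidal rule. The function names `roc_points` and `auc` and the toy labels and scores are hypothetical; this is not the evaluation code used for RPISeq.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels: 1 for interacting pairs, 0 for non-interacting pairs.
    scores: predicted interaction probabilities (higher = more likely).
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data, for illustration only.
perfect = auc(roc_points([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))
uninformative = auc(roc_points([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

With these toy inputs, a classifier that ranks every positive above every negative reaches an AUC of 1, while one that assigns the same score to every pair gives 0.5, the two reference values named in the caption.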
RPISeq predictions on NPInter dataset using RF and SVM classifiers trained on RPI2241
| Organism | Total RPI pairs | Pairs predicted by RF (%) | Pairs predicted by SVM (%) |
|---|---|---|---|
| | 1189 | 888 (74.7) | 681 (57.3) |
| | 254 | 249 (98.0) | 252 (99.2) |
| | 120 | 98 (81.7) | 85 (70.8) |
| | 81 | 80 (98.8) | 72 (88.9) |
| | 37 | 34 (91.9) | 25 (67.6) |
RPISeq predictions on interactions derived from the NPInter database for five model organisms.
RPISeq predictions on NPInter dataset using RF and SVM classifiers trained on RPI369
| Organism | Total RPI pairs | Pairs predicted by RF (%) | Pairs predicted by SVM (%) |
|---|---|---|---|
| | 1189 | 808 (68.0) | 988 (83.1) |
| | 254 | 168 (66.1) | 226 (89.0) |
| | 120 | 81 (67.5) | 111 (92.5) |
| | 81 | 38 (46.9) | 53 (65.4) |
| | 37 | 20 (54.0) | 24 (64.9) |
RPISeq predictions on interactions derived from the NPInter database for five model organisms.
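The percentages in the two NPInter tables are the share of known interacting pairs that each classifier predicted as interacting. The sketch below recomputes them from the counts in the tables. The counts are copied from the tables, while the names `percent_recovered` and `summarize` are hypothetical. Rows are keyed by total pair count because the organism names are not present in this record.

```python
# (total RPI pairs, pairs predicted by RF, pairs predicted by SVM),
# copied from the two NPInter tables above.
TRAINED_ON_RPI2241 = [
    (1189, 888, 681),
    (254, 249, 252),
    (120, 98, 85),
    (81, 80, 72),
    (37, 34, 25),
]
TRAINED_ON_RPI369 = [
    (1189, 808, 988),
    (254, 168, 226),
    (120, 81, 111),
    (81, 38, 53),
    (37, 20, 24),
]


def percent_recovered(predicted, total):
    """Percentage of known interactions predicted as interacting."""
    return round(100.0 * predicted / total, 1)


def summarize(rows):
    """Return (total, RF %, SVM %) for each row of a table."""
    return [
        (total, percent_recovered(rf, total), percent_recovered(svm, total))
        for total, rf, svm in rows
    ]


for name, rows in [("RPI2241", TRAINED_ON_RPI2241), ("RPI369", TRAINED_ON_RPI369)]:
    for total, rf_pct, svm_pct in summarize(rows):
        print(f"trained on {name}: {total} pairs, RF {rf_pct}%, SVM {svm_pct}%")
```

Because every NPInter pair is a known interaction, these values measure recovery of positives only and say nothing about false positives. The recomputed values should match the parenthesized percentages in the tables up to rounding.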
Figure 2A. Predicted interactions using classifiers trained on the RPI2241 dataset. Among 254 known interactions, RPISeq-RF and RPISeq-SVM classifiers correctly predicted all except 5 and 2 edges, respectively. A protein hub, highlighted in yellow, shows interactions of a helicase (SEN1) with several snoRNAs. One of several RNA hubs, highlighted in purple, illustrates interactions of an snRNA (u4560) with various Sm-like proteins in the LSM complex.
B. Predicted interactions using classifiers trained on the RPI369 dataset. Among 254 known interactions, the RPISeq-RF classifier correctly predicted 168 (66%) and the RPISeq-SVM classifier correctly predicted 226 (89%). A protein hub, highlighted in yellow, shows interactions of a helicase (SEN1) with 8 snoRNAs. One of several RNA hubs, highlighted in purple, illustrates interactions of an snRNA (u4560) with various Sm-like proteins in the LSM complex.
C. An enlarged view of the protein (SEN1) and RNA (snRNA) hubs described in B. Edges are labelled with the interaction probabilities predicted by the RPISeq-RF (left) and RPISeq-SVM (right) classifiers, providing estimates of the relative pairwise interaction propensities.