Carmen M Livi, Enrico Blanzieri.
Abstract
BACKGROUND: RNA-binding proteins interact with specific RNA molecules to regulate important cellular processes. It is therefore necessary to identify the RNA interaction partners in order to understand the precise functions of such proteins. Protein-RNA interactions are typically characterized using in vivo and in vitro experiments but these may not detect all binding partners. Therefore, computational methods that capture the protein-dependent nature of such binding interactions could help to predict potential binding partners in silico.
Year: 2014 PMID: 24780077 PMCID: PMC4098778 DOI: 10.1186/1471-2105-15-123
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Dataset description
| Dataset name | Proteins | Target sequences | Description |
|---|---|---|---|
| RBP+ | 15 | 8086 | positive dataset |
| | - | 3000 | negative dataset |
| | 1 | 2151 | positive dataset |
| | 1 | 3000 | negative dataset |
| | 1 | 5951 | positive dataset |
| | 1 | 5841 | negative dataset |
A short description of the dataset compositions, the number of proteins in each dataset and the sum of the target sequences. A more detailed description of the AURA_dataset and the number of target sequences used for training can be found in Table 2, first and second column.
Performance of each method on the AURA_dataset
| AGO1 | 1824 | 0.86 | 0.85 | 0.84 | 0.83 | 0.74 | 0.62 |
| AGO2 | 207 | 0.84 | 0.83 | 0.70 | 0.80 | 0.7 | 0.61 |
| AGO4 | 270 | 0.87 | 0.84 | 0.78 | 0.82 | 0.76 | 0.62 |
| AUF1 | 1319 | 0.69 | 0.69 | 0.67 | 0.62 | 0.57 | 0.6 |
| CPEB1 | 182 | 0.69 | 0.67 | 0.59 | 0.55 | 0.62 | 0.53 |
| CPEB4 | 72 | 0.52 | 0.54 | 0.60 | 0.50 | 0.54 | 0.52 |
| CUGBP1 | 195 | 0.78 | 0.78 | 0.65 | 0.72 | 0.72 | 0.6 |
| ELAVL1 | 1262 | 0.73 | 0.73 | 0.69 | 0.68 | 0.6 | 0.61 |
| PUM1 | 420 | 0.68 | 0.68 | 0.66 | 0.68 | 0.67 | 0.64 |
| PABP | 258 | 0.57 | 0.58 | 0.52 | 0.52 | 0.52 | 0.51 |
| QKI | 710 | 0.87 | 0.86 | 0.86 | 0.83 | 0.78 | 0.76 |
| TNRC6A | 246 | 0.87 | 0.83 | 0.79 | 0.82 | 0.67 | 0.67 |
| TNRC6B | 742 | 0.86 | 0.86 | 0.82 | 0.83 | 0.70 | 0.68 |
| TNRC6C | 151 | 0.80 | 0.80 | 0.68 | 0.77 | 0.70 | 0.61 |
| U2AF65 | 228 | 0.73 | 0.73 | 0.67 | 0.71 | 0.64 | 0.64 |
| Mean ± sd | | 0.75 ± 0.11 | 0.75 ± 0.10 | 0.70 ± 0.09 | 0.71 ± 0.11 | 0.66 ± 0.07 | 0.61 ± 0.06 |
The table lists the RBPs, the number of sequences, and the AUC of each method on the AURA_dataset. The AUCs were calculated in 10-fold cross-validations at a sequence identity of 80%. In all cases the negatives are provided by 3K- (see Evaluation, Evaluation 1). Data are reported as mean ± standard deviation (sd).
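As an illustration of the two quantities reported in this table, the following minimal sketch computes an AUC from classifier scores and summarizes several AUCs as mean ± sd. It is not the authors' evaluation code: the function names `auc` and `mean_sd` are hypothetical, the AUC is computed with the standard pairwise-ranking definition, and the use of the sample standard deviation is an assumption, since the source does not say which estimator was used.

```python
import statistics


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (binding target) or 0 (negative); ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_sd(values):
    """Mean and sample standard deviation (assumed estimator) of a list of AUCs."""
    return statistics.mean(values), statistics.stdev(values)


# Toy scores, not data from the paper.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of positives from negatives, which is the scale on which the per-protein values above should be read.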
Figure 1. ROC curves showing the performance of tetranucleotide frequency-based discrimination. The ROC curves show the performance in 10-fold cross-validations for the Oli method on the AURA_dataset and on PUM2+. The negative data are in both cases provided by 3K-. The further the ROC curve advances towards the upper-left corner, the better the classification ability of the model. A curve near the 45-degree diagonal reflects random classification.
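To make the notion of tetranucleotide frequencies concrete, here is a minimal sketch that turns an RNA sequence into a 256-dimensional frequency vector. The details are assumptions rather than the published Oli feature extraction: overlapping windows, normalization by the number of valid windows, and skipping windows with non-ACGU characters are illustrative choices, and the name `tetranucleotide_frequencies` is hypothetical.

```python
from itertools import product

ALPHABET = "ACGU"
# All 4^4 = 256 tetranucleotides in a fixed lexicographic order.
TETRAMERS = ["".join(p) for p in product(ALPHABET, repeat=4)]


def tetranucleotide_frequencies(seq):
    """Relative frequency of each tetranucleotide over overlapping windows."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skip windows containing ambiguous symbols such as N
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]


# Toy sequence with five overlapping windows: UGUA, GUAU, UAUA, AUAU, UAUA.
vec = tetranucleotide_frequencies("UGUAUAUA")
```

A vector of this kind can be passed to any standard classifier; the resulting scores are what a ROC curve such as the one in Figure 1 is drawn from.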
Performance of each method on PUM2+ in combination with two different negative datasets
| AUC | 0.77 | 0.77 | 0.74 | 0.75 | 0.73 | 0.52 |
| Prec | 0.73 | 0.73 | 0.69 | 0.59 | 0.48 | 0.42 |
| AUC | 0.84 | 0.84 | 0.82 | 0.83 | 0.77 | 0.56 |
| Prec | 0.80 | 0.80 | 0.74 | 0.68 | 0.47 | 0.40 |
The table shows the performance of each method on two different datasets: one pairing PUM2+ with the experimental negatives (PUM2-) and one pairing PUM2+ with randomly selected 3’-UTRs (3K-). AUC and precision (Prec) values were calculated in a 10-fold cross-validation.
Figure 2. Precision-recall curves of Oli and RNAcontext. The PR curves show the performance of the Oli method (red line) and RNAcontext (green line) on experimental data from PUM2+ and PUM2- using 10-fold cross-validations. The further the curve advances towards the upper-right corner, the better the classification ability of the model. A more detailed explanation is provided in Additional file 3.
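For readers unfamiliar with how a precision-recall curve is obtained, the sketch below derives its points from classifier scores by sweeping the decision threshold from the highest score downwards. It is a generic illustration with toy data, not the procedure or data behind Figure 2; the function name `precision_recall_points` is hypothetical.

```python
def precision_recall_points(labels, scores):
    """Return (recall, precision) pairs, one per distinct score threshold.

    labels are 1 (positive) or 0 (negative). Thresholds are visited from the
    highest score to the lowest, and equal scores are handled together.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Everything scoring at this threshold is predicted positive at once.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / n_pos, tp / (tp + fp)))
    return points


# Toy scores, not data from the paper.
pr_points = precision_recall_points([1, 1, 0, 1, 0, 0],
                                    [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
```

Each point pairs the fraction of true targets recovered (recall) with the fraction of predictions that are correct (precision), so a curve that stays high while moving right indicates few false positives even at high recall.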