Sanguthevar Rajasekaran, Tian Mi, Jerlin Camilus Merlin, Aaron Oommen, Patrick Gradie, Martin R Schiller.
Abstract
BACKGROUND: Minimotifs are short contiguous peptide sequences in proteins that are known to have a function in at least one other protein. One of the principal limitations in minimotif prediction is that false positives limit the usefulness of this approach. As a step toward resolving this problem we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions.
Year: 2010 PMID: 20808856 PMCID: PMC2924378 DOI: 10.1371/journal.pone.0012276
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Evaluation of the cellular function filter algorithm.
| Distance | sensitivity | selectivity | DR |
| 0 | 11% | 3% | 3.8 |
| 1 | 26% | 6% | 4.6 |
| 2 | 48% | 14% | 3.4 |
| 3 | 65% | 32% | 2.0 |
| 4 | 82% | 58% | 1.4 |
| 5 | 90% | 79% | 1.2 |
Figure 1. ROC curves for minimotif filters.
ROC curves for the molecular (A) and cellular (B) function filters, together with the frequency score filter, are shown. The analysis used the minimotifs in the MnM 2 database that have known molecular and cellular functions in the GO database (A, B).
Evaluation of the molecular function filter algorithm.
| Distance | sensitivity | selectivity | DR |
| 0 | 29% | 12% | 2.3 |
| 1 | 59% | 21% | 2.9 |
| 2 | 82% | 35% | 2.3 |
| 3 | 91% | 50% | 1.8 |
| 4 | 94% | 61% | 1.6 |
| 5 | 96% | 72% | 1.3 |
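DR is not defined in this excerpt. The tabulated values sit close to sensitivity divided by selectivity, and the short Python sketch below checks that reading against the rows of the two evaluation tables. The interpretation is an assumption inferred from the numbers, not a definition taken from the paper; the small gaps would be consistent with the ratio having been computed before the percentages were rounded.

```python
# Rows copied from the two evaluation tables:
# (distance, sensitivity %, selectivity %, reported DR).
CELLULAR = [
    (0, 11, 3, 3.8),
    (1, 26, 6, 4.6),
    (2, 48, 14, 3.4),
    (3, 65, 32, 2.0),
    (4, 82, 58, 1.4),
    (5, 90, 79, 1.2),
]
MOLECULAR = [
    (0, 29, 12, 2.3),
    (1, 59, 21, 2.9),
    (2, 82, 35, 2.3),
    (3, 91, 50, 1.8),
    (4, 94, 61, 1.6),
    (5, 96, 72, 1.3),
]


def ratio_gaps(rows):
    """Return (distance, sensitivity/selectivity, reported DR, absolute gap) per row.

    Assumption: DR is compared against the plain ratio of the rounded
    percentages; the source excerpt does not define DR.
    """
    out = []
    for distance, sens, sel, dr in rows:
        ratio = sens / sel
        out.append((distance, round(ratio, 2), dr, round(abs(ratio - dr), 2)))
    return out


if __name__ == "__main__":
    for name, rows in (("cellular", CELLULAR), ("molecular", MOLECULAR)):
        for distance, ratio, dr, gap in ratio_gaps(rows):
            print(f"{name:9s} d={distance} ratio={ratio:.2f} DR={dr:.1f} gap={gap:.2f}")
```

Under this reading, DR is largest at small distances for both filters and falls toward 1 as the distance threshold is relaxed.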
Statistics for comparison of functional filters to the Frequency Score filter.
| | Cellular Function | Molecular Function | Frequency Score | MF-FS Combination | CF-FS Combination |
| | 0.72 | 0.83 | 0.72 | 0.89 | 0.87 |
| | 0.12 | 0.03 | 0.08 | 0.002 | 0.0002 |
Evaluation of the molecular function – frequency score combined filter.
| | thresholds | | |
| | 0.02 | 0.03 | 0.04 |
| | | | |
| | 28% | 28% | 28% |
| | 63% | 63% | 63% |
| | 88% | 88% | 88% |
| | | | |
| | 19% | 16% | 15% |
| | 27% | 24% | 23% |
| | 41% | 39% | 38% |
Figure 2. ROC curve for the combined filters.
Combination of molecular function and frequency score filters (A) and combination of cellular function and frequency score filters (B) are shown. These ROC curves have been obtained by combining the two pairs of filters on an either-or basis.
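The either-or combination described in this caption can be illustrated with a small Python sketch. Everything named here is hypothetical: the `Prediction` record, its fields, and the direction of each comparison (smaller distance and lower score being accepted) are assumptions made for illustration, since the excerpt does not state how each filter decides. The only thing the sketch is meant to show is the logical OR of two filters.

```python
from typing import Callable, NamedTuple


class Prediction(NamedTuple):
    """Hypothetical record for one predicted minimotif."""
    name: str
    go_distance: int        # assumed input to a function filter
    frequency_score: float  # assumed input to the frequency score filter


Filter = Callable[[Prediction], bool]


def either_or(*filters: Filter) -> Filter:
    """Combined filter: accept a prediction if any component filter accepts it."""
    return lambda p: any(f(p) for f in filters)


def function_filter(max_distance: int) -> Filter:
    # Assumption: a smaller distance means more similar function.
    return lambda p: p.go_distance <= max_distance


def frequency_filter(threshold: float) -> Filter:
    # Assumption: scores at or below the threshold are accepted.
    return lambda p: p.frequency_score <= threshold


if __name__ == "__main__":
    # Made-up predictions, for illustration only.
    predictions = [
        Prediction("a", go_distance=0, frequency_score=0.50),
        Prediction("b", go_distance=4, frequency_score=0.01),
        Prediction("c", go_distance=4, frequency_score=0.50),
    ]
    combined = either_or(function_filter(1), frequency_filter(0.02))
    print([p.name for p in predictions if combined(p)])
```

Sweeping `max_distance` and `threshold` over a grid and recording sensitivity and selectivity at each setting is one way such a combined ROC curve could be traced.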
Evaluation of the cellular function – frequency score combined filter.
| | thresholds | | |
| | 0.02 | 0.03 | 0.04 |
| | | | |
| | 17% | 17% | 17% |
| | 44% | 44% | 44% |
| | 75% | 75% | 75% |
| | 88% | 88% | 88% |
| | 95% | 95% | 95% |
| | | | |
| | 9% | 6% | 5% |
| | 12% | 9% | 8% |
| | 20% | 17% | 16% |
| | 37% | 34% | 34% |
| | 62% | 60% | 60% |
Figure 3. Image of the filter selector on the MnM website.
All filters in this paper are now included as part of the MnM website. The option to select minimotifs that have similar or dissimilar functions is implemented.
Analysis of novel queries with the cellular and molecular function filters.
| | | | Cellular function | | Molecular function | |
| Protein | RefSeq | Threshold | Total* | Retained | Total* | Retained |
| p53 | NP_035770 | 0 | 64 | 10 | 67 | 46 |
| p53 | NP_035770 | 1 | 64 | 33 | 67 | 53 |
| p53 | NP_035770 | 2 | 64 | 52 | 67 | 63 |
| p53 | NP_035770 | 3 | 64 | 61 | 67 | 64 |
| p53 | NP_035770 | 4 | 64 | 64 | 67 | 65 |
| p53 | NP_035770 | 5 | 64 | 64 | 67 | 65 |
| Cyclin A | NP_003905 | 0 | 81 | 3 | 82 | 38 |
| Cyclin A | NP_003905 | 1 | 81 | 6 | 82 | 51 |
| Cyclin A | NP_003905 | 2 | 81 | 23 | 82 | 65 |
| Cyclin A | NP_003905 | 3 | 81 | 40 | 82 | 69 |
| Cyclin A | NP_003905 | 4 | 81 | 64 | 82 | 72 |
| Cyclin A | NP_003905 | 5 | 81 | 77 | 82 | 75 |
| MSH2 | NP_000242 | 0 | 76 | 8 | 80 | 25 |
| MSH2 | NP_000242 | 1 | 76 | 15 | 80 | 52 |
| MSH2 | NP_000242 | 2 | 76 | 34 | 80 | 66 |
| MSH2 | NP_000242 | 3 | 76 | 62 | 80 | 74 |
| MSH2 | NP_000242 | 4 | 76 | 73 | 80 | 76 |
| MSH2 | NP_000242 | 5 | 76 | 75 | 80 | 77 |
*Totals do not include minimotifs for which no GO terms are assigned to the proteins.
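To make the effect of each threshold easier to compare across proteins, the counts can be converted to the percentage of predictions retained. The Python sketch below does this, assuming that the constant fourth and sixth columns of the table are the totals referred to in the footnote and that the columns after each total are the retained counts; the `QUERIES` layout and the function name are illustrative choices.

```python
# (total, retained counts at thresholds 0..5), copied from the table above.
# Assumption: the constant columns are the per-filter totals.
QUERIES = {
    "p53": {
        "cellular": (64, [10, 33, 52, 61, 64, 64]),
        "molecular": (67, [46, 53, 63, 64, 65, 65]),
    },
    "Cyclin A": {
        "cellular": (81, [3, 6, 23, 40, 64, 77]),
        "molecular": (82, [38, 51, 65, 69, 72, 75]),
    },
    "MSH2": {
        "cellular": (76, [8, 15, 34, 62, 73, 75]),
        "molecular": (80, [25, 52, 66, 74, 76, 77]),
    },
}


def percent_retained(protein, filter_name):
    """Percentage of predictions retained at thresholds 0..5 for one query."""
    total, retained = QUERIES[protein][filter_name]
    return [round(100 * r / total, 1) for r in retained]


if __name__ == "__main__":
    for protein in QUERIES:
        for filter_name in ("cellular", "molecular"):
            print(protein, filter_name, percent_retained(protein, filter_name))
```

At threshold 0, for example, the cellular function filter retains 3 of 81 Cyclin A predictions, while the molecular function filter retains 38 of 82.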