Tian Mi, Jerlin Camilus Merlin, Sandeep Deverasetty, Michael R Gryk, Travis J Bill, Andrew W Brooks, Logan Y Lee, Viraj Rathnayake, Christian A Ross, David P Sargeant, Christy L Strong, Paula Watts, Sanguthevar Rajasekaran, Martin R Schiller.
Abstract
Minimotif Miner (MnM, available at http://minimotifminer.org or http://mnm.engr.uconn.edu) is an online database for identifying new minimotifs in protein queries. Minimotifs are short contiguous peptide sequences that have a known function in at least one protein. Here we report the third release of the MnM database, which has now grown 60-fold to approximately 300,000 minimotifs. Since short minimotifs are by their nature not very complex, we also summarize a new set of false-positive filters and linear regression scoring that vastly enhance minimotif prediction accuracy on a test data set. This online database can be used to predict new functions in proteins and causes of disease.
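To make the idea of searching a protein query for short contiguous motifs concrete, the following is a minimal sketch, not MnM's actual implementation. The pattern dictionary, the function name `scan_minimotifs`, and the query sequence are all hypothetical; the two patterns are generic textbook-style examples written as regular expressions and are not taken from the MnM database.

```python
import re

# Hypothetical consensus patterns for illustration only; "." matches any residue.
EXAMPLE_MOTIFS = {
    "PxxP": "P..P",
    "RGD": "RGD",
}


def scan_minimotifs(sequence, motifs):
    """Return (motif name, 1-based start position, matched residues) for every hit."""
    hits = []
    for name, pattern in motifs.items():
        # A lookahead with a capture group also reports overlapping occurrences.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    # Order hits by position in the query, then by motif name.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Made-up query sequence, not a real protein.
hits = scan_minimotifs("MAPPLPRGDPAAP", EXAMPLE_MOTIFS)
for name, start, residues in hits:
    print(name, start, residues)
```

Because such patterns are short and low in complexity, a scan like this matches many positions by chance, which is the reason false-positive filters and scoring are needed.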
Year: 2011 PMID: 22146221 PMCID: PMC3245078 DOI: 10.1093/nar/gkr1189
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Revised minimotif model. The key elements of the minimotif syntax are colored blue. Orange boxes indicate attributes that are unique to specific minimotif triplets. Yellow ovals are for different attributes of minimotif triplet elements. All attributes except those in the purple boxes were previously described in our minimotif model and the purple boxes are new attributes to define motif modifications and activity modifications (12).
Growth of minimotif entries in MnM
| Category | MnM | MnM 2 | MnM 3 |
|---|---|---|---|
| Total | | | |
| Motif sequences | 462 | 5089 | 294 933 |
| Consensus sequences | 312 | 858 | 880 |
| Instance sequences | 44 | 4229 | 294 053 |
| Post-translational modifications | 116 | 663 | 210 949 |
| Binding | 162 | 4689 | 4922 |
| Trafficking | 34 | 195 | 228 |
| Required for cell process | – | – | 47 |
| Unique | | | |
| Motif sequences | 312 | 2224 | 185 833 |
| Motif proteins | <312 | 1211 | 49 671 |
| Motif targets | <312 | 687 | 2620 |
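As a quick arithmetic check, the growth in total motif sequences can be computed directly from the table. The sketch below uses only the three "Motif sequences" totals listed above; the variable and function names are illustrative. The MnM 2 to MnM 3 ratio comes out at roughly 58, which appears to be the basis for the rounded "60-fold" figure in the abstract.

```python
# Total motif sequences per release, copied from the table above.
motif_sequences = {"MnM": 462, "MnM 2": 5089, "MnM 3": 294_933}


def fold_change(old, new):
    """Ratio of the newer count to the older count."""
    return new / old


fold_vs_mnm2 = fold_change(motif_sequences["MnM 2"], motif_sequences["MnM 3"])
fold_vs_mnm1 = fold_change(motif_sequences["MnM"], motif_sequences["MnM 3"])

print(f"MnM 2 -> MnM 3: {fold_vs_mnm2:.1f}-fold")
print(f"MnM   -> MnM 3: {fold_vs_mnm1:.1f}-fold")
```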
Comparison of different minimotif filters
| Minimotif filter | Area under ROC curve | | Discrimination ratio | Reference |
|---|---|---|---|---|
| Frequency score | 0.7 | 0.08 | ND | ( |
| Cellular function | 0.7 | 0.12 | 4.6 | ( |
| Cellular function + frequency score | 0.9 | 0.0002 | – | ( |
| Molecular function | 0.8 | 0.03 | 2.9 | ( |
| Molecular function + frequency score | 0.9 | 0.002 | – | ( |
| Protein–protein interaction | 0.9 | 0.001 | 12.5 | ( |
| Genetic interaction | – | – | 7.3 | Submitted for publication |
| Surface prediction filter | 0.3 | 1 | – | ( |
| Multifilter | 0.94 | 9.7 | – | Submitted for publication |
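For readers unfamiliar with the "Area under ROC curve" column, the sketch below shows one standard way to compute it: the fraction of (true, false) pairs in which the true prediction receives the higher score, with ties counted as half. The scores and labels are invented toy values, not data from the MnM test set, and the function name `roc_auc` is an assumption made for this illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from the pairwise ranking of positives vs. negatives."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Count pairs where the positive outranks the negative; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy filter scores and labels (1 = true minimotif, 0 = false positive); invented values.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to a filter that ranks no better than chance, and 1.0 to a filter that separates true minimotifs from false positives perfectly.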
Figure 2. Screenshot of minimotif filter selection page. Screenshot of MnM 3 filter section for choosing approaches for filtering out false-positive minimotifs.
Figure 3. Results table in MnM 3 from analysis of HIV-1 Nef protein. Nef (NP_057857) was analyzed to produce the minimotif predictions shown. Column 1 shows the minimotif sequence, column 2 shows the function of the minimotif, column 3 shows the amino acid position(s) for the start residue in the minimotif, column 4 shows the combined filter score, and column 5 shows the number of occurrences of each motif in the entire HIV-1 proteome. Rows colored blue are for minimotifs that are experimentally validated, yellow are above a threshold for high accuracy prediction, and red are below this threshold or do not have data to calculate a combined filter score (null).
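The row-coloring rule described in the Figure 3 caption can be written as a small decision function. This is a sketch under stated assumptions: the threshold value is not given here, so it is passed as a parameter; the function name `row_color` is hypothetical; and the code assumes that experimental validation takes precedence over the score and that a score exactly equal to the threshold is treated as not above it.

```python
from typing import Optional


def row_color(validated: bool, score: Optional[float], threshold: float) -> str:
    """Map one results row to the color scheme described in the caption."""
    if validated:
        # Assumption: experimentally validated rows are blue regardless of score.
        return "blue"
    if score is not None and score > threshold:
        # Above the high-accuracy prediction threshold.
        return "yellow"
    # At or below the threshold, or no combined filter score available (null).
    return "red"


# The threshold of 0.5 is a placeholder, not MnM's actual cutoff.
example_rows = [
    (True, 0.10),
    (False, 0.80),
    (False, 0.20),
    (False, None),
]
colors = [row_color(validated, score, threshold=0.5) for validated, score in example_rows]
print(colors)
```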