Jay Vyas, Ronald J Nowling, Thomas Meusburger, David Sargeant, Krishna Kadaveru, Michael R Gryk, Vamsi Kundeti, Sanguthevar Rajasekaran, Martin R Schiller.
Abstract
BACKGROUND: Minimotifs are short peptide sequences within one protein that are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. Many minimotifs reported in the primary literature have yet to be annotated, and entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature.
Year: 2010 PMID: 20565705 PMCID: PMC2905367 DOI: 10.1186/1471-2105-11-328
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. General architecture of MimoSA.
Figure 2. Screenshots of MimoSA application database management windows. A. Motif Browser shows attributes of all minimotifs in the MnM database. B. Minimotif Data editing or entry form for entering information in the MnM database. C. Modification form for entering minimotif modification or activity modification attributes into MnM. D. Attribute for adding or editing supporting experimental techniques, structure accession numbers, and other auxiliary attributes including comments about the annotation.
Figure 3. Screenshot of MimoSA abstract and protein sequence viewers. A. Abstract Viewer shows the abstract of the selected paper. Words that match existing minimotif attributes are color-coded. B. Protein sequence viewer window shows the sequence of the protein having the accession number entered in the Minimotif Data entry form. Any minimotif entered in this form is highlighted in the sequence.
Figure 4. Screenshot of MimoSA paper browser and paper tracking windows. A. Paper Browser Window displays attributes for all papers in the MnM database. B. Paper Review Status Window shows the review event history of papers in the database.
Paper tracking status definitions
| Name | Description |
|---|---|
| not reviewed | For papers that have not yet been reviewed. |
| Reviewed no minimotifs | For papers that were reviewed and do not contain minimotifs. |
| reviewed for some minimotifs | For papers that were reviewed, but for which not all of the minimotifs from the paper have been annotated. |
| reviewed for all minimotifs | For papers that were reviewed and for which all minimotifs have been annotated. |
| group review | For papers with questionable interpretation that require discussion by the annotation group. |
| no electronic version | For papers for which an electronic version is not available. |
| minimotif present but not annotated | For papers that contain a minimotif but have not yet been annotated. |
Larger training set sizes (negative, positive) modestly improve algorithm performance
| Negative Papers | Positive Papers | Paper Score |
|---|---|---|
| 10 | 100 | 0.60 |
| 20 | 100 | 0.63 |
| 30 | 100 | 0.63 |
| 40 | 100 | 0.64 |
| 10 | 200 | 0.56 |
| 20 | 200 | 0.59 |
| 30 | 200 | 0.58 |
| 40 | 200 | 0.60 |
| 10 | 300 | 0.60 |
| 20 | 300 | 0.63 |
| 30 | 300 | 0.64 |
| 40 | 300 | 0.66 |
| 10 | 400 | 0.61 |
| 20 | 400 | 0.65 |
| 30 | 400 | 0.66 |
| 40 | 400 | 0.66 |
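
One way to read the trend in the table above is to average the paper score over each training set size. The following is a minimal sketch, not part of MimoSA or TextMine: the rows are copied from the table, while the helper name `mean_score_by` and the choice of a simple arithmetic mean are assumptions made for illustration.

```python
from collections import defaultdict

# Rows copied from the table above: (negative papers, positive papers, paper score).
rows = [
    (10, 100, 0.60), (20, 100, 0.63), (30, 100, 0.63), (40, 100, 0.64),
    (10, 200, 0.56), (20, 200, 0.59), (30, 200, 0.58), (40, 200, 0.60),
    (10, 300, 0.60), (20, 300, 0.63), (30, 300, 0.64), (40, 300, 0.66),
    (10, 400, 0.61), (20, 400, 0.65), (30, 400, 0.66), (40, 400, 0.66),
]


def mean_score_by(rows, column):
    """Hypothetical helper: mean paper score grouped by the given column index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[2])
    return {size: round(sum(scores) / len(scores), 4)
            for size, scores in sorted(groups.items())}


by_negative = mean_score_by(rows, 0)  # grouped by the "Negative Papers" column
by_positive = mean_score_by(rows, 1)  # grouped by the "Positive Papers" column

for size, mean in by_negative.items():
    print(f"negative={size}: mean score {mean}")
for size, mean in by_positive.items():
    print(f"positive={size}: mean score {mean}")
```

Grouped by the first column, the mean score rises steadily as the set size grows. Grouped by the second column, the trend is not monotonic: the 200-paper rows score lowest. Both patterns are consistent with a modest rather than a strong effect.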
Figure 5. ROC curve analysis of TextMine results. ROC curve as a measurement of the sensitivity and specificity of TextMine for a disjoint test set of 91 pre-scored papers. Area under curve = 0.89.
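
For readers unfamiliar with the area-under-curve statistic, the sketch below shows one standard way to compute it: as the probability that a randomly chosen positive paper scores higher than a randomly chosen negative one, with ties counted as one half. This is not TextMine's code; the function name `roc_auc` and the toy labels and scores are hypothetical and do not reproduce the 91-paper test set.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for a paper that contains minimotifs, 0 otherwise.
    scores: classifier scores, where higher means more likely positive.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, for illustration only.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.65, 0.6, 0.4, 0.3, 0.2]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported value of 0.89 indicates that positive papers usually outrank negative ones.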