Philipp Blohm, Goar Frishman, Pawel Smialowski, Florian Goebels, Benedikt Wachinger, Andreas Ruepp, Dmitrij Frishman.
Abstract
Knowledge about non-interacting proteins (NIPs) is important for training algorithms that predict protein-protein interactions (PPIs) and for assessing the false positive rates of PPI detection efforts. We present the second version of Negatome, a database of proteins and protein domains that are unlikely to engage in physical interactions (available online at http://mips.helmholtz-muenchen.de/proj/ppi/negatome). Negatome is derived by manual curation of the literature and by analyzing three-dimensional structures of protein complexes. The main methodological innovation in Negatome 2.0 is the use of an advanced text mining procedure to guide the manual annotation process. Potential non-interactions were identified by a modified version of Excerbt, a text mining tool based on semantic sentence analysis. Manual verification shows that nearly half of the text mining results with the highest confidence values correspond to NIP pairs. Compared to the first version, the contents of the database have grown by over 300%.
Year: 2013 PMID: 24214996 PMCID: PMC3965096 DOI: 10.1093/nar/gkt1079
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Content of the Negatome 2.0 database
| Dataset name | Derived from | Description | Number of pairs |
|---|---|---|---|
| PDB | The PDB database | Protein pairs that are members of at least one structural complex but do not interact directly. | 4397 |
| PDB-stringent | PDB | The PDB dataset filtered against the IntAct dataset. | 4161 |
| PDB-PFAM | PDB-stringent | Non-interacting PFAM domains found in the same structural complex. | 1234 |
| Manual | Manual literature annotation | Manually annotated literature data describing the lack of protein interaction. High-throughput data are not included. | 2171 |
| Manual-stringent | Manual | The Manual dataset filtered against the IntAct dataset. | 1991 |
| Manual-PFAM | Manual-stringent | PFAM domain pairs found in the Manual dataset. | 1453 |
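
The "stringent" rows in the table are obtained by filtering a dataset against known interactions from IntAct. The following is a minimal sketch of that kind of filter, not the authors' implementation. It assumes that pairs are unordered, that proteins are identified by plain string accessions, and that both inputs are already loaded in memory; the function names `normalize` and `filter_stringent` are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# An unordered protein pair; frozenset makes (A, B) and (B, A) compare equal.
Pair = FrozenSet[str]


def normalize(a: str, b: str) -> Pair:
    """Return an order-independent representation of a protein pair."""
    return frozenset((a, b))


def filter_stringent(
    candidates: Iterable[Tuple[str, str]],
    known_interactions: Iterable[Tuple[str, str]],
) -> Set[Pair]:
    """Drop every candidate non-interacting pair that appears among
    the known interactions (assumption: exact identifier match)."""
    known = {normalize(a, b) for a, b in known_interactions}
    return {normalize(a, b) for a, b in candidates} - known


# Toy identifiers, purely illustrative.
candidates = [("P1", "P2"), ("P3", "P4"), ("P2", "P1")]
known = [("P4", "P3")]
stringent = filter_stringent(candidates, known)
```

Representing each pair as a `frozenset` removes duplicates that differ only in the order of the two proteins, so the set difference alone performs the filtering.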
Figure 1. Manual assessment of the text mining performance. The figure shows the number of sentences proposed by the text mining system that a human expert tagged as containing a negative interaction (acceptance rate) and the number of negative interactions that the human expert added from other sentences of the papers selected by the text mining system (addition rate). Both rates are displayed in relation to the confidence score calculated for the text mining results.
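
One way to tabulate acceptance against confidence score, as Figure 1 does, is sketched below. This is an illustration under assumptions, not the procedure used in the paper: the bin edges, the score range, and the function name `acceptance_by_bin` are hypothetical, and each text mining result is assumed to be a `(confidence, accepted)` pair.

```python
from bisect import bisect_right
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def acceptance_by_bin(
    results: Iterable[Tuple[float, bool]],
    edges: Sequence[float],
) -> Dict[int, Tuple[int, int, float]]:
    """Group results into confidence bins defined by sorted `edges`.

    Returns {bin_index: (accepted, proposed, accepted / proposed)}.
    Scores outside the edges are clamped into the first or last bin.
    """
    proposed: Counter = Counter()
    accepted: Counter = Counter()
    last_bin = len(edges) - 2
    for score, is_accepted in results:
        idx = bisect_right(edges, score) - 1
        idx = max(0, min(idx, last_bin))
        proposed[idx] += 1
        if is_accepted:
            accepted[idx] += 1
    return {
        idx: (accepted[idx], proposed[idx], accepted[idx] / proposed[idx])
        for idx in sorted(proposed)
    }


# Toy data, purely illustrative: (confidence score, accepted by curator).
toy = [(0.5, False), (0.7, True), (1.5, True), (2.9, True), (3.0, True)]
table = acceptance_by_bin(toy, edges=[0, 1, 2, 3])
```

Explicit bin edges with `bisect_right` avoid the floating-point surprises of dividing a score by a bin width.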
Figure 2. Flowchart explaining how Negatome 2.0 data are generated, merged with Negatome 1.0 and filtered against known interactions to produce stringent datasets.