Xue-wen Chen, Jong Cheol Jeong, Patrick Dermyer.
Abstract
KUPS (The University of Kansas Proteomics Service) provides high-quality protein-protein interaction (PPI) data for researchers developing and evaluating computational models for predicting PPIs by allowing users to construct ready-to-use data sets of interacting protein pairs (IPPs), non-interacting protein pairs (NIPs) and associated features. Multiple filters and options allow the user to control the make-up of the IPPs and NIPs as well as the quality of the resultant data sets. Each data set is built from the overall database, which includes 185 446 IPPs and ∼1.5 billion NIPs from five primary databases: IntAct, HPRD, MINT, UniProt and the Gene Ontology. The IPP set can be restricted to specific model organisms, interaction types and experimental evidence. The NIP set can be generated using four different strategies, which can alleviate biased estimation problems. Lastly, multiple features can be provided for all of the IPP and NIP pairs. Additionally, KUPS provides two benchmark data sets to help researchers compare their algorithms to existing approaches. KUPS is freely available at http://www.ittc.ku.edu/chenlab.
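The abstract does not spell out the four NIP strategies. As a rough illustration of the general idea only, the sketch below uses random pairing, a strategy commonly seen in PPI-prediction work: proteins are paired at random and any pair already known to interact is excluded. This is an assumption made for illustration and is not claimed to be one of the KUPS strategies; the function name `sample_nips` and the protein identifiers are hypothetical.

```python
import itertools
import random


def sample_nips(proteins, ipps, n, seed=0):
    """Sample n candidate non-interacting pairs (NIPs) by random pairing.

    Hypothetical helper, not part of KUPS. Pairs are unordered, so
    (A, B) and (B, A) count as the same pair.
    """
    # Known interacting pairs, stored order-independently.
    known = {frozenset(pair) for pair in ipps}

    # Every unordered pair of distinct proteins that is not a known IPP.
    # Enumerating all pairs is only practical for small protein sets.
    candidates = [
        pair
        for pair in itertools.combinations(sorted(proteins), 2)
        if frozenset(pair) not in known
    ]
    if n > len(candidates):
        raise ValueError("not enough candidate pairs to sample from")

    # A seeded generator keeps the sampled data set reproducible.
    rng = random.Random(seed)
    return rng.sample(candidates, n)


if __name__ == "__main__":
    proteins = ["P1", "P2", "P3", "P4"]
    ipps = [("P1", "P2"), ("P3", "P2")]
    print(sample_nips(proteins, ipps, n=3))
```

Random pairing treats every unobserved pair as a negative, so some sampled pairs may in fact interact; that risk is one reason a resource might offer several alternative strategies.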
Year: 2010 PMID: 20952400 PMCID: PMC3013794 DOI: 10.1093/nar/gkq943
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Comparing three databases
| Functions | Negatome | GRIP | KUPS |
|---|---|---|---|
| Users create IPPs? | No | Yes | Yes |
| Maximum no. of possible IPPs | 0 | 10 994 | 185 446 |
| Maximum no. of possible NIPs | 1892 | 319 855 | 1.5 billion |
| Un-biased NIPs? | Yes | No | Yes |
| No. of model organisms | NA | 1 | 8 |
| Choice of data quality and interaction types | No | No | Yes |
| Methods to create NIPs | Literature curation and structural information | Sub-cellular localization | Four strategies |
| Benchmarks | No | No | Yes |
| Feature extraction | No | No | Yes |
| Ready-to-use? | No | No | Yes |
a. Negative PPIs obtained from literature curation are extracted from mammalian proteins (mostly from human data); negative PPIs obtained from PDB are extracted from mammals (47%) and other species (53%).
Figure 1. Structure of KUPS.
Figure 2. KUPS PPI distributions for model organisms.
Figure 3. Workflow diagram in KUPS.