Iain Melvin, Jason Weston, Christina Leslie, William Stafford Noble.
Abstract
We present a large-scale implementation of the Rankprop protein homology ranking algorithm in the form of an openly accessible web server. We use the NRDB40 PSI-BLAST all-versus-all protein similarity network of 1.1 million proteins to construct the graph for the Rankprop algorithm; previously, results had been reported only for a database of 108 000 proteins. We also describe two improvements to the original algorithm, propagation from multiple homologs of the query and better normalization of ranking scores, which lead to higher accuracy and to scores with a probabilistic interpretation. Availability: The Rankprop web server and source code are available at http://rankprop.gs.washington.edu
Year: 2008 PMID: 18990723 PMCID: PMC2638939 DOI: 10.1093/bioinformatics/btn567
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Ranking accuracy
| Method | Family ROC1 | Family ROC50 | S-Fam ROC1 | S-Fam ROC50 |
|---|---|---|---|---|
| PSI-BLAST | 0.833 | 0.851 | 0.609 | 0.628 |
| RankProp SWISSPROT | 0.816 | 0.906 | 0.592 | 0.725 |
| RankProp NRDB40 | 0.872 | 0.923 | 0.696 | 0.779 |
| RankProp+homologs NRDB40 | 0.884 | 0.928 | 0.710 | 0.775 |
*Indicates pairs of values that are not significantly different at P < 0.01 (Wilcoxon signed-rank test).
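The table reports ROC1 and ROC50 without defining them. The sketch below assumes the commonly used ROC-n score: the area under the ROC curve up to the n-th false positive, normalized to lie between 0 and 1. The function name `roc_n`, its arguments, and the handling of lists with fewer than n false positives are illustrative assumptions, not taken from the Rankprop source code.

```python
from typing import Sequence


def roc_n(ranked_labels: Sequence[bool], n: int, total_positives: int) -> float:
    """ROC-n score for one ranked result list (hypothetical helper).

    ranked_labels: True for a homolog, False for a non-homolog, best hit first.
    n: number of false positives at which the curve is truncated.
    total_positives: number of known homologs for this query.
    """
    if n <= 0 or total_positives <= 0:
        raise ValueError("n and total_positives must be positive")

    true_pos = 0
    false_pos = 0
    area = 0  # sum of true positives seen before each false positive
    for is_homolog in ranked_labels:
        if is_homolog:
            true_pos += 1
        else:
            false_pos += 1
            area += true_pos
            if false_pos == n:
                break

    # Assumption: if the list holds fewer than n false positives, the missing
    # ones are treated as ranked after every retrieved hit.
    area += (n - false_pos) * true_pos
    return area / (n * total_positives)


labels = [True, False, True, True, False]
print(roc_n(labels, n=1, total_positives=3))
print(roc_n(labels, n=2, total_positives=3))
```

Under this definition a per-method value such as those in the table would be an average of `roc_n` over all queries; that averaging step is likewise an assumption here.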
Fig. 1. Combined ROC curve across multiple queries. For each method, search results from 3083 queries were sorted into a single list. The figure plots, for varying thresholds in the ranked list, the fraction of all known homologs (SCOP superfamily members) that fall above the threshold, as a function of the number of non-superfamily members above the threshold.
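The pooling procedure described in the caption can be sketched as follows. This is a minimal illustration rather than the authors' plotting code: the function name `combined_roc_curve`, the `(score, is_homolog)` tuple layout, and the toy scores are all hypothetical.

```python
from typing import Iterable, List, Tuple


def combined_roc_curve(
    hits: Iterable[Tuple[float, bool]], total_homologs: int
) -> List[Tuple[int, float]]:
    """Pool hits from all queries and trace one ROC curve.

    hits: (score, is_homolog) pairs gathered from every query; higher is better.
    total_homologs: number of known homolog pairs across all queries,
        including any that no search retrieved.
    Returns one (false_positive_count, homolog_fraction) point per threshold.
    """
    if total_homologs <= 0:
        raise ValueError("total_homologs must be positive")

    # Sort the pooled hits into a single list, best score first.
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)

    points = []
    true_pos = 0
    false_pos = 0
    for _score, is_homolog in ranked:
        if is_homolog:
            true_pos += 1
        else:
            false_pos += 1
        # x: non-homologs above the threshold; y: fraction of homologs above it.
        points.append((false_pos, true_pos / total_homologs))
    return points


toy_hits = [(0.9, True), (0.8, False), (0.7, True), (0.4, True), (0.2, False)]
for x, y in combined_roc_curve(toy_hits, total_homologs=4):
    print(x, y)
```

Pooling into one list only makes sense when scores are comparable across queries, which is presumably why the normalization of ranking scores mentioned in the abstract matters for this figure.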