Weizhong Li, Hamish McWilliam, Mickael Goujon, Andrew Cowley, Rodrigo Lopez, William R Pearson.
Abstract
Iterative similarity searches with PSI-BLAST position-specific score matrices (PSSMs) find many more homologs than single searches, but PSSMs can be contaminated when homologous alignments are extended into unrelated protein domains, an error known as homologous over-extension (HOE). PSI-Search combines an optimal Smith-Waterman local alignment sequence search, using SSEARCH, with the PSI-BLAST profile construction strategy. An optional sequence boundary-masking procedure, which prevents alignments from being extended after they are initially included, can reduce HOE errors in the PSSM profile. Preventing HOE improves selectivity for both PSI-BLAST and PSI-Search, but PSI-Search has ~4-fold better selectivity than PSI-BLAST and similar sensitivity at 50% and 60% family coverage. PSI-Search also produces 2- to 4-fold fewer false-positives than JackHMMER, but is ~5% less sensitive.
Year: 2012 PMID: 22539666 PMCID: PMC3371869 DOI: 10.1093/bioinformatics/bts240
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (a) HOE-reduced PSI-Search iteration workflow. (b) Fraction of true-positives versus false-positives found by PSI-BLAST, PSI-BLAST HOE-reduced, PSI-Search, PSI-Search HOE-reduced, and JackHMMER. Weighted true-positives and false-positives are calculated as $\frac{1}{500}\sum_{f=1}^{500} \mathrm{tp}_f/\mathrm{total}_f$ (or $\mathrm{fp}_f/\mathrm{total}_f$), where $\mathrm{tp}_f$ (or $\mathrm{fp}_f$) is the number of true positives (or false positives) at iteration 5 and $\mathrm{total}_f$ is the total number of homologs for query $f$ in the RefProtDom benchmark database. Alignments containing HOEs with >50% of the alignment outside the homologous boundary are counted as both true and false positives.
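
The weighted measure in the caption can be illustrated with a minimal Python sketch. It averages, over all queries, the per-query count of true (or false) positives divided by that query's total number of homologs. The names `QueryResult`, `count_hit`, and `weighted_fraction` are hypothetical and do not come from PSI-Search or the benchmark code. The `count_hit` helper is one possible reading of the rule that an alignment with more than 50% of its length outside the homologous boundary counts as both a true and a false positive; the authors' actual implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    """Per-query counts at the final iteration (hypothetical container)."""
    tp: int     # true positives found for this query
    fp: int     # false positives found for this query
    total: int  # total homologs for this query in the benchmark


def count_hit(is_homolog: bool, fraction_outside: float) -> tuple[int, int]:
    """Return (tp_increment, fp_increment) for one alignment.

    Assumed reading of the caption: a homologous hit whose alignment lies
    more than 50% outside the homologous boundary counts as both.
    """
    if not is_homolog:
        return (0, 1)
    if fraction_outside > 0.5:
        return (1, 1)
    return (1, 0)


def weighted_fraction(results: list[QueryResult], kind: str) -> float:
    """Mean over queries of tp/total (kind='tp') or fp/total (kind='fp')."""
    if kind not in ("tp", "fp"):
        raise ValueError("kind must be 'tp' or 'fp'")
    if not results:
        raise ValueError("results must not be empty")
    per_query = [getattr(r, kind) / r.total for r in results]
    return sum(per_query) / len(per_query)


# Illustrative, made-up counts for two queries (not benchmark data).
example = [QueryResult(tp=5, fp=1, total=10), QueryResult(tp=10, fp=0, total=20)]
weighted_tp = weighted_fraction(example, "tp")
weighted_fp = weighted_fraction(example, "fp")
```

Dividing by each query's own total before averaging gives every query family equal weight, so large families do not dominate the curve. In the caption the average runs over 500 queries; the sketch uses the length of the input list instead.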