Measurement of the effectiveness of transitive sequence comparison, through a third 'intermediate' sequence.

Abstract

MOTIVATION: Transitive sequence matching expands the scope of sequence comparison by re-running the results of a given query against the databank as a new query. This sometimes results in the initial query sequence (Q) being related to a final match (M) indirectly, through a third, 'intermediate' sequence (Q --> I --> M). This approach has often been suggested as providing greater sensitivity in sequence comparison; however, it has not yet been possible to gauge its improvement precisely.
RESULTS: Here, this improvement is comprehensively measured by seeing what fraction of the known structural relationships transitive sequence matching can uncover beyond that found by normal pairwise comparison (i.e. direct linkage). The structural relationships are taken from a well-characterized test set, the scop classification of protein structure. Specifically, 2055 known structural similarities (called 'pairs') between distantly related proteins constitute the basic test set. To make the measurement of transitive matching properly, special data sets, called 'baseline sets', are derived from this. They consist of pairs of sequences that have a clear structural relationship that cannot be found by normal sequence comparison (i.e. they cannot be directly linked). Specifically, using standard sequence comparison protocols (FASTA with an e-value cut-off of 0.001), it is found that the baseline set consists of 1742 pairs. A third intermediate sequence can link 86 of these indirectly (5%), where this third sequence is drawn from the entire, current universe of protein sequences. The number of false positives is minimal. Furthermore, when one considers only the relationships within the test set that correspond to a close structural alignment, the coverage increases considerably. In particular, 862 of the baseline set pairs fit to better than 2.6 Å RMS, and transitive matching can find 62 of these (9%).
AVAILABILITY: All the test data, including precise similarity values calculated from structural alignment, are available in tabular format over the Web from http://bioinfo.mbb.yale.edu/align.
CONTACT: Mark.Gerstein@yale.edu
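
The Q --> I --> M linkage described above can be illustrated with a minimal sketch. It is not the authors' protocol: the function names, the toy sequence identifiers, and the assumption that direct hits are symmetric are all hypothetical, and the direct hits are taken as given (for instance, from a sequence comparison run at some e-value cut-off). The last line only reproduces the arithmetic behind the reported 86 of 1742 pairs.

```python
from collections import defaultdict


def transitive_links(direct_hits, baseline_pairs):
    """Find baseline pairs (Q, M) that share at least one intermediate I.

    direct_hits: iterable of (a, b) pairs already judged to be significant
                 direct matches; treated as symmetric (an assumption).
    baseline_pairs: iterable of (q, m) pairs with no direct match.
    Returns a dict mapping each linked pair to its sorted intermediates.
    """
    neighbours = defaultdict(set)
    for a, b in direct_hits:
        neighbours[a].add(b)
        neighbours[b].add(a)

    linked = {}
    for q, m in baseline_pairs:
        # An intermediate is any sequence matched by both Q and M.
        intermediates = (neighbours[q] & neighbours[m]) - {q, m}
        if intermediates:
            linked[(q, m)] = sorted(intermediates)
    return linked


def coverage(found, total):
    """Percentage of baseline pairs recovered by transitive matching."""
    return 100.0 * found / total


# Toy data (hypothetical identifiers): Q1 and M1 share the intermediate I1;
# Q2 and M2 each have a hit, but no intermediate in common.
direct_hits = [("Q1", "I1"), ("I1", "M1"), ("Q2", "I2"), ("I3", "M2")]
baseline_pairs = [("Q1", "M1"), ("Q2", "M2")]

linked = transitive_links(direct_hits, baseline_pairs)
print(linked)
print(round(coverage(86, 1742), 1))  # figures taken from the abstract
```

Restricting the search to a single intermediate mirrors the one-step linkage in the abstract; chaining through several intermediates would be a different, more permissive procedure.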

Substances: Proteins

Year: 1998 PMID: 9789096 DOI: 10.1093/bioinformatics/14.8.707

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited by

15 in total

1. The ASTRAL compendium for protein structure and sequence analysis.

Authors: S E Brenner; P Koehl; M Levitt
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. Detection of homologous proteins by an intermediate sequence search.

Authors: Bino John; Andrej Sali
Journal: Protein Sci Date: 2004-01 Impact factor: 6.725

3. Finding weak similarities between proteins by sequence profile comparison.

Authors: Anna R Panchenko
Journal: Nucleic Acids Res Date: 2003-01-15 Impact factor: 16.971

4. The Arabidopsis cytoskeletal genome.

Authors: Richard B Meagher; Marcus Fechheimer
Journal: Arabidopsis Book Date: 2003-09-30

5. Arabidopsis and the genetic potential for the phytoremediation of toxic elemental and organic pollutants.

Authors: Christopher S Cobbett; Richard B Meagher
Journal: Arabidopsis Book Date: 2002-04-04

6. Nuclear actin-related proteins as epigenetic regulators of development. (Review)

Authors: Richard B Meagher; Roger B Deal; Muthugapatti K Kandasamy; Elizabeth C McKinney
Journal: Plant Physiol Date: 2005-12 Impact factor: 8.340

7. Conserved processes and lineage-specific proteins in fungal cell wall evolution.

Authors: Juan E Coronado; Saad Mneimneh; Susan L Epstein; Wei-Gang Qiu; Peter N Lipke
Journal: Eukaryot Cell Date: 2007-10-19

8. Transitive homology-guided structural studies lead to discovery of Cro proteins with 40% sequence identity but different folds.

Authors: Christian G Roessler; Branwen M Hall; William J Anderson; Wendy M Ingram; Sue A Roberts; William R Montfort; Matthew H J Cordes
Journal: Proc Natl Acad Sci U S A Date: 2008-01-28 Impact factor: 11.205

9. Update of TTD: Therapeutic Target Database.

Authors: Feng Zhu; BuCong Han; Pankaj Kumar; XiangHui Liu; XiaoHua Ma; Xiaona Wei; Lu Huang; YangFan Guo; LianYi Han; ChanJuan Zheng; YuZong Chen
Journal: Nucleic Acids Res Date: 2009-11-20 Impact factor: 16.971

10. Functional classification of immune regulatory proteins.

Authors: Rotem Rubinstein; Udupi A Ramagopal; Stanley G Nathenson; Steven C Almo; Andras Fiser
Journal: Structure Date: 2013-04-11 Impact factor: 5.006