
Measurement of the effectiveness of transitive sequence comparison, through a third 'intermediate' sequence.

M Gerstein

Abstract

MOTIVATION: Transitive sequence matching expands the scope of sequence comparison by re-running the results of a given query against the databank as a new query. This sometimes results in the initial query sequence (Q) being related to a final match (M) indirectly, through a third, 'intermediate' sequence (Q --> I --> M). This approach has often been suggested as providing greater sensitivity in sequence comparison; however, it has not yet been possible to gauge its improvement precisely.
RESULTS: Here, this improvement is comprehensively measured by seeing what fraction of the known structural relationships transitive sequence matching can uncover beyond that found by normal pairwise comparison (i.e. direct linkage). The structural relationships are taken from a well-characterized test set, the scop classification of protein structure. Specifically, 2055 known structural similarities (called 'pairs') between distantly related proteins constitute the basic test set. To make the measurement of transitive matching properly, special data sets, called 'baseline sets', are derived from this. They consist of pairs of sequences that have a clear structural relationship that cannot be found by normal sequence comparison (i.e. they cannot be directly linked). Specifically, using standard sequence comparison protocols (FASTA with an e-value cut-off of 0.001), it is found that the baseline set consists of 1742 pairs. A third intermediate sequence can link 86 of these indirectly (5%), where this third sequence is drawn from the entire, current universe of protein sequences. The number of false positives is minimal. Furthermore, when one considers only the relationships within the test set that correspond to a close structural alignment, the coverage increases considerably. In particular, 862 of the baseline set pairs fit to better than 2.6 Å RMS, and transitive matching can find 62 of these (9%).
AVAILABILITY: All the test data, including precise similarity values calculated from structural alignment, are available in tabular format over the Web from http://bioinfo.mbb.yale.edu/align.
CONTACT: Mark.Gerstein@yale.edu
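The Q --> I --> M linkage and the coverage measurement described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names, the `(a, b, e_value)` hit format, the symmetric treatment of hits, and the use of `<=` at the cut-off are all assumptions made for illustration; only the 0.001 e-value cut-off comes from the abstract.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 0.001  # e-value cut-off quoted in the abstract


def direct_links(hits, cutoff=E_VALUE_CUTOFF):
    """Build an adjacency map from (a, b, e_value) hits at or below the cutoff.

    Assumption: a significant hit is treated as a symmetric link.
    """
    links = defaultdict(set)
    for a, b, e_value in hits:
        if a != b and e_value <= cutoff:
            links[a].add(b)
            links[b].add(a)
    return links


def transitive_intermediates(links, query, match):
    """Return every intermediate I with Q -> I and I -> M.

    Returns an empty set when Q and M are already directly linked,
    because such a pair would not belong to a baseline set.
    """
    query_hits = links.get(query, set())
    if match in query_hits:
        return set()
    return query_hits & links.get(match, set())


def transitive_coverage(links, baseline_pairs):
    """Count baseline pairs that can be linked through at least one intermediate."""
    found = sum(1 for q, m in baseline_pairs if transitive_intermediates(links, q, m))
    return found, found / len(baseline_pairs)


# Toy data with hypothetical sequence identifiers.
toy_hits = [("Q", "I", 1e-5), ("I", "M", 1e-4), ("Q", "X", 0.5)]
toy_links = direct_links(toy_hits)
print(transitive_intermediates(toy_links, "Q", "M"))
print(transitive_coverage(toy_links, [("Q", "M"), ("Q", "X")]))
```

In this toy example Q and M share the intermediate I, while the Q/X hit fails the cut-off and contributes no link, so one of the two baseline pairs is recovered.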

Year:  1998        PMID: 9789096     DOI: 10.1093/bioinformatics/14.8.707

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  15 in total

1.  The ASTRAL compendium for protein structure and sequence analysis.

Authors:  S E Brenner; P Koehl; M Levitt
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Detection of homologous proteins by an intermediate sequence search.

Authors:  Bino John; Andrej Sali
Journal:  Protein Sci       Date:  2004-01       Impact factor: 6.725

3.  Finding weak similarities between proteins by sequence profile comparison.

Authors:  Anna R Panchenko
Journal:  Nucleic Acids Res       Date:  2003-01-15       Impact factor: 16.971

4.  The Arabidopsis cytoskeletal genome.

Authors:  Richard B Meagher; Marcus Fechheimer
Journal:  Arabidopsis Book       Date:  2003-09-30

5.  Arabidopsis and the genetic potential for the phytoremediation of toxic elemental and organic pollutants.

Authors:  Christopher S Cobbett; Richard B Meagher
Journal:  Arabidopsis Book       Date:  2002-04-04

6.  Nuclear actin-related proteins as epigenetic regulators of development. (Review)

Authors:  Richard B Meagher; Roger B Deal; Muthugapatti K Kandasamy; Elizabeth C McKinney
Journal:  Plant Physiol       Date:  2005-12       Impact factor: 8.340

7.  Conserved processes and lineage-specific proteins in fungal cell wall evolution.

Authors:  Juan E Coronado; Saad Mneimneh; Susan L Epstein; Wei-Gang Qiu; Peter N Lipke
Journal:  Eukaryot Cell       Date:  2007-10-19

8.  Transitive homology-guided structural studies lead to discovery of Cro proteins with 40% sequence identity but different folds.

Authors:  Christian G Roessler; Branwen M Hall; William J Anderson; Wendy M Ingram; Sue A Roberts; William R Montfort; Matthew H J Cordes
Journal:  Proc Natl Acad Sci U S A       Date:  2008-01-28       Impact factor: 11.205

9.  Update of TTD: Therapeutic Target Database.

Authors:  Feng Zhu; BuCong Han; Pankaj Kumar; XiangHui Liu; XiaoHua Ma; Xiaona Wei; Lu Huang; YangFan Guo; LianYi Han; ChanJuan Zheng; YuZong Chen
Journal:  Nucleic Acids Res       Date:  2009-11-20       Impact factor: 16.971

10.  Functional classification of immune regulatory proteins.

Authors:  Rotem Rubinstein; Udupi A Ramagopal; Stanley G Nathenson; Steven C Almo; Andras Fiser
Journal:  Structure       Date:  2013-04-11       Impact factor: 5.006
