
Exploring the nonlinear geometry of protein homology.

Michael A Farnum, Huafeng Xu, Dimitris K Agrafiotis.

Abstract

The explosion of biological data resulting from genomic and proteomic research has created a pressing need for data analysis techniques that work effectively on a large scale. An area of particular interest is the organization and visualization of large families of protein sequences. An increasingly popular approach is to embed the sequences into a low-dimensional Euclidean space in a way that preserves some predefined measure of sequence similarity. This method has been shown to produce maps that exhibit global order and continuity and reveal important evolutionary, structural, and functional relationships between the embedded proteins. However, protein sequences are related by evolutionary pathways that exhibit highly nonlinear geometry, which is invisible to classical embedding procedures such as multidimensional scaling (MDS) and nonlinear mapping (NLM). Here, we describe the use of stochastic proximity embedding (SPE) for producing Euclidean maps that preserve the intrinsic dimensionality and metric structure of the data. SPE extends previous approaches in two important ways: (1) It preserves only local relationships between closely related sequences, thus allowing the map to unfold and reveal its intrinsic dimension, and (2) it scales linearly with the number of sequences and therefore can be applied to very large protein families. The merits of the algorithm are illustrated using examples from the protein kinase and nuclear hormone receptor superfamilies.
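The abstract does not state the SPE update rule, so the following is only a minimal sketch of the pairwise refinement scheme as it is commonly described for SPE, not the authors' implementation. The function name `spe_embed`, the `dist` callable (standing in for whatever sequence dissimilarity measure is chosen), and all parameter names and default values are assumptions made for illustration. It shows the two properties named above: distances are enforced exactly only for pairs within a neighborhood cutoff, with larger distances acting only as lower bounds, and the cost depends on the number of sampled pairs rather than on all pairs.

```python
import math
import random


def spe_embed(dist, n_points, dim=2, cutoff=1.0, n_cycles=50,
              steps_per_cycle=2000, lam_start=1.0, lam_end=0.01,
              eps=1e-9, seed=0):
    """Sketch of stochastic proximity embedding (all names are assumptions).

    dist(i, j) returns the input dissimilarity between items i and j.
    Returns a list of n_points coordinate lists, each of length dim.
    """
    rng = random.Random(seed)
    # Start from random coordinates in the unit hypercube.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    lam = lam_start
    delta = (lam_start - lam_end) / max(n_cycles - 1, 1)

    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            # Sample one random pair; no full distance matrix is needed.
            i, j = rng.sample(range(n_points), 2)
            r = dist(i, j)
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(coords[i], coords[j])))
            # Local pairs are always adjusted toward r. Distant pairs are
            # adjusted only when the map places them closer than r.
            if r <= cutoff or d < r:
                scale = lam * 0.5 * (r - d) / (d + eps)
                for k in range(dim):
                    diff = coords[i][k] - coords[j][k]
                    coords[i][k] += scale * diff
                    coords[j][k] -= scale * diff
        # Linearly anneal the learning rate between cycles.
        lam -= delta
    return coords
```

With `steps_per_cycle` chosen proportional to the number of sequences, the total work grows linearly with the data set size, which is the scaling behavior the abstract attributes to SPE.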

Year: 2003; PMID: 12876310; PMCID: PMC2323947; DOI: 10.1110/ps.0379403

Source DB: PubMed; Journal: Protein Sci; ISSN: 0961-8368; Impact factor: 6.725


