
Topological maps of protein sequences.

E A Ferrán, P Ferrara.

Abstract

A new method based on neural networks to cluster proteins into families is described. The network is trained with the Kohonen unsupervised learning algorithm, using matrix pattern representations of the protein sequences as inputs. The components (x, y) of these 20 × 20 matrix patterns are the normalized frequencies of all pairs xy of amino acids in each sequence. We investigate the influence of different learning parameters on the final topological maps obtained with a learning set of ten proteins belonging to three established families. In all cases, except those where the synaptic vectors remain nearly unchanged during learning, the ten proteins are correctly classified into the expected families. The classification by the trained network of mutated or incomplete sequences of the learned proteins is also analysed. The neural network gives a correct classification for a sequence mutated in 21.5% ± 7% of its amino acids and for fragments representing 7.5% ± 3% of the original sequence. Similar results were obtained with a learning set of 32 proteins belonging to 15 families. These results show that a neural network can be trained following the Kohonen algorithm to obtain topological maps of protein sequences, where related proteins are finally associated with the same winner neuron or with neighboring ones, and that the trained network can be applied to rapidly classify new sequences. This approach opens new possibilities for finding rapid and efficient algorithms to organize and search for homologies in the whole protein database.
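The two steps described in the abstract, building a 20 × 20 pair-frequency pattern and training a Kohonen map on such patterns, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Several details are assumptions, because the abstract does not specify them: pairs are taken to be adjacent residues, frequencies are normalized to sum to 1, the map is a small square grid with a square neighborhood, and the learning rate and radius decay linearly. The names `dipeptide_pattern`, `winner`, and `train_som`, and all parameter values, are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def dipeptide_pattern(sequence):
    """Flattened 20 x 20 matrix of pair frequencies for one sequence.

    Assumption: only adjacent pairs are counted, and the counts are
    divided by their total so that the 400 components sum to 1.
    """
    counts = [0.0] * 400
    for x, y in zip(sequence, sequence[1:]):
        if x in INDEX and y in INDEX:  # skip non-standard symbols
            counts[INDEX[x] * 20 + INDEX[y]] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def winner(weights, pattern):
    """Grid position of the neuron whose weight vector is closest to pattern."""
    return min(
        weights,
        key=lambda rc: sum((w - p) ** 2 for w, p in zip(weights[rc], pattern)),
    )


def train_som(patterns, rows=3, cols=3, epochs=50, alpha0=0.5, radius0=1.5, seed=0):
    """Train a small Kohonen map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    weights = {
        (r, c): [rng.random() / dim for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs   # assumed linear decay schedule
        alpha = alpha0 * frac         # learning rate
        radius = radius0 * frac       # neighborhood radius on the grid
        for pattern in patterns:
            wr, wc = winner(weights, pattern)
            for (r, c), w in weights.items():
                # Update the winner and its grid neighbors (square neighborhood).
                if max(abs(r - wr), abs(c - wc)) <= radius:
                    for i in range(dim):
                        w[i] += alpha * (pattern[i] - w[i])
    return weights


if __name__ == "__main__":
    # Toy sequences, made up for illustration only (not real proteins).
    toy = ["ACACACACAC", "CACACACA", "WYWYWYWY", "YWYWYW"]
    patterns = [dipeptide_pattern(s) for s in toy]
    som = train_som(patterns)
    for seq, pat in zip(toy, patterns):
        print(seq, "->", winner(som, pat))
```

Once trained, a new or mutated sequence would be classified by computing its pattern and looking up its winner neuron, which is the rapid classification step the abstract refers to. The grid size and decay schedule here are arbitrary and would need tuning for real data.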


Year:  1991        PMID: 1958730     DOI: 10.1007/bf00204658

Source DB:  PubMed          Journal:  Biol Cybern        ISSN: 0340-1200            Impact factor:   2.086



  4 in total

1.  SOMMER: self-organising maps for education and research.

Authors:  Michael Schmuker; Florian Schwarte; André Brück; Ewgenij Proschak; Yusuf Tanrikulu; Alireza Givehchi; Kai Scheiffele; Gisbert Schneider
Journal:  J Mol Model       Date:  2006-09-22       Impact factor: 1.810

2.  Self-organizing tree-growing network for the classification of protein sequences.

Authors:  H C Wang; J Dopazo; L G de la Fraga; Y P Zhu; J M Carazo
Journal:  Protein Sci       Date:  1998-12       Impact factor: 6.725

3.  Self-organizing hierarchic networks for pattern recognition in protein sequence.

Authors:  J Hanke; G Beckmann; P Bork; J G Reich
Journal:  Protein Sci       Date:  1996-01       Impact factor: 6.725

4.  Self-organized neural maps of human protein sequences.

Authors:  E A Ferrán; B Pflugfelder; P Ferrara
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725

