
Self-organizing hierarchic networks for pattern recognition in protein sequence.

J Hanke, G Beckmann, P Bork, J G Reich.

Abstract

We present a method based on hierarchical self-organizing maps (SOMs) for recognizing patterns in protein sequences. The method is fully automatic, does not require prealigned sequences, is insensitive to redundancy in the training set, and works surprisingly well even with small learning sets. Because it uses unsupervised neural networks, it is able to extract patterns that are not present in all of the unaligned sequences of the learning set. The identification of these patterns in sequence databases is sensitive and efficient. The procedure comprises three main training stages. In the first stage, one SOM is trained to extract common features from the set of unaligned learning sequences. A feature is a number of ungapped sequence segments (usually 4-16 residues long) that are similar to segments in most of the sequences of the learning set according to an initial similarity matrix. In the second training stage, the recognition of each individual feature is refined by selecting an optimal weighting matrix out of a variety of existing amino acid similarity matrices. In a third stage of the SOM procedure, the position of the features in the individual sequences is learned. This allows for variants with feature repeats and feature shuffling. The procedure has been successfully applied to a number of notoriously difficult cases with distinct recognition problems: helix-turn-helix motifs in DNA-binding proteins, the CUB domain of developmentally regulated proteins, and the superfamily of ribokinases. A comparison with the established database search procedure PROFILE (and with several others) led to the conclusion that the new automatic method performs satisfactorily.
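The first training stage described above, in which a single SOM groups similar ungapped segments drawn from unaligned sequences, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-dimensional map, a fixed segment width, and a one-hot residue encoding (equivalent to an identity similarity matrix) in place of the initial similarity matrix mentioned in the abstract. The function names, the toy sequences, and all parameter values are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def windows(seq, width):
    """Return every ungapped segment of the given width from one sequence."""
    return [seq[i:i + width] for i in range(len(seq) - width + 1)]


def encode(segment):
    """One-hot encode a segment, 20 values per residue.

    Assumption: identity similarity between residues. Symbols outside the
    20 standard amino acids encode as all zeros.
    """
    vec = []
    for aa in segment:
        vec.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return vec


def best_unit(units, vec):
    """Index of the map unit closest to vec (squared Euclidean distance)."""
    return min(
        range(len(units)),
        key=lambda k: sum((w - x) ** 2 for w, x in zip(units[k], vec)),
    )


def train_som(sequences, width=4, n_units=6, epochs=30, seed=0):
    """Train a 1-D SOM on all segments of the unaligned input sequences."""
    rng = random.Random(seed)
    data = [encode(s) for seq in sequences for s in windows(seq, width)]
    dim = 20 * width
    units = [[rng.random() * 0.1 for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                        # decaying learning rate
        radius = max(n_units / 2 * (1.0 - frac), 0.5)    # shrinking neighborhood
        rng.shuffle(data)
        for vec in data:
            win = best_unit(units, vec)
            for k, unit in enumerate(units):
                # Gaussian neighborhood: units near the winner move more.
                h = math.exp(-((k - win) ** 2) / (2 * radius ** 2))
                for j in range(dim):
                    unit[j] += rate * h * (vec[j] - unit[j])
    return units


# Toy input: three unaligned sequences sharing the segment "TAYIA".
toy_sequences = ["MKTAYIAKQR", "GGTAYIAPLW", "TAYIAHHHEE"]
toy_units = train_som(toy_sequences)
shared_unit = best_unit(toy_units, encode("TAYI"))
```

After training, segments assigned to the same unit form a candidate feature in the sense of the abstract. The later stages (choosing a weighting matrix per feature and learning feature positions) are not covered by this sketch.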


Year: 1996   PMID: 8771198   PMCID: PMC2143234   DOI: 10.1002/pro.5560050109

Source DB: PubMed   Journal: Protein Sci   ISSN: 0961-8368   Impact factor: 6.725


