Clustering proteins into families using artificial neural networks.

E A Ferrán, P Ferrara.

Abstract

An artificial neural network was used to cluster proteins into families. The network, composed of 7 x 7 neurons, was trained with the Kohonen unsupervised learning algorithm using, as inputs, matrix patterns derived from the bipeptide composition of 447 proteins belonging to 13 different families. As a result of the training, and without any a priori indication of the number or composition of the expected families, the network self-organized the activation of its neurons into topologically ordered maps in which almost all the proteins (96.7%) were correctly clustered into the corresponding families. In a second computational experiment, a similar network was trained with one family of the previous learning set (76 cytochrome c sequences). The new neural map clustered these proteins into 25 different neurons (compared with five in the first experiment), wherein phylogenetically related sequences were positioned close to each other. This result shows that the network can adapt the clustering resolution to the complexity of the learning set, a useful feature when working with an unknown number of clusters. Although the learning stage is time-consuming, once the topological map is obtained, the classification of new proteins is very fast. Altogether, our results suggest that this novel approach may be a useful tool to organize the search for homologies in large macromolecular databases.
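
The procedure summarized in the abstract can be illustrated with a minimal sketch: each sequence is turned into a 20 x 20 bipeptide (adjacent residue pair) frequency pattern, and a small Kohonen map is trained on those patterns without family labels. This is not the authors' implementation. The function names, the unit-length normalization, the Gaussian neighborhood, the linear decay of learning rate and radius, and the toy sequences are all assumptions made for illustration; only the 7 x 7 grid size and the use of bipeptide composition come from the abstract.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def bipeptide_composition(sequence):
    """Flattened 20 x 20 matrix of adjacent-pair counts, scaled to unit length.

    Unit-length scaling is an assumption of this sketch.
    """
    counts = [0.0] * 400
    for a, b in zip(sequence, sequence[1:]):
        if a in INDEX and b in INDEX:  # skip non-standard residue codes
            counts[INDEX[a] * 20 + INDEX[b]] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts


def best_matching_unit(weights, x):
    """Grid position of the neuron whose weights are closest (Euclidean) to x."""
    best, best_d = None, float("inf")
    for pos, w in weights.items():
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = pos, d
    return best


def train_som(patterns, side=7, epochs=30, seed=0):
    """Train a side x side Kohonen map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    weights = {(r, c): [rng.random() * 0.1 for _ in range(dim)]
               for r in range(side) for c in range(side)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                         # assumed linear decay
        radius = max(1.0, (side / 2.0) * (1.0 - frac))  # assumed shrinking radius
        order = patterns[:]
        rng.shuffle(order)
        for x in order:
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-dist2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    # Made-up toy sequences, not real proteins.
    toy = ["ACACACACACAC", "ACACACGACACA", "WYWYWYWYWYWY", "WYWYKWYWYWYW"]
    patterns = [bipeptide_composition(s) for s in toy]
    som = train_som(patterns, side=7, epochs=10)
    for s, p in zip(toy, patterns):
        print(s, "->", best_matching_unit(som, p))
```

Training loops over every neuron for every pattern and epoch, which reflects the abstract's remark that learning is slow. Classifying a new protein afterwards needs only one call to `best_matching_unit`, a single pass over the map's neurons.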

Year:  1992        PMID: 1314686     DOI: 10.1093/bioinformatics/8.1.39

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  4 in total

1.  Topological maps of protein sequences.

Authors:  E A Ferrán; P Ferrara
Journal:  Biol Cybern       Date:  1991       Impact factor: 2.086

2.  Self-organizing tree-growing network for the classification of protein sequences.

Authors:  H C Wang; J Dopazo; L G de la Fraga; Y P Zhu; J M Carazo
Journal:  Protein Sci       Date:  1998-12       Impact factor: 6.725

3.  Self-organized neural maps of human protein sequences.

Authors:  E A Ferrán; B Pflugfelder; P Ferrara
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725

4.  SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model.

Authors:  Nung Kion Lee; Dianhui Wang
Journal:  BMC Bioinformatics       Date:  2011-02-15       Impact factor: 3.169
