
Protein classification artificial neural system.

C Wu, G Whitson, J McLarty, A Ermongkonchai, T C Chang.

Abstract

A neural network classification method is developed as an alternative approach to the large database search/organization problem. The system, termed Protein Classification Artificial Neural System (ProCANS), has been implemented on a Cray supercomputer for rapid superfamily classification of unknown proteins based on the information content of the neural interconnections. The system employs an n-gram hashing function that is similar to the k-tuple method for sequence encoding. A collection of modular back-propagation networks is used to store the large amount of sequence patterns. The system has been trained and tested with the first 2,148 of the 8,309 entries of the annotated Protein Identification Resource protein sequence database (release 29). The entries included the electron transfer proteins and the six enzyme groups (oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases), with a total of 620 superfamilies. After a total training time of seven Cray central processing unit (CPU) hours, the system has reached a predictive accuracy of 90%. The classification is fast (i.e., 0.1 Cray CPU second per sequence), as it only involves a forward-feeding through the networks. The classification time on a full-scale system embedded with all known superfamilies is estimated to be within 1 CPU second. Although the training time will grow linearly with the number of entries, the classification time is expected to remain low even if there is a 10-100-fold increase of sequence entries. The neural database, which consists of a set of weight matrices of the networks, together with the ProCANS software, can be ported to other computers and made available to the genome community. The rapid and accurate superfamily classification would be valuable to the organization of protein sequence databases and to the gene recognition in large sequencing projects.
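The n-gram encoding mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the n-gram size, the alphabet, or the normalization used by ProCANS, so the 20-letter amino acid alphabet, the default `n=2`, the base-20 slot index, and the function name `ngram_vector` are all assumptions made for illustration only. The idea shown is that a sequence of any length maps to a fixed-length vector of n-gram frequencies, which could then serve as input to a feed-forward network.

```python
# Hypothetical sketch of n-gram (k-tuple) sequence encoding.
# Alphabet, n, and normalization are assumptions, not taken from the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def ngram_vector(sequence, n=2):
    """Map a protein sequence to a fixed-length vector of n-gram frequencies.

    Each overlapping n-gram is hashed to a slot by reading it as a base-20
    number. N-grams containing letters outside the alphabet are skipped.
    """
    size = len(AMINO_ACIDS) ** n
    vec = [0.0] * size
    total = 0
    for i in range(len(sequence) - n + 1):
        gram = sequence[i:i + n]
        if any(c not in INDEX for c in gram):
            continue  # skip ambiguous or unknown residues
        slot = 0
        for c in gram:
            slot = slot * len(AMINO_ACIDS) + INDEX[c]
        vec[slot] += 1.0
        total += 1
    if total:
        # Normalize so that sequences of different lengths are comparable.
        vec = [v / total for v in vec]
    return vec


if __name__ == "__main__":
    v = ngram_vector("ACAC", n=2)
    # "AC" occurs twice and "CA" once among the three overlapping 2-grams.
    print(len(v), v[INDEX["A"] * 20 + INDEX["C"]], v[INDEX["C"] * 20 + INDEX["A"]])
```

Because the vector length depends only on the alphabet and `n`, not on the sequence length, classification of a new sequence reduces to one encoding step followed by a forward pass, which is consistent with the short classification time reported in the abstract.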

Year:  1992        PMID: 1304365      PMCID: PMC2142223          DOI: 10.1002/pro.5560010512

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


  12 in total

1.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

2.  Training back-propagation neural networks to define and detect DNA-binding sites.

Authors:  M C O'Neill
Journal:  Nucleic Acids Res       Date:  1991-01-25       Impact factor: 16.971

3.  K-tuple frequency analysis: from intron/exon discrimination to T-cell epitope mapping.

Authors:  J M Claverie; I Sauvaget; L Bougueleret
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

4.  Improvements in protein secondary structure prediction by an enhanced neural network.

Authors:  D G Kneller; F E Cohen; R Langridge
Journal:  J Mol Biol       Date:  1990-07-05       Impact factor: 5.469

5.  Rapid and sensitive protein similarity searches.

Authors:  D J Lipman; W R Pearson
Journal:  Science       Date:  1985-03-22       Impact factor: 47.728

6.  Searching through sequence databases.

Authors:  R F Doolittle
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

7.  Protein sequence database.

Authors:  W C Barker; D G George; L T Hunt
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

8.  A general method applicable to the search for similarities in the amino acid sequence of two proteins.

Authors:  S B Needleman; C D Wunsch
Journal:  J Mol Biol       Date:  1970-03       Impact factor: 5.469

9.  Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli.

Authors:  G D Stormo; T D Schneider; L Gold; A Ehrenfeucht
Journal:  Nucleic Acids Res       Date:  1982-05-11       Impact factor: 16.971

10.  Neural network optimization for E. coli promoter prediction.

Authors:  B Demeler; G W Zhou
Journal:  Nucleic Acids Res       Date:  1991-04-11       Impact factor: 16.971

  20 in total

1.  Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions.

Authors:  Chin-Sheng Yu; Chih-Jen Lin; Jenn-Kang Hwang
Journal:  Protein Sci       Date:  2004-05       Impact factor: 6.725

2.  Self-organizing tree-growing network for the classification of protein sequences.

Authors:  H C Wang; J Dopazo; L G de la Fraga; Y P Zhu; J M Carazo
Journal:  Protein Sci       Date:  1998-12       Impact factor: 6.725

3.  Predicting amino acid sequences of the antibody human VH chains from its first several residues.

Authors:  B A Galitsky; I M Gelfand; A E Kister
Journal:  Proc Natl Acad Sci U S A       Date:  1998-04-28       Impact factor: 11.205

4.  A novel missense-mutation-related feature extraction scheme for 'driver' mutation identification.

Authors:  Hua Tan; Jiguang Bao; Xiaobo Zhou
Journal:  Bioinformatics       Date:  2012-10-07       Impact factor: 6.937

5.  OETMAP: a new feature encoding scheme for MHC class I binding prediction.

Authors:  Murat Gök; Ahmet Turan Özcerit
Journal:  Mol Cell Biochem       Date:  2011-07-30       Impact factor: 3.396

6.  Progress in Biomedical Knowledge Discovery: A 25-year Retrospective. (Review)

Authors:  L Sacchi; J H Holmes
Journal:  Yearb Med Inform       Date:  2016-08-02

7.  The importance of physicochemical characteristics and nonlinear classifiers in determining HIV-1 protease specificity.

Authors:  Timmy Manning; Paul Walsh
Journal:  Bioengineered       Date:  2016-04-02       Impact factor: 3.269

8.  Back-propagation and counter-propagation neural networks for phylogenetic classification of ribosomal RNA sequences.

Authors:  C Wu; S Shivakumar
Journal:  Nucleic Acids Res       Date:  1994-10-11       Impact factor: 16.971

9.  Modular arrangement of proteins as inferred from analysis of homology.

Authors:  E L Sonnhammer; D Kahn
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725

10.  Self-organized neural maps of human protein sequences.

Authors:  E A Ferrán; B Pflugfelder; P Ferrara
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725
