Graph pyramids for protein function prediction.

Tushar Sandhan, Youngjun Yoo, Jin Choi, Sun Kim.   

Abstract

BACKGROUND: Uncovering the hidden organizational characteristics and regularities among biological sequences is key to a detailed understanding of the underlying biological phenomena. Pattern recognition in nucleic acid sequences is therefore important for protein function prediction. Because proteins from the same family exhibit similar characteristics, homology-based approaches predict protein function via protein classification. However, conventional classification approaches rely mostly on global features, considering only strong protein similarity matches, which leads to a significant loss of prediction accuracy.
METHODS: Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers local as well as global features by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via a graph pyramid. Applying the proposed graph-based features at different pyramid levels uncovers different underlying properties of the protein families.
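To make the idea of a similarity network with a graph pyramid on top more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes hypothetical pairwise similarity scores, a low threshold so that weak matches stay in the network, and a simple greedy contraction rule (each node merges with its strongest unmatched neighbour) to build coarser levels. The function names, the scores, and the threshold are all illustrative assumptions; the paper's graph-based features and voting scheme are not reproduced here.

```python
def build_graph(scores, threshold):
    """Build an undirected weighted graph, keeping pairs scoring >= threshold."""
    graph = {}
    for (a, b), s in scores.items():
        graph.setdefault(a, {})  # register nodes even if they end up isolated
        graph.setdefault(b, {})
        if s >= threshold:
            graph[a][b] = s
            graph[b][a] = s
    return graph


def coarsen(graph):
    """One pyramid step (assumed rule): merge each node with its strongest
    still-unmatched neighbour, then rebuild edges between merged groups."""
    parent = {}
    for node in sorted(graph):
        if node in parent:
            continue
        candidates = [(w, n) for n, w in graph[node].items() if n not in parent]
        parent[node] = node
        if candidates:
            _, best = max(candidates)
            parent[best] = node
    coarse = {p: {} for p in set(parent.values())}
    for a, neighbours in graph.items():
        for b, w in neighbours.items():
            pa, pb = parent[a], parent[b]
            if pa != pb:
                # keep the strongest link between two merged groups
                coarse[pa][pb] = max(w, coarse[pa].get(pb, 0.0))
    return coarse, parent


def build_pyramid(graph, max_levels=5):
    """Stack successively coarser graphs until nothing more can be merged."""
    levels = [graph]
    while len(levels) < max_levels:
        coarse, _ = coarsen(levels[-1])
        if len(coarse) == len(levels[-1]):
            break
        levels.append(coarse)
    return levels


# Hypothetical similarity scores between five proteins (not from the paper).
scores = {
    ("p1", "p2"): 0.9,
    ("p2", "p3"): 0.3,
    ("p3", "p4"): 0.8,
    ("p1", "p4"): 0.1,
    ("p4", "p5"): 0.25,
}
pyramid = build_pyramid(build_graph(scores, threshold=0.2))
level_sizes = [len(level) for level in pyramid]
print(level_sizes)
```

With the low threshold, weak links such as p2-p3 survive at the base level and become the only connections between merged groups higher up, which illustrates how weak matches can carry information that a strong-match-only view would discard.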
RESULTS: Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using the graph pyramid improves computational efficiency as well as protein classification accuracy. Quantitatively, among 14,086 test sequences, the proposed method misclassified only 21.1 sequences on average, whereas the baseline global feature matching method based on BLAST scores misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. It thus achieved more than 96% protein classification accuracy using only 20% of the per-class training data.
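The misclassification counts above can be converted into error rates with a short calculation. The sketch below uses only the three figures stated in the abstract; the variable and function names are illustrative. It gives roughly 0.15% error for the proposed method against roughly 2.58% for the BLAST-based baseline, about a 17-fold reduction.

```python
TOTAL_TEST_SEQUENCES = 14_086  # test set size stated in the abstract


def error_rate(misclassified, total=TOTAL_TEST_SEQUENCES):
    """Fraction of test sequences that were misclassified."""
    return misclassified / total


proposed = error_rate(21.1)    # average misclassifications, proposed method
baseline = error_rate(362.9)   # average misclassifications, BLAST baseline
reduction = baseline / proposed

print(f"proposed error rate: {proposed:.2%}")
print(f"baseline error rate: {baseline:.2%}")
print(f"error reduction factor: {reduction:.1f}x")
```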

Year:  2015        PMID: 26044522      PMCID: PMC4460595          DOI: 10.1186/1755-8794-8-S2-S12

Source DB:  PubMed          Journal:  BMC Med Genomics        ISSN: 1755-8794            Impact factor:   3.063


