
Genome wide prediction of protein function via a generic knowledge discovery approach based on evidence integration.

Jianghui Xiong, Simon Rayner, Kunyi Luo, Yinghui Li, Shanguang Chen.

Abstract

BACKGROUND: The automation of many common molecular biology techniques has resulted in the accumulation of vast quantities of experimental data. One of the major challenges now facing researchers is how to process this data to yield useful information about a biological system (e.g. knowledge of genes and their products, and the biological roles of proteins, their molecular functions, localizations and interaction networks). We present a technique called Global Mapping of Unknown Proteins (GMUP), which uses the Gene Ontology Index to relate diverse sources of experimental data by creating an abstraction layer of evidence data. This abstraction layer is used as input to a neural network which, once trained, can be used to predict function from the evidence data of unannotated proteins. The method allows us to add almost any experimental data set that is related to protein function and incorporates the Gene Ontology to our evidence data, in order to seek relationships between the different sets.
RESULTS: We have demonstrated the capabilities of this method in two ways. We first collected various experimental datasets associated with yeast (Saccharomyces cerevisiae) and applied the technique to a set of previously annotated open reading frames (ORFs). These ORFs were divided into training and test sets and were used to examine the accuracy of the predictions made by our method. We then applied GMUP to previously unannotated ORFs and made 1980, 836 and 1969 predictions corresponding to the GO Biological Process, Molecular Function and Cellular Component sub-categories, respectively. We found that GMUP was particularly successful at predicting ORFs with functions associated with the ribonucleoprotein complex, protein metabolism and transportation.
CONCLUSION: This study presents a global and generic gene knowledge discovery approach based on evidence integration of various genome-scale data. It can be used to provide insight into how certain biological processes are implemented by the interaction and coordination of proteins, which may serve as a guide for future analysis. New data can be readily incorporated as they become available to provide more reliable predictions or further insights into processes and interactions.
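
The evidence-integration idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' GMUP implementation: the protein names, GO term identifiers, and evidence sources below are hypothetical toy data, the feature encoding (fraction of a protein's partners carrying each GO term, per evidence source) is an assumed stand-in for the abstraction layer, and a single sigmoid unit is substituted for the neural network to keep the example short.

```python
import math

# Hypothetical toy data; none of these identifiers or values come from the paper.
GO_TERMS = ["GO:term_a", "GO:term_b"]

# GO annotations of already characterised proteins.
ANNOTATIONS = {
    "p1": {"GO:term_a"},
    "p2": {"GO:term_a"},
    "p3": {"GO:term_b"},
    "p4": {"GO:term_b"},
}

# Two evidence sources; each maps a protein to its partners in that data set.
EVIDENCE = {
    "interaction": {"t1": ["p1", "p2"], "t2": ["p3", "p4"], "u1": ["p1"], "u2": ["p4"]},
    "coexpression": {"t1": ["p2"], "t2": ["p3"], "u1": ["p1", "p2"], "u2": ["p3"]},
}


def evidence_vector(protein):
    """Per source and GO term: fraction of the protein's partners carrying the term."""
    vector = []
    for source in sorted(EVIDENCE):
        partners = EVIDENCE[source].get(protein, [])
        for term in GO_TERMS:
            hits = sum(1 for p in partners if term in ANNOTATIONS.get(p, set()))
            vector.append(hits / len(partners) if partners else 0.0)
    return vector


def predict(weights, bias, vector):
    """Sigmoid output of a single unit: estimated probability of the target term."""
    z = bias + sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))


def train(vectors, labels, epochs=200, rate=0.5):
    """Fit the unit by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for vector, label in zip(vectors, labels):
            error = predict(weights, bias, vector) - label
            weights = [w - rate * error * x for w, x in zip(weights, vector)]
            bias -= rate * error
    return weights, bias


# Training set: t1 carries GO:term_a (label 1), t2 does not (label 0).
TRAIN = {"t1": 1, "t2": 0}
weights, bias = train([evidence_vector(p) for p in TRAIN], list(TRAIN.values()))

# Score two unannotated proteins from their evidence alone.
for unknown in ("u1", "u2"):
    print(unknown, round(predict(weights, bias, evidence_vector(unknown)), 3))
```

Because every evidence source is reduced to the same GO-indexed representation, adding a new data set in this sketch only means adding another key to `EVIDENCE`; the training and prediction code is unchanged.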


Year:  2006        PMID: 16725034      PMCID: PMC1481625          DOI: 10.1186/1471-2105-7-268

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


