Parametric Bayesian priors and better choice of negative examples improve protein function prediction.

Noah Youngs, Duncan Penfold-Brown, Kevin Drew, Dennis Shasha, Richard Bonneau.

Abstract

MOTIVATION: Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction.
RESULTS: We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested.
AVAILABILITY: Code and data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html
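
ILLUSTRATION: The abstract does not give the form of the prior, so the following is only a rough sketch of the general idea of ranking candidate negative examples from observed annotation data. It assumes a simple smoothed co-annotation estimate of P(target | term), averaged over each protein's known terms; the function names, the `pseudocount` parameter, and the averaging rule are all hypothetical and are not taken from the paper or from GeneMANIA.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_prior(annotations, target, pseudocount=1.0):
    """Score proteins that lack `target` by how often their terms co-occur with it.

    `annotations` maps protein id -> set of annotated terms.
    This is an assumed, illustrative estimator, not the published method.
    """
    term_count = Counter()   # number of proteins annotated with each term
    pair_count = Counter()   # number of proteins annotated with both terms of (a, b)
    for terms in annotations.values():
        term_count.update(terms)
        pair_count.update(permutations(terms, 2))

    scores = {}
    for protein, terms in annotations.items():
        if target in terms or not terms:
            continue  # skip known positives and proteins with no evidence
        # Additively smoothed estimate of P(target | t) for each known term t.
        cond = [
            (pair_count[(t, target)] + pseudocount)
            / (term_count[t] + 2.0 * pseudocount)
            for t in terms
        ]
        scores[protein] = sum(cond) / len(cond)
    return scores


def select_negatives(annotations, target, k):
    """Return the k proteins with the lowest prior score for `target`."""
    scores = cooccurrence_prior(annotations, target)
    ranked = sorted(scores, key=lambda p: (scores[p], p))
    return ranked[:k]
```

Under these assumptions, proteins whose existing annotations rarely co-occur with the target function receive the lowest scores and would be the first candidates for negative examples; the same scores could in principle serve as priors for unlabeled proteins.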

Year:  2013        PMID: 23511543      PMCID: PMC3634187          DOI: 10.1093/bioinformatics/btt110

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  BUSCA: an integrative web server to predict subcellular localization of proteins.

Authors:  Castrense Savojardo; Pier Luigi Martelli; Piero Fariselli; Giuseppe Profiti; Rita Casadio
Journal:  Nucleic Acids Res       Date:  2018-07-02       Impact factor: 16.971

2.  Progress and challenges in the computational prediction of gene function using networks: 2012-2013 update.

Authors:  Paul Pavlidis; Jesse Gillis
Journal:  F1000Res       Date:  2013-10-31

3.  Negative example selection for protein function prediction: the NoGO database.

Authors:  Noah Youngs; Duncan Penfold-Brown; Richard Bonneau; Dennis Shasha
Journal:  PLoS Comput Biol       Date:  2014-06-12       Impact factor: 4.475

4.  Hierarchical ensemble methods for protein function prediction. (Review)

Authors:  Giorgio Valentini
Journal:  ISRN Bioinform       Date:  2014-05-04

5.  NoGOA: predicting noisy GO annotations using evidences and sparse representation.

Authors:  Guoxian Yu; Chang Lu; Jun Wang
Journal:  BMC Bioinformatics       Date:  2017-07-21       Impact factor: 3.169

6.  Supervised learning is an accurate method for network-based gene classification.

Authors:  Renming Liu; Christopher A Mancuso; Anna Yannakopoulos; Kayla A Johnson; Arjun Krishnan
Journal:  Bioinformatics       Date:  2020-06-01       Impact factor: 6.937

7.  High-resolution functional annotation of human transcriptome: predicting isoform functions by a novel multiple instance-based label propagation method.

Authors:  Wenyuan Li; Shuli Kang; Chun-Chi Liu; Shihua Zhang; Yi Shi; Yan Liu; Xianghong Jasmine Zhou
Journal:  Nucleic Acids Res       Date:  2013-12-25       Impact factor: 16.971

8.  Evaluating the impact of topological protein features on the negative examples selection.

Authors:  Paolo Boldi; Marco Frasca; Dario Malchiodi
Journal:  BMC Bioinformatics       Date:  2018-11-20       Impact factor: 3.169

9.  deepNF: deep network fusion for protein function prediction.

Authors:  Vladimir Gligorijevic; Meet Barot; Richard Bonneau
Journal:  Bioinformatics       Date:  2018-11-15       Impact factor: 6.937
