MOTIVATION: Computational biologists have demonstrated the utility of machine learning methods for predicting protein function from an integration of multiple genome-wide data types. Yet even the best-performing function prediction algorithms rely on heuristics for important components, such as choosing negative examples (proteins without a given function) or determining key parameters. An improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction.

RESULTS: We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested.

AVAILABILITY: Code and data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html
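To make the idea of annotation-driven negative selection concrete, the following is a minimal sketch and not the formulation used in the paper, whose details the abstract does not give. It assumes a toy dictionary of protein annotations, estimates a smoothed conditional probability P(target | term) from co-annotation counts, and averages those values over each protein's known terms to get a prior-like score. Unannotated candidates with the lowest score are taken as negatives. All names (`conditional_prior`, `protein_scores`, `choose_negatives`), the averaging rule, the `pseudocount` parameter, and the example data are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set

# Toy annotation data (hypothetical): protein -> set of annotated terms.
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"kinase", "signaling"},
    "P2": {"kinase", "signaling"},
    "P3": {"signaling"},
    "P4": {"transport"},
    "P5": {"transport", "membrane"},
    "P6": set(),  # no annotations at all
}


def conditional_prior(annotations: Dict[str, Set[str]], target: str,
                      pseudocount: float = 1.0) -> Dict[str, float]:
    """Estimate P(target | term) for every non-target term from co-annotation counts."""
    term_count: Dict[str, int] = {}
    joint_count: Dict[str, int] = {}
    for terms in annotations.values():
        for term in terms:
            if term == target:
                continue
            term_count[term] = term_count.get(term, 0) + 1
            if target in terms:
                joint_count[term] = joint_count.get(term, 0) + 1
    # Assumption: the pseudocount is added to the denominator only, which
    # shrinks estimates that rest on rarely used terms toward zero.
    return {t: joint_count.get(t, 0) / (n + pseudocount)
            for t, n in term_count.items()}


def protein_scores(annotations: Dict[str, Set[str]], target: str,
                   pseudocount: float = 1.0) -> Dict[str, float]:
    """Score each candidate protein by the mean P(target | term) over its terms."""
    cond = conditional_prior(annotations, target, pseudocount)
    scores: Dict[str, float] = {}
    for protein, terms in annotations.items():
        if target in terms:
            continue  # known positives are never negative candidates
        if not terms:
            continue  # assumption: no annotations means no evidence either way
        scores[protein] = sum(cond[t] for t in terms) / len(terms)
    return scores


def choose_negatives(annotations: Dict[str, Set[str]], target: str,
                     n_negatives: int, pseudocount: float = 1.0) -> List[str]:
    """Return the n candidates with the lowest score (ties broken by name)."""
    scores = protein_scores(annotations, target, pseudocount)
    ranked = sorted(scores, key=lambda p: (scores[p], p))
    return ranked[:n_negatives]


if __name__ == "__main__":
    print(protein_scores(ANNOTATIONS, "kinase"))
    print(choose_negatives(ANNOTATIONS, "kinase", 2))
```

In this toy data, P3 shares the "signaling" term with both known kinases and so receives a higher score than P4 and P5, which makes it a poorer choice of negative example. The same scores could in principle serve as per-protein priors passed to a downstream predictor, which is the dual role the abstract describes, although how the paper does this is not specified here.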