Random forest similarity for protein-protein interaction prediction from multiple sources.

Yanjun Qi, Judith Klein-Seetharaman, Ziv Bar-Joseph.

Abstract

One of the most important, but often ignored, parts of any clustering and classification algorithm is the computation of the similarity matrix. This is especially important when integrating high-throughput biological data sources because of the high noise rates and the many missing values. In this paper we present a new method to compute such similarities for the task of classifying pairs of proteins as interacting or not. Our method uses direct and indirect information about interaction pairs to construct a random forest (a collection of decision trees) from a training set. The resulting forest is used to determine the similarity between protein pairs, and this similarity is used by a classification algorithm (a modified kNN) to classify protein pairs. Testing the algorithm on yeast data indicates that it is able to improve coverage to 20% of interacting pairs with a false positive rate of 50%. These results compare favorably with all previously suggested methods for this task, indicating the importance of robust similarity estimates.
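The idea in the abstract (a forest-derived similarity feeding a nearest-neighbor classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the common definition of random forest proximity (the fraction of trees in which two examples land in the same leaf), uses a plain similarity-weighted kNN vote in place of the paper's modified kNN, and does not handle missing values. The toy feature vectors and all function names are hypothetical.

```python
import itertools
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, rows, rng, leaf_ids, depth=0, max_depth=4):
    """Grow one tree on the given row indices; every leaf gets a unique id."""
    labels = [y[i] for i in rows]
    best = None
    if depth < max_depth and len(set(labels)) > 1:
        n_feat = len(X[0])
        # Try a random subset of features at each node (about sqrt of the total).
        for f in rng.sample(range(n_feat), max(1, int(n_feat ** 0.5))):
            # Dropping the largest value keeps both sides of the split non-empty.
            for t in sorted({X[i][f] for i in rows})[:-1]:
                left = [i for i in rows if X[i][f] <= t]
                right = [i for i in rows if X[i][f] > t]
                score = (len(left) * gini([y[i] for i in left])
                         + len(right) * gini([y[i] for i in right]))
                if best is None or score < best[0]:
                    best = (score, f, t, left, right)
    if best is None:
        return ("leaf", next(leaf_ids))
    _, f, t, left, right = best
    return ("node", f, t,
            build_tree(X, y, left, rng, leaf_ids, depth + 1, max_depth),
            build_tree(X, y, right, rng, leaf_ids, depth + 1, max_depth))


def leaf_of(tree, x):
    """Return the id of the leaf that example x reaches."""
    while tree[0] == "node":
        _, f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree[1]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    leaf_ids = itertools.count()
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree(X, y, rows, rng, leaf_ids))
    return forest


def rf_similarity(forest, a, b):
    """Fraction of trees in which a and b fall into the same leaf."""
    return sum(leaf_of(t, a) == leaf_of(t, b) for t in forest) / len(forest)


def knn_predict(forest, X, y, query, k=3):
    """Similarity-weighted vote among the k most similar training examples."""
    scored = sorted(((rf_similarity(forest, query, x), lab)
                     for x, lab in zip(X, y)), reverse=True)[:k]
    votes = Counter()
    for sim, lab in scored:
        votes[lab] += sim
    return votes.most_common(1)[0][0]


# Hypothetical toy data: one row per protein pair, label 1 = interacting.
X = [[0.9, 1, 0.8], [0.8, 1, 0.7], [0.85, 1, 0.9], [0.7, 1, 0.6],
     [0.1, 0, 0.2], [0.2, 0, 0.1], [0.15, 0, 0.3], [0.3, 0, 0.2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(X, y)
print(rf_similarity(forest, X[0], X[4]))
print(knn_predict(forest, X, y, [0.88, 1, 0.85]))
```

Because the similarity is read off the leaves rather than computed from raw feature distances, features the trees never split on do not influence it, which is one plausible reason such a measure could be robust to noisy inputs.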

Year:  2005        PMID: 15759657

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  46 in total

1.  Evaluation of different biological data and computational classification methods for use in protein interaction prediction.

Authors:  Yanjun Qi; Ziv Bar-Joseph; Judith Klein-Seetharaman
Journal:  Proteins       Date:  2006-05-15

2.  LTHREADER: prediction of extracellular ligand-receptor interactions in cytokines using localized threading.

Authors:  Vinay Pulim; Jadwiga Bienkowska; Bonnie Berger
Journal:  Protein Sci       Date:  2007-12-20       Impact factor: 6.725

3.  Using random forest for reliable classification and cost-sensitive learning for medical diagnosis.

Authors:  Fan Yang; Hua-zhen Wang; Hong Mi; Cheng-de Lin; Wei-wen Cai
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169

4.  Role of bacterial peptidase F inferred by statistical analysis and further experimental validation.

Authors:  Liliana Lopez Kleine; Véronique Monnet; Christine Pechoux; Alain Trubuil
Journal:  HFSP J       Date:  2008-01-07

5.  A Comparison Study of Algorithms to Detect Drug-Adverse Event Associations: Frequentist, Bayesian, and Machine-Learning Approaches.

Authors:  Minh Pham; Feng Cheng; Kandethody Ramachandran
Journal:  Drug Saf       Date:  2019-06       Impact factor: 5.606

6.  Parametric Bayesian priors and better choice of negative examples improve protein function prediction.

Authors:  Noah Youngs; Duncan Penfold-Brown; Kevin Drew; Dennis Shasha; Richard Bonneau
Journal:  Bioinformatics       Date:  2013-03-19       Impact factor: 6.937

7.  stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets. (Review)

Authors:  Askar Obulkasim; Mark A van de Wiel
Journal:  Cancer Inform       Date:  2014-01-02

8.  Large-scale prediction of protein-protein interactions from structures.

Authors:  Martial Hue; Michael Riffle; Jean-Philippe Vert; William S Noble
Journal:  BMC Bioinformatics       Date:  2010-03-18       Impact factor: 3.169

9.  Semi-supervised multi-task learning for predicting interactions between HIV-1 and human proteins.

Authors:  Yanjun Qi; Oznur Tastan; Jaime G Carbonell; Judith Klein-Seetharaman; Jason Weston
Journal:  Bioinformatics       Date:  2010-09-15       Impact factor: 6.937

10.  GAIA: a gram-based interaction analysis tool--an approach for identifying interacting domains in yeast.

Authors:  Kelvin X Zhang; B F Francis Ouellette
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169
