Random forest similarity for protein-protein interaction prediction from multiple sources.

Yanjun Qi, Judith Klein-Seetharaman, Ziv Bar-Joseph.

Abstract

One of the most important, but often ignored, parts of any clustering and classification algorithm is the computation of the similarity matrix. This is especially important when integrating high-throughput biological data sources because of their high noise rates and many missing values. In this paper we present a new method to compute such similarities for the task of classifying pairs of proteins as interacting or not. Our method uses direct and indirect information about interaction pairs to construct a random forest (a collection of decision trees) from a training set. The resulting forest is used to determine the similarity between protein pairs, and this similarity is used by a classification algorithm (a modified kNN) to classify protein pairs. Testing the algorithm on yeast data indicates that it is able to improve coverage to 20% of interacting pairs with a false positive rate of 50%. These results compare favorably with all previously suggested methods for this task, indicating the importance of robust similarity estimates.
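The pipeline described in the abstract (train a forest, derive a similarity between protein pairs from it, classify with kNN) can be illustrated with a small sketch. The abstract does not define the similarity or the kNN modification, so this sketch makes assumptions: similarity is taken to be the commonly used forest proximity (the fraction of trees in which two items reach the same leaf), and the kNN step is a plain similarity-weighted vote. The feature vectors, the function names (`grow`, `fit_forest`, `similarity`, `knn_predict`), and the toy data are hypothetical, and the tree learner is a deliberately simplified stand-in rather than the authors' implementation.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def grow(X, y, rows, rng, n_try, min_leaf=1):
    """Grow one tree on the given row indices. A leaf is None; an internal
    node is (feature, threshold, left_subtree, right_subtree)."""
    if len(rows) <= min_leaf or len({y[i] for i in rows}) == 1:
        return None
    best = None
    # Consider only a random subset of features at each split.
    for f in rng.sample(range(len(X[0])), n_try):
        values = sorted({X[i][f] for i in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [i for i in rows if X[i][f] <= t]
            right = [i for i in rows if X[i][f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # every sampled feature was constant on these rows
        return None
    _, f, t, left, right = best
    return (f, t,
            grow(X, y, left, rng, n_try, min_leaf),
            grow(X, y, right, rng, n_try, min_leaf))

def fit_forest(X, y, n_trees=50, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    n_try = max(1, int(d ** 0.5))
    return [grow(X, y, [rng.randrange(n) for _ in range(n)], rng, n_try)
            for _ in range(n_trees)]

def leaf_path(tree, x):
    """Sequence of left/right decisions that identifies the leaf x reaches."""
    path = []
    while tree is not None:
        f, t, left, right = tree
        go_left = x[f] <= t
        path.append(go_left)
        tree = left if go_left else right
    return tuple(path)

def similarity(forest, a, b):
    """Assumed definition: fraction of trees where a and b share a leaf."""
    same = sum(leaf_path(t, a) == leaf_path(t, b) for t in forest)
    return same / len(forest)

def knn_predict(forest, X, y, query, k=3):
    """Similarity-weighted vote among the k most similar training items."""
    nearest = sorted(((similarity(forest, query, x), label)
                      for x, label in zip(X, y)), reverse=True)[:k]
    pos = sum(s for s, label in nearest if label == 1)
    neg = sum(s for s, label in nearest if label == 0)
    return 1 if pos > neg else 0

# Hypothetical features for six protein pairs (1 = interacting, 0 = not).
X = [[0.90, 0.80, 1.0], [0.80, 0.90, 1.0], [0.85, 0.70, 1.0],
     [0.10, 0.20, 0.0], [0.20, 0.10, 0.0], [0.15, 0.30, 0.0]]
y = [1, 1, 1, 0, 0, 0]

forest = fit_forest(X, y, n_trees=50, seed=0)
print(similarity(forest, X[0], X[1]), similarity(forest, X[0], X[3]))
print(knn_predict(forest, X, y, [0.88, 0.75, 1.0]))
```

Because the similarity depends only on which leaves two items share, it is insensitive to the scale of individual features, which is one plausible reason such a measure suits heterogeneous data sources. Handling of missing values, which the abstract highlights, is not covered by this sketch.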

Substances: Proteins

Year: 2005 PMID: 15759657

Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928

Cited

46 in total

1. Evaluation of different biological data and computational classification methods for use in protein interaction prediction.

Authors: Yanjun Qi; Ziv Bar-Joseph; Judith Klein-Seetharaman
Journal: Proteins Date: 2006-05-15

2. LTHREADER: prediction of extracellular ligand-receptor interactions in cytokines using localized threading.

Authors: Vinay Pulim; Jadwiga Bienkowska; Bonnie Berger
Journal: Protein Sci Date: 2007-12-20 Impact factor: 6.725

3. Using random forest for reliable classification and cost-sensitive learning for medical diagnosis.

Authors: Fan Yang; Hua-zhen Wang; Hong Mi; Cheng-de Lin; Wei-wen Cai
Journal: BMC Bioinformatics Date: 2009-01-30 Impact factor: 3.169

4. Role of bacterial peptidase F inferred by statistical analysis and further experimental validation.

Authors: Liliana Lopez Kleine; Véronique Monnet; Christine Pechoux; Alain Trubuil
Journal: HFSP J Date: 2008-01-07

5. A Comparison Study of Algorithms to Detect Drug-Adverse Event Associations: Frequentist, Bayesian, and Machine-Learning Approaches.

Authors: Minh Pham; Feng Cheng; Kandethody Ramachandran
Journal: Drug Saf Date: 2019-06 Impact factor: 5.606

6. Parametric Bayesian priors and better choice of negative examples improve protein function prediction.

7. stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets. (Review)

8. Large-scale prediction of protein-protein interactions from structures.

9. Semi-supervised multi-task learning for predicting interactions between HIV-1 and human proteins.

10. GAIA: a gram-based interaction analysis tool--an approach for identifying interacting domains in yeast.