Diffusion kernel-based logistic regression models for protein function prediction.

Hyunju Lee, Zhidong Tu, Minghua Deng, Fengzhu Sun, Ting Chen.

Abstract

Assigning functions to unknown proteins is one of the most important problems in proteomics. Several approaches have used protein-protein interaction data to predict protein functions. We previously developed a Markov random field (MRF) based method to infer a protein's functions using protein-protein interaction data and the functional annotations of its interaction partners. The original model considered only direct interactions and treated each function separately. In this study, we developed a new model that extends direct interactions to all neighboring proteins, and a single function to multiple functions. The goal is to understand a protein's function based on information from all of its neighboring proteins in the interaction network. First, we developed a novel kernel logistic regression (KLR) method based on diffusion kernels for protein interaction networks. The diffusion kernels provide a means to incorporate all neighbors of a protein in the network. Second, we used the chi-square test to identify a set of functions that are highly correlated with the function of interest, referred to as the correlated functions. Third, we incorporated the correlated functions into the new KLR model. Fourth, we extended the model to incorporate multiple biological data sources, such as protein domains, protein complexes, and gene expression, by converting them into networks. We showed that the KLR approach of incorporating all protein neighbors significantly improved the accuracy of protein function predictions over the MRF model. The incorporation of multiple data sets also improved prediction accuracy. The prediction accuracy is comparable to that of another protein function classifier, based on the support vector machine (SVM) with a diffusion kernel. The advantages of the KLR model include its simplicity and its ability to explore the contribution of neighbors to the functions of proteins of interest.
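The core idea of the abstract, a diffusion kernel feeding a logistic regression, can be illustrated with a minimal sketch. The abstract does not give the model's formula, so everything specific here is an assumption made for illustration: the kernel form K = exp(-tau * L) with L the graph Laplacian, the value of tau, the two kernel-weighted features (evidence from labelled proteins that have or lack the function), the toy 8-protein network, and all function names. It is not the authors' implementation and omits the correlated functions and the additional data sources.

```python
import math

def diffusion_kernel(adj, tau=1.0, terms=40):
    """Approximate K = exp(-tau * L), with L = D - A, by a truncated Taylor series."""
    n = len(adj)
    # M = -tau * L: off-diagonal entries tau * A[i][j], diagonal -tau * degree(i).
    M = [[tau * adj[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        M[i][i] = -tau * sum(adj[i])
    K = [[float(i == j) for j in range(n)] for i in range(n)]  # running sum, starts at I
    term = [row[:] for row in K]                                # holds M^k / k!
    for k in range(1, terms):
        term = [[sum(term[i][m] * M[m][j] for m in range(n)) / k
                 for j in range(n)] for i in range(n)]
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K

def kernel_features(K, labels, i):
    """Kernel-weighted evidence from labelled proteins j != i (assumed feature design)."""
    has = sum(K[i][j] * x for j, x in labels.items() if j != i)
    lacks = sum(K[i][j] * (1 - x) for j, x in labels.items() if j != i)
    return [has, lacks]

def predict(w, f):
    """Logistic probability; w[0] is the intercept."""
    z = w[0] + sum(wk * fk for wk, fk in zip(w[1:], f))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, targets, lr=0.5, steps=2000):
    """Plain gradient ascent on the logistic log-likelihood."""
    w = [0.0] * (len(features[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for f, y in zip(features, targets):
            p = predict(w, f)
            grad[0] += y - p
            for k, fk in enumerate(f, start=1):
                grad[k] += (y - p) * fk
        w = [wk + lr * g / len(targets) for wk, g in zip(w, grad)]
    return w

# Hypothetical toy network: two triangles joined by the edge (2, 3),
# plus two unannotated proteins, 6 (attached to 0 and 1) and 7 (attached to 4 and 5).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5),
         (2, 3), (6, 0), (6, 1), (7, 4), (7, 5)]
n = 8
adj = [[0] * n for _ in range(n)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

# 1 = protein has the function of interest, 0 = it does not; 6 and 7 are unknown.
labels = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}

K = diffusion_kernel(adj, tau=1.0)
train_X = [kernel_features(K, labels, i) for i in labels]
train_y = [labels[i] for i in labels]
weights = fit_logistic(train_X, train_y)
probs = {i: predict(weights, kernel_features(K, labels, i)) for i in (6, 7)}
print(probs)
```

Because the kernel assigns a nonzero weight to every pair of connected proteins, each prediction draws on all labelled proteins in the network rather than only direct interaction partners, which is the extension over the direct-neighbor model that the abstract describes. In this sketch, protein 6 sits in the annotated cluster and receives the higher predicted probability.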

Year: 2006    PMID: 16584317    DOI: 10.1089/omi.2006.10.40

Source DB: PubMed    Journal: OMICS    ISSN: 1536-2310

