Fast protein classification with multiple networks.

Koji Tsuda, HyunJung Shin, Bernhard Schölkopf.

Abstract

MOTIVATION: Support vector machines (SVMs) have been successfully used to classify proteins into functional categories. Recently, to integrate multiple data sources, a semidefinite programming (SDP) based SVM method was introduced. In SDP/SVM, multiple kernel matrices, one for each data source, are combined with weights obtained by solving an SDP. However, when trying to apply SDP/SVM to large problems, the computational cost can become prohibitive, since both converting the data to a kernel matrix for the SVM and solving the SDP are demanding in time and memory. Another application-specific drawback arises when some of the data sources are protein networks. A common method of converting the network to a kernel matrix is the diffusion kernel method, which has time complexity O(n^3) and produces a dense matrix of size n × n.
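To make the last point concrete, here is a minimal sketch of a diffusion kernel on a toy network, assuming the common form K = exp(-βL) with L the unnormalized graph Laplacian. The 4-node path graph and the value β = 0.5 are illustrative assumptions, not data or settings from the paper.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical toy network: a path graph 0-1-2-3 (3 undirected edges).
adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)

# Unnormalized graph Laplacian L = D - A.
lap = np.diag(adj.sum(axis=1)) - adj

# Diffusion kernel K = exp(-beta * L); beta is an arbitrary illustrative value.
beta = 0.5
kernel = expm(-beta * lap)

# The adjacency matrix is sparse, but the kernel of a connected graph is dense.
nonzero_edges = int(np.count_nonzero(adj))
nonzero_kernel = int(np.count_nonzero(kernel))
```

The dense matrix exponential is what drives the cubic time and quadratic memory cost: even though only 6 of the 16 adjacency entries are non-zero here, every entry of the kernel is non-zero.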
RESULTS: We propose an efficient method of protein classification using multiple protein networks. Available protein networks, such as a physical interaction network or a metabolic network, can be directly incorporated. Vectorial data can also be incorporated after conversion into a network by means of neighbor point connection. Similar to the SDP/SVM method, the combination weights are obtained by convex optimization. Due to the sparsity of network edges, the computation time is nearly linear in the number of edges of the combined network. Additionally, the combination weights provide information useful for discarding noisy or irrelevant networks. Experiments on function prediction of 3588 yeast proteins show promising results: the computation time is enormously reduced, while the accuracy is still comparable to the SDP/SVM method.
AVAILABILITY: Software and data will be available on request.
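The idea in RESULTS of working directly on sparse networks can be sketched as follows. This is not the authors' implementation: it assumes a standard graph-regularized label propagation that solves (I + c Σ_k w_k L_k) f = y with conjugate gradient, and it takes the combination weights as given rather than reproducing the convex optimization that the paper uses to obtain them. The function names, the toy data, and the parameters `k`, `c`, and `weights` are all hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg


def laplacian(adj):
    """Unnormalized Laplacian L = D - A of a sparse symmetric adjacency matrix."""
    degrees = np.asarray(adj.sum(axis=1)).ravel()
    return sp.diags(degrees) - adj


def knn_graph(points, k):
    """Convert vectorial data into a symmetric k-nearest-neighbor network."""
    n = points.shape[0]
    # Dense pairwise distances keep the sketch short; suitable only for small n.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    adj = sp.csr_matrix((np.ones(n * k), (rows, neighbors.ravel())), shape=(n, n))
    # Connect i and j if either is among the other's k nearest neighbors.
    return adj.maximum(adj.T)


def propagate_labels(adjacencies, weights, y, c=1.0):
    """Solve (I + c * sum_k w_k L_k) f = y on the combined sparse network."""
    n = y.shape[0]
    system = sp.identity(n, format="csr")
    for w, adj in zip(weights, adjacencies):
        system = system + c * w * laplacian(adj)
    f, info = cg(system, y)
    if info != 0:
        raise RuntimeError("conjugate gradient did not converge")
    return f


# Toy example with 6 hypothetical proteins in two groups: {0, 1, 2} and {3, 4, 5}.
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]
r, s = zip(*edges)
interaction = sp.csr_matrix((np.ones(len(edges)), (r, s)), shape=(6, 6))
interaction = interaction.maximum(interaction.T)

features = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
)
similarity = knn_graph(features, k=2)

# +1 / -1 for labeled proteins, 0 for unlabeled ones.
y = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])

# Fixed illustrative weights; the paper learns them by convex optimization.
f = propagate_labels([interaction, similarity], weights=[0.5, 0.5], y=y)
predicted = np.sign(f)
```

Each conjugate gradient iteration costs one sparse matrix-vector product, so its cost scales with the number of edges in the combined network rather than with n². In the toy example, the unlabeled proteins take the sign of the labeled protein in their own group.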


Year:  2005        PMID: 16204126     DOI: 10.1093/bioinformatics/bti1110

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


