
PCfun: a hybrid computational framework for systematic characterization of protein complex function.

Varun S Sharma, Andrea Fossati, Rodolfo Ciuffa, Marija Buljan, Evan G Williams, Zhen Chen, Wenguang Shao, Patrick G A Pedrioli, Anthony W Purcell, María Rodríguez Martínez, Jiangning Song, Matteo Manica, Ruedi Aebersold, Chen Li.

Abstract

In molecular biology, it is generally assumed that the ensemble of expressed molecules, together with their activities and interactions, determines biological function, cellular states and phenotypes. Stable protein complexes, or macromolecular machines, are in turn the key functional entities mediating and modulating most biological processes. Although identifying protein complexes and their subunit composition can now be done inexpensively and at scale, determining their function remains challenging and labor intensive. This study describes Protein Complex Function predictor (PCfun), the first computational framework for the systematic annotation of protein complex functions using Gene Ontology (GO) terms. PCfun is built upon a word embedding learned with natural language processing techniques from 1 million open access PubMed Central articles. Specifically, PCfun combines two approaches for identifying protein complex function: (i) an unsupervised approach that retrieves the nearest neighbor (NN) GO term word vectors for a protein complex query vector and (ii) a supervised approach using Random Forest (RF) models trained specifically to recover the GO terms of protein complex queries described in the CORUM protein complex database. PCfun consolidates both approaches by performing a hypergeometric statistical test for enrichment of the top NN GO terms among the child terms of the GO terms predicted by the RF models. The documentation and implementation of the PCfun package are available at https://github.com/sharmavaruns/PCfun. We anticipate that PCfun will serve as a useful tool and novel paradigm for the large-scale characterization of protein complex function.
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
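
The two steps named in the abstract, nearest-neighbor lookup in an embedding space and a hypergeometric enrichment test, can be illustrated with a minimal sketch. This is not the PCfun implementation or its API: the function names, the toy two-dimensional vectors, the placeholder term labels ("GO:A" to "GO:D") and the choice of cosine similarity are all assumptions made for illustration. The abstract does not say how the complex query vector is built or which similarity measure is used, so the query vector is simply taken as given here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def nearest_go_terms(query_vec, go_vectors, k):
    """Return the k term labels whose vectors are most similar to the query."""
    ranked = sorted(
        go_vectors,
        key=lambda term: cosine(query_vec, go_vectors[term]),
        reverse=True,
    )
    return ranked[:k]

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population of size `population` containing `successes` marked items."""
    upper = min(successes, draws)
    total = math.comb(population, draws)
    tail = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total

# Toy embedding: hypothetical term labels and vectors, not real GO data.
go_vectors = {
    "GO:A": [1.0, 0.0],
    "GO:B": [0.9, 0.1],
    "GO:C": [0.0, 1.0],
    "GO:D": [-1.0, 0.0],
}
query_vec = [1.0, 0.05]  # stand-in for a protein complex query vector

# Unsupervised step: top nearest-neighbor terms for the query.
top_nn = nearest_go_terms(query_vec, go_vectors, k=2)

# Stand-in for the child terms of one term predicted by a supervised model.
child_terms = {"GO:A", "GO:B"}

# Enrichment: how surprising is the overlap between top_nn and child_terms?
overlap = len(set(top_nn) & child_terms)
p_value = hypergeom_sf(
    overlap,
    population=len(go_vectors),
    successes=len(child_terms),
    draws=len(top_nn),
)
print(top_nn, overlap, p_value)
```

In this toy case both nearest neighbors fall among the child terms, and the test asks how likely an overlap at least that large would be if the two terms had been drawn at random from the four available. With a real ontology the population would be the full set of candidate GO terms, and a correction for multiple testing would typically be applied across the predicted terms.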

Keywords:  gene ontology; machine learning; natural language processing; protein complex function

Year: 2022; PMID: 35724564; PMCID: PMC9310514; DOI: 10.1093/bib/bbac239

Source DB: PubMed; Journal: Brief Bioinform; ISSN: 1467-5463; Impact factor: 13.994


