
Bias tradeoffs in the creation and analysis of protein-protein interaction networks.

Jesse Gillis, Sara Ballouz, Paul Pavlidis.

Abstract

Networks constructed from aggregated protein-protein interaction data are commonplace in biology. However, the studies from which these data are derived were each conducted with their own hypotheses and foci. Focusing on data from budding yeast present in BioGRID, we determine that many of the downstream signals present in network data are significantly impacted by biases in the original data. We determine the degree to which selection bias in favor of biologically interesting bait proteins goes down with study size, and we find that promiscuity in prey contributes more substantially in larger studies. We analyze interaction studies over time with respect to data in the Gene Ontology and find that reproducibly observed interactions are less likely to favor multifunctional proteins. We find that strong alignment between co-expression and protein-protein interaction data occurs only for extreme co-expression values, and we use these data to suggest candidates for targets likely to reveal novel biology in follow-up studies.

BIOLOGICAL SIGNIFICANCE: Protein-protein interaction data find particularly heavy use in the interpretation of disease-causal variants. In principle, network data allow researchers to find novel commonalities among candidate genes. In this study, we detail several of the most salient biases contributing to aggregated protein-protein interaction databases. We find strong evidence for the role of selection and laboratory biases. Many of these effects contribute to the commonalities researchers find for disease genes. For the characterization of disease genes and their interactions to be more than an artifact of researcher preference, it is imperative to identify data biases explicitly. Based on this, we also suggest ways to move forward in producing candidates less influenced by prior knowledge.

This article is part of a Special Issue entitled: Can Proteomics Fill the Gap Between Genomics and Phenotypes?
Copyright © 2014 Elsevier B.V. All rights reserved.

Keywords:  Bias; Co-expression; Gene Ontology; Multifunctionality; Networks; Protein–protein interaction

Year:  2014
PMID:  24480284
PMCID:  PMC3972268
DOI:  10.1016/j.jprot.2014.01.020
Source DB:  PubMed
Journal:  J Proteomics
ISSN:  1874-3919
Impact factor:  4.044

