Denise Scholtens1, Tony Chiang, Wolfgang Huber, Robert Gentleman. 1. Department of Preventive Medicine, Northwestern University Medical School, 680 N. Lake Shore Drive Suite 1102, Chicago, IL 60611-4402, USA. dscholtens@northwestern.edu
Abstract
MOTIVATION: Proteins work together to drive biological processes in cellular machines. Summarizing global and local properties of the set of protein interactions, the interactome, is necessary for describing cellular systems. We consider a relatively simple per-protein feature of the interactome: the number of interaction partners for a protein, which in graph terminology is the degree of the protein. RESULTS: Using data subject to both stochastic and systematic sources of false positive and false negative observations, we develop an explicit probability model and resultant likelihood method to estimate node degree on portions of the interactome assayed by bait-prey technologies. This approach yields substantial improvement in degree estimation over the current practice that naively sums observed edges. Accurate modeling of observed data in relation to true but unknown parameters of interest gives a formal point of reference from which to draw conclusions about the system under study. AVAILABILITY: All analyses discussed in this text can be performed using the ppiStats and ppiData packages available through the Bioconductor project (http://www.bioconductor.org).
MOTIVATION: Proteins work together to drive biological processes in cellular machines. Summarizing global and local properties of the set of protein interactions, the interactome, is necessary for describing cellular systems. We consider a relatively simple per-protein feature of the interactome: the number of interaction partners for a protein, which in graph terminology is the degree of the protein. RESULTS: Using data subject to both stochastic and systematic sources of false positive and false negative observations, we develop an explicit probability model and resultant likelihood method to estimate node degree on portions of the interactome assayed by bait-prey technologies. This approach yields substantial improvement in degree estimation over the current practice that naively sums observed edges. Accurate modeling of observed data in relation to true but unknown parameters of interest gives a formal point of reference from which to draw conclusions about the system under study. AVAILABILITY: All analyses discussed in this text can be performed using the ppiStats and ppiData packages available through the Bioconductor project (http://www.bioconductor.org).