Adrian E Raftery, Xiaoyue Niu, Peter D Hoff, Ka Yee Yeung.
Abstract
Network models are widely used in the social sciences and genome sciences. The latent space model proposed by Hoff et al. (2002), and extended by Handcock et al. (2007) to incorporate clustering, provides a visually interpretable, model-based spatial representation of relational data and accounts for several intrinsic network properties. Because of the structure of the likelihood function of the latent space model, the computational cost is of order O(N^2), where N is the number of nodes, which makes it infeasible for large networks. In this paper, we propose an approximation of the log-likelihood function. We adopt the case-control idea from epidemiology and construct a case-control likelihood that is an unbiased estimator of the full likelihood. Replacing the full likelihood with the case-control likelihood in the MCMC estimation of the latent space model reduces the computational time from O(N^2) to O(N), making it feasible for large networks. We evaluate its performance using simulated and real data. We fit the model to a large protein-protein interaction dataset using the case-control likelihood and use the model-fitted link probabilities to identify false positive links.
Keywords: Markov chain Monte Carlo; clustering; genome science; graph; protein-protein interaction; social science
Year: 2012 PMID: 27570438 PMCID: PMC5001802 DOI: 10.1080/10618600.2012.679240
Source DB: PubMed Journal: J Comput Graph Stat ISSN: 1061-8600 Impact factor: 2.302
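The case-control idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the distance form of the latent space model, logit P(y_ij = 1) = beta - ||z_i - z_j||, treats dyads as directed, and uses simple uniform sampling of non-links (controls) for each node. The function names `pair_loglik`, `full_loglik`, and `case_control_loglik`, and the parameter `n_controls`, are hypothetical. All links (cases) are kept, and the sampled non-link terms are reweighted by the ratio of total to sampled non-links, so the estimate of the log-likelihood is unbiased under these assumptions.

```python
import numpy as np


def pair_loglik(beta, z, i, js, y):
    """Log-likelihood terms for dyads (i, j), j in js.

    Assumed model: logit P(y_ij = 1) = beta - ||z_i - z_j||.
    `y` may be a scalar (0 or 1) or an array aligned with `js`.
    """
    d = np.linalg.norm(z[js] - z[i], axis=1)
    eta = beta - d
    # y * eta - log(1 + exp(eta)), computed stably.
    return y * eta - np.logaddexp(0.0, eta)


def full_loglik(beta, z, Y):
    """Exact log-likelihood: one term per ordered dyad, O(N^2) terms."""
    n = Y.shape[0]
    total = 0.0
    for i in range(n):
        js = np.delete(np.arange(n), i)
        total += pair_loglik(beta, z, i, js, Y[i, js]).sum()
    return total


def case_control_loglik(beta, z, Y, n_controls, rng):
    """Case-control estimate: all links plus a reweighted sample of non-links.

    A dense adjacency matrix is used for clarity, so locating the non-links
    here still scans each row. A practical version would work from an edge
    list and could draw the control sample once, before MCMC.
    """
    n = Y.shape[0]
    total = 0.0
    for i in range(n):
        others = np.delete(np.arange(n), i)
        cases = others[Y[i, others] == 1]
        zeros = others[Y[i, others] == 0]
        # Every link (case) contributes its exact term.
        total += pair_loglik(beta, z, i, cases, 1).sum()
        if zeros.size > 0:
            m = min(n_controls, zeros.size)
            controls = rng.choice(zeros, size=m, replace=False)
            # Scale the sampled non-link terms up to the full non-link count.
            weight = zeros.size / m
            total += weight * pair_loglik(beta, z, i, controls, 0).sum()
    return total
```

With a fixed `n_controls` and a sparse network, the number of likelihood terms evaluated per node no longer grows with N, which is the source of the O(N) cost described above. When `n_controls` is at least N - 1, every non-link is used and the estimate coincides with the full log-likelihood.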