Dan Jackson, Ian R White, James Carpenter.
Abstract
In statistical modelling, it is often important to know how much parameter estimates are influenced by particular observations. An attractive approach is to re-estimate the parameters with each observation deleted in turn, but this is computationally demanding when fitting models by using Markov chain Monte Carlo (MCMC), as obtaining complete sample estimates is often in itself a very time-consuming task. Here we propose two efficient ways to approximate the case-deleted estimates by using output from MCMC estimation. Our first proposal, which directly approximates the usual influence statistics in maximum likelihood analyses of generalised linear models (GLMs), is easy to implement and avoids any further evaluation of the likelihood. Hence, unlike the existing alternatives, it does not become more computationally intensive as the model complexity increases. Our second proposal, which utilises model perturbations, also has this advantage and does not require the form of the GLM to be specified. We show how our two proposed methods are related and evaluate them against the existing method of importance sampling and case deletion in a logistic regression analysis with missing covariates. We also provide practical advice for those implementing our procedures, so that they may be used in many situations where MCMC is used to fit statistical models.
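As an illustration of the importance sampling comparator mentioned above, the following is a minimal sketch in its standard textbook form, not necessarily the implementation used in the paper. It assumes a logistic regression with fully observed covariates and conditionally independent observations, so that the case-deleted posterior is proportional to the full posterior divided by the likelihood contribution of the deleted observation. The function name `case_deleted_means` and its arguments are hypothetical.

```python
import numpy as np

def case_deleted_means(draws, X, y):
    """Approximate case-deleted posterior means by importance sampling.

    draws : (S, p) posterior draws of the coefficients from the full-data fit
    X     : (n, p) design matrix (assumed fully observed in this sketch)
    y     : (n,) binary responses in {0, 1}
    Returns an (n, p) array; row i approximates the posterior mean
    with observation i deleted.
    """
    draws = np.asarray(draws, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    eta = draws @ X.T                              # (S, n) linear predictors
    # Bernoulli log-likelihood of each observation under each draw
    loglik = y * eta - np.logaddexp(0.0, eta)      # (S, n)

    # Importance weights are proportional to 1 / p(y_i | x_i, beta_s)
    logw = -loglik
    logw -= logw.max(axis=0, keepdims=True)        # stabilise before exponentiating
    w = np.exp(logw)
    w /= w.sum(axis=0, keepdims=True)              # normalise per observation

    return w.T @ draws                             # (n, p) weighted means
```

An unstandardised influence of observation `i` on coefficient `j` could then be taken as `draws.mean(axis=0)[j] - case_deleted_means(draws, X, y)[i, j]`. Unlike the two proposals in the abstract, this approach evaluates the likelihood once per draw and observation.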
Year: 2011 PMID: 21905065 PMCID: PMC3500673 DOI: 10.1002/sim.4356
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373
Some results from the simulation study: c1 and m1 denote the intercepts and gradients of the least squares regression lines of influence statistics from our first proposal (Section 3.1) on the frequentist influence statistics; ρ1 denotes the correlation between these influence statistics. c2, m2 and ρ2 denote these same quantities for influences obtained using perturbations.
| Parameter | c1 | m1 | ρ1 | c2 | m2 | ρ2 |
|---|---|---|---|---|---|---|
| | 0.0000 | 0.9832 | 0.99979 | 0.0002 | 0.9761 | 0.99978 |
| | 0.0000 | 0.9648 | 0.99964 | 0.0000 | 0.9578 | 0.99963 |
| | 0.0000 | 0.9726 | 0.99962 | −0.0001 | 0.9655 | 0.99957 |
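The three summaries described in the caption (intercept, gradient and correlation) can be computed from two vectors of influence statistics as in the sketch below. The function name `compare_influences` and the example values are hypothetical and serve only to show the calculation; they are not data from the study.

```python
import numpy as np

def compare_influences(reference, approx):
    """Regress approximate influence statistics on reference ones.

    Returns (c, m, rho): the least squares intercept and gradient of
    `approx` on `reference`, and their Pearson correlation.
    """
    reference = np.asarray(reference, dtype=float)
    approx = np.asarray(approx, dtype=float)
    # polyfit with deg=1 returns the gradient first, then the intercept
    m, c = np.polyfit(reference, approx, deg=1)
    rho = np.corrcoef(reference, approx)[0, 1]
    return c, m, rho

# Made-up values for illustration only
reference = np.array([-1.0, 0.0, 1.0, 2.0])
approx = 0.5 + 2.0 * reference
c, m, rho = compare_influences(reference, approx)
```

An intercept near 0 together with a gradient and correlation near 1 would indicate that the approximate influences closely track the reference ones.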
Figure 1. Standardised influence statistics for the first 20 observations. The top two panels show influences for β2 and β3, whilst the bottom two panels show these for β4 and β5. Solid points show the influence statistics obtained by case deletion; hollow circles show the corresponding estimates obtained by importance sampling; triangles show these using our first proposal (Section 3.1); and diamonds show influences obtained using perturbations.