Ardo van den Hout, Ulf Böckenholt, Peter G M van der Heijden.
Abstract
Randomized response is a misclassification design to estimate the prevalence of sensitive behaviour. Respondents who do not follow the instructions of the design are considered to be cheating. A mixture model is proposed to estimate the prevalence of sensitive behaviour and cheating in the case of a dual sampling scheme with direct questioning and randomized response. The mixing weight is the probability of cheating, where cheating is modelled separately for direct questioning and randomized response. For Bayesian inference, Markov chain Monte Carlo sampling is applied to sample parameter values from the posterior. The model makes it possible to analyse dual sample scheme data in a unified way and to assess cheating for direct questions as well as for randomized response questions. The research is illustrated with randomized response data concerning violations of regulations for social benefit.
Year: 2010 PMID: 21461334 PMCID: PMC3065643 DOI: 10.1111/j.1467-9876.2010.00720.x
Source DB: PubMed Journal: J R Stat Soc Ser C Appl Stat ISSN: 0035-9254 Impact factor: 1.864
Distribution of the answers to the two statements in the RR design and in the DQ design
| Design | Statement 1 (Reason): −2 | Statement 1 (Reason): −1 | Statement 1 (Reason): 0 | Statement 1 (Reason): 1 | Statement 1 (Reason): 2 | Statement 2 (Advantage): −2 | Statement 2 (Advantage): −1 | Statement 2 (Advantage): 0 | Statement 2 (Advantage): 1 | Statement 2 (Advantage): 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| RR | 0.18 | 0.46 | 0.25 | 0.08 | 0.03 | 0.18 | 0.35 | 0.25 | 0.16 | 0.06 |
| DQ | 0.12 | 0.44 | 0.24 | 0.15 | 0.05 | 0.16 | 0.36 | 0.31 | 0.12 | 0.05 |
Information criteria for the maximum likelihood estimation: minus two times the log-likelihood (−2LL) and the Bayesian information criterion (BIC)
| Model | Number of parameters | −2LL | BIC |
|---|---|---|---|
| I | 9 | 5872.2 | 5941.6 |
| II | 12 | 5824.8 | 5917.3 |
| III | 17 | 5810.0 | 5941.1 |
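The relationship between the two columns can be checked with a short sketch. It assumes the standard definition BIC = −2LL + k·ln(n), where k is the number of parameters and n the sample size; the sample size is not stated in this record, so the code only recovers the per-parameter penalty ln(n) implied by each row. The helper name `implied_log_n` is hypothetical. Under that assumption all three rows imply nearly the same penalty (about 7.71 per parameter), and model II has the lowest BIC.

```python
import math

# (model, number of parameters k, -2LL, BIC) as reported in the table above
models = [
    ("I", 9, 5872.2, 5941.6),
    ("II", 12, 5824.8, 5917.3),
    ("III", 17, 5810.0, 5941.1),
]


def implied_log_n(k, neg2ll, bic):
    """Per-parameter penalty ln(n) implied by BIC = -2LL + k * ln(n).

    Assumption: the table uses the standard BIC definition.
    """
    return (bic - neg2ll) / k


# Penalty implied by each row; similar values suggest a common sample size.
penalties = {name: implied_log_n(k, m2ll, bic) for name, k, m2ll, bic in models}

# The preferred model under BIC is the one with the smallest value.
best = min(models, key=lambda row: row[3])[0]

for name, pen in penalties.items():
    print(f"model {name}: implied ln(n) = {pen:.3f}, implied n = {math.exp(pen):.0f}")
print("lowest BIC: model", best)
```

Model III has the smallest −2LL, but its five extra parameters cost more in penalty than they gain in fit, which is why BIC favours model II here.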
Bayesian inference for models without covariates (model I) and with covariates (model II)
| Parameter | Model I: posterior mean (95% CI) | Model II: posterior mean (95% CI) |
|---|---|---|
| π111 | 0.017 (0.006, 0.029) | 0.017 (0.006, 0.029) |
| π112 | 0.002 (0.0001, 0.008) | 0.002 (0.0001, 0.008) |
| π121 | 0.058 (0.040, 0.078) | 0.059 (0.040, 0.079) |
| π122 | 0.115 (0.079, 0.157) | 0.120 (0.086, 0.158) |
| π211 | 0.006 (0.001, 0.015) | 0.006 (0.001, 0.019) |
| π212 | 0.006 (0.001, 0.018) | 0.007 (0.001, 0.019) |
| π221 | 0.019 (0.005, 0.043) | 0.021 (0.006, 0.044) |
| π222 | 0.776 (0.710, 0.831) | 0.768 (0.707, 0.819) |
| τ1 | 0.157 (0.097, 0.218) | |
| τ2 | 0.536 (0.330, 0.694) | |
| β0.1 (intercept) | | −1.899 (−2.787, −1.271) |
| βse.1, Sex | | −0.968 (−1.976, −0.276) |
| βad.1, Advantage | | −0.845 (−1.358, −0.469) |
| β0.2 (intercept) | | −0.292 (−1.480, 0.531) |
| βad.2, Advantage | | −1.370 (−2.220, −0.727) |
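To help read the β rows, the following sketch converts the posterior means to the probability and odds scale. It assumes the covariates enter the cheating probabilities through a logistic link, which this record does not state, and it treats the posterior mean of each coefficient as a plug-in value, which only approximates the posterior mean of the transformed quantity. The covariate coding is also not given here, so "baseline" simply means all covariates equal to zero. The names `expit`, `baseline` and `odds_ratios` are illustrative.

```python
import math


def expit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Posterior means from the table above (plug-in values).
intercepts = {"beta0.1": -1.899, "beta0.2": -0.292}
slopes = {"beta_se.1": -0.968, "beta_ad.1": -0.845, "beta_ad.2": -1.370}

# Assumption: logistic link. Baseline probability when all covariates are zero.
baseline = {name: expit(b) for name, b in intercepts.items()}

# Under the same assumption, exp(beta) is the multiplicative change in the
# odds for a one-unit increase in the covariate.
odds_ratios = {name: math.exp(b) for name, b in slopes.items()}

for name, p in baseline.items():
    print(f"{name}: baseline probability = {p:.3f}")
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.3f}")
```

All three slope estimates are negative with 95% intervals that exclude zero, so under this reading each covariate is associated with lower odds; the intercept β0.2 has an interval that includes zero.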
Fig. 1. Posterior densities of prevalence parameters estimated by using the model with the covariates