Stéphanie Baggio, Katia Iglesias, Valentin Rousson.
Abstract
Analyzing count data is frequent in addiction studies but may be cumbersome and time-consuming, and may lead to misleading inference if models are not correctly specified. We compared different statistical models in a simulation study to provide simple, yet valid, recommendations when analyzing count data. We used 2 simulation studies to test the performance of 7 statistical models (classical or quasi-Poisson regression, classical or zero-inflated negative binomial regression, classical or heteroskedasticity-consistent linear regression, and Mann-Whitney test) for predicting the differences between population means for 9 different population distributions (Poisson, negative binomial, zero- and one-inflated Poisson and negative binomial, uniform, left-skewed, and bimodal). We considered a large number of scenarios likely to occur in addiction research: presence of outliers, unbalanced design, and the presence of confounding factors. In unadjusted models, the Mann-Whitney test was the best model, followed closely by the heteroskedasticity-consistent linear regression and quasi-Poisson regression. Poisson regression was by far the worst model. In adjusted models, quasi-Poisson regression was the best model. If the goal is to compare 2 groups with respect to count data, a simple recommendation would be to use quasi-Poisson regression, which was the most generally valid model in our extensive simulations.
Keywords: coverage of confidence interval; guidelines; simulation; substance use; type 1 error
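The abstract gives no code, so the following is only a minimal sketch of what an unadjusted quasi-Poisson comparison of 2 groups could look like. It relies on the fact that, with a single binary predictor and a log link, the fitted means equal the group sample means, so the estimate has a closed form. The function name `quasi_poisson_two_group`, the example counts, and the use of a normal-quantile Wald interval are all assumptions made for illustration (statistical software may use a t quantile instead); none of them come from the study.

```python
import math
from statistics import NormalDist, mean


def quasi_poisson_two_group(y0, y1, level=0.95):
    """Unadjusted quasi-Poisson comparison of two groups of counts.

    Hypothetical helper, not taken from the paper. With one binary
    predictor the Poisson GLM fit is available in closed form.
    """
    m0, m1 = mean(y0), mean(y1)
    if m0 <= 0 or m1 <= 0:
        raise ValueError("both groups need a positive mean count")
    n = len(y0) + len(y1)

    # Log rate ratio: the slope of a log-link model on the group indicator.
    log_rr = math.log(m1 / m0)

    # Pearson dispersion estimate: chi-square statistic over residual df
    # (n observations minus 2 estimated coefficients).
    pearson = sum((y - m0) ** 2 / m0 for y in y0)
    pearson += sum((y - m1) ** 2 / m1 for y in y1)
    dispersion = pearson / (n - 2)

    # Classical Poisson standard error of the log rate ratio, then the
    # quasi-Poisson version, inflated by the square root of the dispersion.
    se_poisson = math.sqrt(1 / sum(y0) + 1 / sum(y1))
    se_quasi = math.sqrt(dispersion) * se_poisson

    # Wald interval with a normal quantile (an assumption of this sketch).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "rate_ratio": math.exp(log_rr),
        "dispersion": dispersion,
        "se_poisson": se_poisson,
        "se_quasi": se_quasi,
        "ci": (math.exp(log_rr - z * se_quasi), math.exp(log_rr + z * se_quasi)),
    }


# Made-up, overdispersed counts used purely for illustration.
control = [0, 0, 1, 2, 2, 3, 5, 8, 12, 20]
treated = [0, 1, 1, 3, 4, 6, 9, 15, 22, 30]
result = quasi_poisson_two_group(control, treated)
print(result)
```

When the estimated dispersion exceeds 1, the quasi-Poisson standard error is wider than the classical Poisson one, which is consistent with the abstract's finding that plain Poisson regression performed worst.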
Year: 2017 PMID: 29027305 PMCID: PMC6877188 DOI: 10.1002/mpr.1585
Source DB: PubMed Journal: Int J Methods Psychiatr Res ISSN: 1049-8931 Impact factor: 4.035