Models for analyzing zero-inflated and overdispersed count data: an application to cigarette and marijuana use.

Brian Pittman, Eugenia Buta, Suchitra Krishnan-Sarin, Stephanie S O'Malley, Thomas Liss, Ralitza Gueorguieva.

Abstract

INTRODUCTION: This paper describes different methods for analyzing counts and illustrates their use on cigarette and marijuana smoking data.
METHODS: The Poisson, zero-inflated Poisson (ZIP), hurdle Poisson (HUP), negative binomial (NB), zero-inflated negative binomial (ZINB), and hurdle negative binomial (HUNB) regression models are considered. The approaches are evaluated in terms of their ability to account for zero-inflation (extra zeros) and overdispersion (variance larger than expected) in count outcomes, with emphasis on model fit, interpretation, and choosing an appropriate model given the nature of the data. The illustrative example uses cigarette and marijuana smoking reports from a study of smoking habits among youth e-cigarette users, with gender, age, and e-cigarette use included as predictors.
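The difference between the zero-inflated and hurdle approaches named above can be sketched with their probability mass functions. The snippet below is a minimal illustration, not the authors' implementation: it omits covariates, and the function names and the parameters `mu` (Poisson mean), `pi` (probability of a structural zero), and `p0` (probability of a zero in the hurdle model) are assumptions introduced here.

```python
import math


def poisson_pmf(k, mu):
    """Probability of observing count k under a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def zip_pmf(k, mu, pi):
    """Zero-inflated Poisson.

    With probability pi the observation is a structural zero; otherwise it is
    drawn from Poisson(mu), which can itself produce a zero. Zeros therefore
    come from two sources.
    """
    base = (1 - pi) * poisson_pmf(k, mu)
    return pi + base if k == 0 else base


def hurdle_poisson_pmf(k, mu, p0):
    """Hurdle Poisson.

    All zeros come from a single binary process (probability p0). Positive
    counts follow a Poisson(mu) truncated at zero, so the count part never
    produces a zero.
    """
    if k == 0:
        return p0
    return (1 - p0) * poisson_pmf(k, mu) / (1 - poisson_pmf(0, mu))


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    mu, extra_zero_prob = 2.0, 0.3
    print("Poisson P(0):", poisson_pmf(0, mu))
    print("ZIP     P(0):", zip_pmf(0, mu, extra_zero_prob))
    print("Hurdle  P(0):", hurdle_poisson_pmf(0, mu, extra_zero_prob))
```

In the zero-inflated form, the probability of a zero always exceeds `pi` because the count process contributes zeros as well; in the hurdle form it equals `p0` exactly.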
RESULTS: Of the 69 subjects available for analysis, 36% and 64% reported smoking no cigarettes and no marijuana, respectively, suggesting both outcomes might be zero-inflated. Both outcomes were also overdispersed with large positive skew. The ZINB and HUNB models fit the cigarette counts best. According to goodness-of-fit statistics, the NB, HUNB, and ZINB models fit the marijuana data well, but the ZINB provided better interpretation.
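The two descriptive checks mentioned in the results, the share of zeros and overdispersion, can be approximated before any model is fitted. The following is a rough sketch under stated assumptions: the function name `count_diagnostics` is hypothetical, the counts are made-up values rather than the study data, and comparing the observed zero share with the share implied by a Poisson distribution with the same mean is only an informal screen, not a formal test.

```python
import math
from statistics import mean, variance


def count_diagnostics(counts):
    """Summarize a list of non-negative counts.

    Returns the sample mean, the sample variance, their ratio (values well
    above 1 suggest overdispersion relative to a Poisson distribution), the
    observed proportion of zeros, and the proportion of zeros a Poisson
    distribution with the same mean would imply.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    return {
        "mean": m,
        "variance": v,
        "dispersion_ratio": v / m,
        "observed_zero": observed_zero,
        "poisson_zero": math.exp(-m),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; NOT the study data.
    toy_counts = [0, 0, 0, 0, 1, 2, 3, 5, 10, 20]
    for name, value in count_diagnostics(toy_counts).items():
        print(f"{name}: {value:.3f}")
```

A dispersion ratio far above 1 together with an observed zero share well above the Poisson-implied share would point toward the NB, zero-inflated, or hurdle models rather than the plain Poisson model.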
CONCLUSION: In the absence of zero-inflation, the NB model provides a good fit to smoking data, which are typically overdispersed. In the presence of zero-inflation, the ZINB or HUNB model is recommended to account for additional heterogeneity. Beyond model fit and interpretability, the choice between a zero-inflated and a hurdle model should ultimately depend on the assumptions regarding the zeros, the study design, and the research question being asked.
IMPLICATIONS: Count outcomes are frequent in tobacco research and often have many zeros and exhibit large variance and skew. Analyzing such data with methods that require a normally distributed outcome is inappropriate and will likely produce spurious results. This study compares and contrasts appropriate methods for analyzing count data, specifically those with an overabundance of zeros, and illustrates their use on cigarette and marijuana smoking data. Recommendations are provided.
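To make the role of the NB and ZINB models concrete, the sketch below writes out their probability mass functions and checks numerically that the NB variance exceeds its mean. It assumes the common mean-dispersion ("NB2") parameterization, in which the variance is `mu + alpha * mu**2`; the abstract does not state which parameterization the authors used, and the function names and parameter values are hypothetical.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial pmf with mean mu and dispersion alpha (NB2 form).

    Under this parameterization the variance is mu + alpha * mu**2, so any
    alpha > 0 gives a variance larger than the mean (overdispersion).
    """
    r = 1.0 / alpha
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated NB: a structural zero with probability pi, else NB(mu, alpha)."""
    base = (1 - pi) * nb_pmf(k, mu, alpha)
    return pi + base if k == 0 else base


def moments(pmf, upper=3000):
    """Numerically approximate the mean and variance of a count distribution."""
    probs = [pmf(k) for k in range(upper)]
    m = sum(k * p for k, p in enumerate(probs))
    v = sum((k - m) ** 2 * p for k, p in enumerate(probs))
    return m, v


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    mu, alpha, pi = 3.0, 1.5, 0.25
    print("NB   mean, variance:", moments(lambda k: nb_pmf(k, mu, alpha)))
    print("ZINB mean, variance:", moments(lambda k: zinb_pmf(k, mu, alpha, pi)))
    print("NB   P(0):", nb_pmf(0, mu, alpha))
    print("ZINB P(0):", zinb_pmf(0, mu, alpha, pi))
```

The NB model handles overdispersion through `alpha` alone, while the ZINB model adds a separate source of zeros through `pi`, which is the additional heterogeneity the conclusion refers to.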

Year: 2018; PMID: 29912423; PMCID: PMC7364829; DOI: 10.1093/ntr/nty072

Source DB: PubMed; Journal: Nicotine Tob Res; ISSN: 1462-2203; Impact factor: 4.244


