Donald J Slymen, Guadalupe X Ayala, Elva M Arredondo, John P Elder.
Abstract
Counting outcomes such as days of physical activity or servings of fruits and vegetables often have distributions that are highly skewed to the right with a preponderance of zeros, posing analytical challenges. This paper demonstrates how such outcomes may be analyzed with several modifications to Poisson regression. Five regression models, 1) Poisson, 2) overdispersed Poisson, 3) negative binomial, 4) zero-inflated Poisson (ZIP), and 5) zero-inflated negative binomial (ZINB), are fitted to data assessing predictors of vigorous physical activity (VPA) among Latina women. The models are described, and analytical and graphical approaches are discussed to aid in model selection. Poisson regression provided a poor fit to these data, in which 82% of the subjects reported no days of VPA. The fit improved considerably with the negative binomial and ZIP models. There was little difference in fit between the ZIP and ZINB models. Overall, the ZIP model fit best. Reporting no days of VPA was associated with poorer self-reported health and less assimilation to Anglo culture, and marginally associated with increasing BMI. The intensity portion of the model suggested that increasing days of VPA were associated with more education, and marginally associated with increasing age. These underutilized models provide useful approaches for handling counting outcomes.
Year: 2006 PMID: 16551368 PMCID: PMC1448198 DOI: 10.1186/1742-5573-3-3
Source DB: PubMed Journal: Epidemiol Perspect Innov ISSN: 1742-5573
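The zero-inflated Poisson model named in the abstract can be illustrated with a minimal sketch of its probability function, written here in standard-library Python. It follows the usual textbook definition: with probability `pi_zero` a subject is a structural zero, and otherwise the count follows a Poisson distribution with mean `lam`. The function names and the parameter values are assumptions chosen for illustration only; they are not estimates from the paper.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of observing exactly k events with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, pi_zero: float, lam: float) -> float:
    """Zero-inflated Poisson probability of observing exactly k events.

    pi_zero is the probability of a structural zero; lam is the Poisson
    mean among subjects who are not structural zeros.
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


# Hypothetical parameter values, chosen only for illustration.
pi_zero, lam = 0.8, 3.0
for k in range(4):
    print(k, round(poisson_pmf(k, lam), 4), round(zip_pmf(k, pi_zero, lam), 4))
```

Under these illustrative values almost all of the probability at zero comes from the structural-zero component, which is the feature that lets the ZIP model accommodate a large excess of zeros.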
Descriptive Statistics for Number of days of VPA and Predictor Variables
| Variable | Frequency (%) or Mean (SD) |
| Number of days/week of VPA of 20 minutes or more | |
| 0 | 294 (82.4) |
| 1 | 10 (2.8) |
| 2 | 12 (3.4) |
| 3 | 17 (4.8) |
| 4 | 7 (2.0) |
| 5 | 10 (2.8) |
| 6 | 0 (0.0) |
| 7 | 7 (2.0) |
| Current employment status | |
| Full-time | 91 (25.5) |
| Part-time | 50 (14.0) |
| Self-employed | 33 (9.2) |
| Homemaker/other unemployed | 183 (51.3) |
| Years of formal education | |
| Through 6th grade | 95 (26.6) |
| Middle school | 89 (24.9) |
| High school | 76 (21.3) |
| Any college | 97 (27.2) |
| Marital status | |
| Married | 280 (79.1) |
| Not married | 74 (20.9) |
| Smoked cigarettes past 30 days | |
| Yes | 52 (14.6) |
| No | 304 (85.4) |
| Self-reported health | |
| Excellent (1) | 20 (5.6) |
| Very good (2) | 34 (9.6) |
| Good (3) | 103 (28.9) |
| Fair (4) | 183 (51.4) |
| Poor (5) | 16 (4.5) |
| Numerical score mean (sd) | 3.4 (0.93) |
| Body Mass Index | 29.6 (5.56) |
| Household size | 4.7 (1.78) |
| Age (years) | 39.7 (9.93) |
| Acculturation score | -1.82 (0.90) |
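The frequency distribution of VPA days in the table above can be used for a quick check of overdispersion. The sketch below computes the sample mean and variance of the outcome from the tabulated counts; a variance well above the mean suggests that a plain Poisson model may be too restrictive. This is an unadjusted summary that ignores the predictors, so it is only a rough diagnostic and not the analysis reported in the paper.

```python
# Days per week of VPA -> number of women, copied from the table above.
vpa_counts = {0: 294, 1: 10, 2: 12, 3: 17, 4: 7, 5: 10, 6: 0, 7: 7}

n = sum(vpa_counts.values())
mean = sum(days * freq for days, freq in vpa_counts.items()) / n

# Sample variance with an n - 1 denominator.
variance = sum(
    freq * (days - mean) ** 2 for days, freq in vpa_counts.items()
) / (n - 1)

print(f"n = {n}")
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
print(f"variance-to-mean ratio = {variance / mean:.2f}")
```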
Poisson, Over-dispersed Poisson using GEE, and Negative Binomial Models for Number of Days of VPA
| Variable | Poisson: Beta | Poisson: SE | Poisson: p-value | Over-dispersed Poisson: SE | Over-dispersed Poisson: p-value | Negative Binomial: Beta | Negative Binomial: SE | Negative Binomial: p-value |
| Current employment status | | | | | | | | |
| Full-time | Reference | | | Reference | | Reference | | |
| Part-time | 0.474 | 0.235 | 0.044 | 0.419 | 0.26 | 0.274 | 0.569 | 0.63 |
| Self-employed | 0.595 | 0.274 | 0.030 | 0.512 | 0.25 | 0.177 | 0.716 | 0.81 |
| Homemaker/other unemployed | 0.493 | 0.201 | 0.014 | 0.405 | 0.22 | 0.434 | 0.450 | 0.33 |
| Years of formal education | | | | | | | | |
| Through 6th grade | Reference | | | Reference | | Reference | | |
| Middle school | -0.006 | 0.249 | 0.98 | 0.431 | 0.99 | 0.356 | 0.527 | 0.50 |
| High school | 0.349 | 0.237 | 0.14 | 0.471 | 0.46 | 0.236 | 0.536 | 0.66 |
| Any college | 0.369 | 0.231 | 0.11 | 0.445 | 0.41 | -0.041 | 0.605 | 0.95 |
| Smoked cigarettes past 30 days (Y/N) | -0.398 | 0.217 | 0.066 | 0.361 | 0.27 | -0.884 | 0.552 | 0.11 |
| Body Mass Index | -0.069 | 0.016 | <0.0001 | 0.025 | 0.006 | -0.094 | 0.042 | 0.025 |
| Marital status (Married/Not married) | -0.194 | 0.212 | 0.36 | 0.423 | 0.65 | -0.416 | 0.477 | 0.38 |
| Self-reported health (1=Excellent, 5=Poor) | -0.357 | 0.072 | <0.0001 | 0.123 | 0.004 | -0.572 | 0.233 | 0.014 |
| Household size | 0.001 | 0.046 | 0.99 | 0.108 | 0.99 | -0.044 | 0.089 | 0.63 |
| Age (5 yr interval) | -0.019 | 0.040 | 0.64 | 0.089 | 0.84 | -0.083 | 0.090 | 0.36 |
| Acculturation score | 0.267 | 0.086 | 0.002 | 0.128 | 0.037 | 0.556 | 0.271 | 0.040 |
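The coefficients in the table above are on the log scale, so exponentiating a Beta gives a rate ratio. The sketch below converts a Beta and SE pair into a rate ratio with a Wald 95% confidence interval and a two-sided p-value from the normal distribution. The tabulated p-values appear consistent with this kind of Wald test, but that is an inference from the numbers and not something the table states; the function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(beta: float, se: float, z_crit: float = 1.96):
    """Return (rate ratio, CI lower, CI upper, two-sided p-value)
    for a coefficient estimated on the log scale."""
    z = beta / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
        p_value,
    )


# Beta and SE pairs taken from the table above.
rows = {
    "Poisson, acculturation score": (0.267, 0.086),
    "Negative binomial, acculturation score": (0.556, 0.271),
    "Negative binomial, body mass index": (-0.094, 0.042),
}
for label, (beta, se) in rows.items():
    rr, lo, hi, p = wald_summary(beta, se)
    print(f"{label}: RR = {rr:.3f} (95% CI {lo:.3f}, {hi:.3f}), p = {p:.3f}")
```

Small differences from the tabulated p-values are to be expected because the published Beta and SE values are rounded.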
Model Fit Characteristics: Log-likelihood and Akaike's Information Criterion
| Model | Log-likelihood | AIC |
| Poisson | -421.5 | 871.1 |
| Negative binomial | -282.8 | 595.6 |
| Zero-inflated Poisson | -253.3 | 562.5 |
| Zero-inflated negative binomial | -253.5 | 565.0 |
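The relationship between the two columns above can be sketched with the standard definition of Akaike's Information Criterion, AIC = -2 × log-likelihood + 2 × (number of parameters), where a lower value indicates a better trade-off between fit and complexity. The parameter counts below are assumptions inferred from the predictor list (13 slopes plus an intercept per linear predictor, and one extra dispersion parameter for the negative binomial models); the table itself does not report them. Under those assumptions the computed values agree with the tabulated AICs to within rounding of the log-likelihoods.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: -2 * log-likelihood + 2 * parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Log-likelihoods are from the table above. Parameter counts are ASSUMED:
# 13 slopes + 1 intercept per linear predictor, +1 for NB dispersion.
models = {
    "Poisson": (-421.5, 14),
    "Negative binomial": (-282.8, 15),
    "Zero-inflated Poisson": (-253.3, 28),
    "Zero-inflated negative binomial": (-253.5, 29),
}

for name, (loglik, k) in models.items():
    print(f"{name}: AIC = {aic(loglik, k):.1f}")

best = min(models, key=lambda name: aic(*models[name]))
print("Lowest AIC:", best)
```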
Figure 1. Observed minus expected probabilities for four models.
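The quantity plotted in the figure, observed minus expected probability at each count, can be illustrated with a simplified sketch. The code below compares the observed proportions of VPA days with the probabilities from a Poisson distribution whose mean equals the sample mean. This intercept-only comparison is an assumption made for illustration; the models in the paper include predictors, so the figure's values will differ.

```python
import math

# Days per week of VPA -> number of women, from the descriptive statistics.
vpa_counts = {0: 294, 1: 10, 2: 12, 3: 17, 4: 7, 5: 10, 6: 0, 7: 7}
n = sum(vpa_counts.values())
sample_mean = sum(days * freq for days, freq in vpa_counts.items()) / n


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of observing exactly k events with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Observed proportion minus Poisson-expected probability for each count.
differences = {
    days: freq / n - poisson_pmf(days, sample_mean)
    for days, freq in vpa_counts.items()
}
for days, diff in differences.items():
    print(f"{days} days: observed - expected = {diff:+.3f}")
```

A large positive difference at zero together with a large negative difference at one day is the pattern typically associated with excess zeros.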
Zero-inflated Poisson Model for Number of Days of VPA
| Variable | Logistic Portion¹: Odds Ratio | Logistic Portion¹: 95% CI | Logistic Portion¹: p-value | Poisson Portion: Rate Ratio | Poisson Portion: 95% CI | Poisson Portion: p-value |
| Current employment status | | | | | | |
| Full-time | Reference | | | Reference | | |
| Part-time | 0.395 | 0.147, 1.071 | 0.068 | 0.635 | 0.368, 1.098 | 0.10 |
| Self-employed | 0.645 | 0.194, 2.147 | 0.47 | 1.080 | 0.544, 2.143 | 0.83 |
| Homemaker/other unemployed | 0.603 | 0.265, 1.370 | 0.23 | 0.866 | 0.544, 1.379 | 0.54 |
| Years of formal education | | | | | | |
| Through 6th grade | Reference | | | Reference | | |
| Middle school | 1.370 | 0.521, 3.597 | 0.52 | 1.331 | 0.748, 2.370 | 0.33 |
| High school | 1.649 | 0.609, 4.482 | 0.32 | 2.119 | 1.208, 3.706 | 0.009 |
| Any college | 1.608 | 0.604, 4.263 | 0.34 | 2.121 | 1.185, 3.781 | 0.012 |
| Smoked cigarettes past 30 days (Y/N) | 1.465 | 0.563, 3.819 | 0.43 | 0.907 | 0.513, 1.603 | 0.74 |
| Body Mass Index | 1.066 | 0.997, 1.140 | 0.062 | 0.987 | 0.935, 1.042 | 0.63 |
| Marital status (Married/Not married) | 1.511 | 0.629, 3.633 | 0.35 | 1.174 | 0.732, 1.880 | 0.51 |
| Self-reported health (1=Excellent, 5=Poor) | 1.481 | 1.057, 2.077 | 0.023 | 0.932 | 0.764, 1.137 | 0.49 |
| Household size | 1.094 | 0.897, 1.335 | 0.38 | 1.040 | 0.953, 1.135 | 0.38 |
| Age (years) | 1.111 | 0.937, 1.323 | 0.23 | 1.089 | 0.994, 1.191 | 0.068 |
| Acculturation score | 0.574 | 0.388, 0.849 | 0.006 | 0.840 | 0.676, 1.044 | 0.12 |
1 Models the probability of no vigorous physical activity
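The two portions of the zero-inflated Poisson table combine as follows: an odds ratio acts on the odds of reporting no VPA, and a rate ratio multiplies the expected number of days among those who are not structural zeros. The sketch below applies the self-reported health estimates from the table to a baseline. The baseline probability and rate are hypothetical values chosen for illustration, and the function names are assumptions; neither comes from the paper.

```python
def shift_probability(prob: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability; return the new probability."""
    odds = prob / (1.0 - prob) * odds_ratio
    return odds / (1.0 + odds)


def zip_mean(pi_zero: float, lam: float) -> float:
    """Expected count under a zero-inflated Poisson model: (1 - pi_zero) * lam."""
    return (1.0 - pi_zero) * lam


# HYPOTHETICAL baseline values, not estimates from the paper.
base_pi_zero, base_lam = 0.80, 3.0

# Estimates for a one-point worsening in self-reported health, from the
# table above: odds ratio 1.481 (logistic portion), rate ratio 0.932
# (Poisson portion).
new_pi_zero = shift_probability(base_pi_zero, 1.481)
new_lam = base_lam * 0.932

print(f"P(no VPA): {base_pi_zero:.3f} -> {new_pi_zero:.3f}")
print(f"Expected days: {zip_mean(base_pi_zero, base_lam):.3f} -> "
      f"{zip_mean(new_pi_zero, new_lam):.3f}")
```

An odds ratio above 1 in the logistic portion raises the probability of reporting no VPA, while a rate ratio below 1 in the Poisson portion lowers the expected number of days among the remaining subjects, so both effects reduce the overall expected count in this illustration.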