Skip Garibaldi, Kayla Frisoli, Li Ke, Melody Lim.
Abstract
We analyze the spending of individuals in the United States on lottery tickets in an average month, as reported in surveys. We view these surveys as samples from an unknown distribution. We use non-parametric methods to compare properties of this distribution across demographic groups and to test claims that some of its properties are constant across surveys. We find that the observed higher spending by Hispanic lottery players can be attributed to differences in education levels, and we dispute previous claims that the top 10% of lottery players consistently account for 50% of lottery sales.
Year: 2015 PMID: 25642699 PMCID: PMC4314186 DOI: 10.1371/journal.pone.0115730
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Kernel density plot and boxplot of nonzero values of reported lottery spending in an average month, obtained by combining the results of the Indiana, Gallup, and Texas surveys.
Fig 2. Q-Q plots showing (left) that the data differ from survey to survey and (right) that the data are also non-normal.
In each panel, if the two samples were drawn from the same distribution, the plotted points would lie approximately on the diagonal line joining the lower-left and upper-right corners.
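The reading of a Q-Q panel described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' plotting code: the function name `qq_points` and the choice of 99 percentile points are assumptions made here.

```python
import statistics

def qq_points(sample_a, sample_b, n_points=99):
    """Pair up matching quantiles of two samples (hypothetical helper).

    Each returned pair (qa, qb) holds the same quantile level of
    sample_a and sample_b. If both samples come from the same
    distribution, the pairs fall close to the diagonal qa == qb.
    """
    # statistics.quantiles returns n - 1 cut points for n intervals.
    qa = statistics.quantiles(sample_a, n=n_points + 1, method="inclusive")
    qb = statistics.quantiles(sample_b, n=n_points + 1, method="inclusive")
    return list(zip(qa, qb))

def max_gap_from_diagonal(points):
    """Largest vertical distance of any Q-Q point from the diagonal."""
    return max(abs(qb - qa) for qa, qb in points)
```

Plotting the pairs from `qq_points` against the line y = x gives a two-sample Q-Q plot. For the normality panel, the second sample would be replaced by quantiles of a fitted normal distribution.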
Spending levels by various demographic groups.
The 95% confidence interval reports the smallest median of difference that would produce a p-value of 0.05.
| Demographic group | median of difference | est. p-value | 95% confidence interval |
|---|---|---|---|
| Male | 1 | 0.83 | |
| African American | 6 | < 0.01 | > 2 |
| Hispanic | 3 | < 0.01 | > 2 |
| HS dropout | 5 | < 0.01 | > 1 |
| cell phone | 2 | 0.66 | |
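A non-parametric comparison of this kind can be sketched as a two-sample permutation test. The sketch below assumes the test statistic is the difference between the two group medians; the table's "median of difference" may be defined differently in the full paper, and the function name, permutation count, and seed are hypothetical.

```python
import random
import statistics

def median_diff_permutation_test(group, rest, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of medians.

    Assumption: the statistic is median(group) - median(rest).
    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)
    observed = statistics.median(group) - statistics.median(rest)
    pooled = list(group) + list(rest)
    k = len(group)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        diff = statistics.median(pooled[:k]) - statistics.median(pooled[k:])
        if abs(diff) >= abs(observed):
            hits += 1
    # The +1 terms count the observed labelling as one permutation,
    # so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A small estimated p-value means that a difference as large as the observed one rarely arises when group labels are assigned at random.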
Representation of demographic groups in the top 20% of lottery players.
P-values and confidence intervals are calculated using a binomial distribution based on the percentage of adults in the sample.
| Demographic group | Percentage of adults in sample | Percentage of heaviest players | p-value | 95% confidence interval |
|---|---|---|---|---|
| Male | 50% | 55% | 0.005 | ≥ 53% |
| African American | 11% | 18% | < 0.001 | ≥ 13% |
| Hispanic | 14% | 20% | < 0.001 | ≥ 17% |
| HS dropout | 5% | 9% | < 0.001 | ≥ 7% |
| cell phone | 15% | 19% | 0.011 | ≥ 18% |
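The binomial calculation behind a table like this can be sketched as follows. The sketch assumes a one-sided exact test: if a group makes up a share `p0` of all adults, how surprising is it to see at least `k` group members among `n` heaviest players? The function names are hypothetical, the lower bound is a Clopper-Pearson style bound found by bisection, and the source does not state the sample sizes, so any counts passed in are illustrative only.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): a one-sided exact p-value."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i)
               for i in range(k, n + 1))

def lower_confidence_bound(k, n, alpha=0.05, iterations=60):
    """One-sided lower bound on the true share, found by bisection.

    Returns the proportion p at which observing k or more successes
    out of n has probability alpha (Clopper-Pearson style bound).
    """
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        # The upper tail grows with p, so a tail below alpha means
        # mid is still too small to be inside the interval.
        if binomial_upper_tail(k, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical counts, NOT taken from the surveys: 220 men among
# 400 heaviest players, tested against a 50% share of adults.
p_value = binomial_upper_tail(220, 400, 0.50)
bound = lower_confidence_bound(220, 400)
```

Because the counts above are invented, `p_value` and `bound` will not reproduce the table's entries; they only show the shape of the calculation.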
Columns 2 and 3 report the % of total lottery spending due to the top 10% and 20% of spenders respectively.
Column 4 reports the skewness and column 5 reports the skewness-adjusted kurtosis defined in [43] for the spending reported in each survey. The bottom three rows give the results of a 6-sample permutation test whose test statistic is the standard deviation of the values in the rows above.
| Survey | top 10% | top 20% | skewness | adj. kurtosis |
|---|---|---|---|---|
| Texas | 55% | 72% | 4.2 | 3.3 |
| Gallup | 62% | 75% | 10.2 | 5.2 |
| Indiana | 56% | 70% | 10.5 | 7.0 |
| CBS | 45% | 64% | 3.4 | 3.1 |
| Kentucky | 38% | 55% | 4.5 | 4.3 |
| Minnesota | 63% | 76% | 6.3 | 4.2 |
| Standard deviation | 9.8% | 7.8% | 3.1 | 1.4 |
| Estimated p-value | ≈ 0 | ≈ 0 | 0.03 | 0.58 |
| 95% confidence interval | ≥ 5.8% | ≥ 4.2% | ≥ 2.93 | |
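The quantities in this table can be sketched in Python. The code below is a minimal illustration under stated assumptions, not the authors' implementation: `top_share` computes the fraction of total spending due to the heaviest spenders, and `spread_permutation_test` pools all surveys, reshuffles respondents into groups of the original sizes, and asks how often the standard deviation of the per-survey statistic is at least as large as the observed one. All function names, the permutation count, and the seed are hypothetical.

```python
import random
import statistics

def top_share(spending, fraction=0.10):
    """Share of total spending due to the top `fraction` of spenders.

    Assumes nonzero total spending.
    """
    ordered = sorted(spending, reverse=True)
    k = max(1, round(fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)

def spread_permutation_test(surveys, stat=top_share, n_perm=1000, seed=0):
    """Multi-sample permutation test on the spread of a statistic.

    Test statistic: standard deviation of `stat` across the surveys.
    Returns (observed standard deviation, estimated p-value).
    """
    rng = random.Random(seed)
    sizes = [len(s) for s in surveys]
    observed = statistics.stdev([stat(s) for s in surveys])
    pooled = [x for s in surveys for x in s]
    hits = 0
    for _ in range(n_perm):
        # Redistribute respondents across surveys at random,
        # keeping each survey's sample size fixed.
        rng.shuffle(pooled)
        stats, start = [], 0
        for size in sizes:
            stats.append(stat(pooled[start:start + size]))
            start += size
        if statistics.stdev(stats) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Passing `stat=lambda s: top_share(s, 0.20)` would give the top 20% column. A small p-value indicates that the statistic varies more between surveys than random regrouping of the pooled respondents would explain.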