William D Johnson, Jeffrey H Burton, Robbie A Beyl, Jacob E Romer.
Abstract
Zero-inflated distributions are common in statistical problems where there is interest in testing homogeneity of two or more independent groups. Often, the underlying distribution that has an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparing the groups in terms of their respective percentiles may be appropriate, as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes, but it may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with the comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneous testing of these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size needed to achieve suitable test performance in specific examples.
Keywords: Asymptotic Chi-Square Test; Equality of Quantiles; Large Sample Test; Nonparametric Test; Percentile Profiles; Zero-Inflated Distributions
Year: 2015 PMID: 26634181 PMCID: PMC4664523 DOI: 10.4236/ojs.2015.56050
Source DB: PubMed Journal: Open J Stat ISSN: 2161-718X
Example of a contingency table for testing homogeneity of a three-percentile profile.
| Sample | Bin 1 | Bin 2 | Bin 3 | Bin 4 | Total |
|---|---|---|---|---|---|
| 1 | 66 | 59 | 48 | 47 | 220 |
| 2 | 35 | 50 | 49 | 62 | 196 |
| 3 | 55 | 47 | 59 | 47 | 208 |
| Total | 156 | 156 | 156 | 156 | 624 |
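As a rough illustration of how a table like this can be tested, the sketch below computes the ordinary Pearson chi-square statistic for an r × c table of counts. It assumes the proposed test reduces to this standard statistic with (r − 1)(c − 1) degrees of freedom; the function name `pearson_chi_square` is hypothetical and not taken from the paper.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under homogeneity: row total x column total / grand total.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Counts copied from the table above (three samples, four bins).
counts = [
    [66, 59, 48, 47],
    [35, 50, 49, 62],
    [55, 47, 59, 47],
]
statistic, df = pearson_chi_square(counts)
print(f"chi-square = {statistic:.2f} on {df} df")  # prints: chi-square = 14.09 on 6 df
```

Under these assumptions the statistic exceeds 12.59, the upper 5% point of the chi-square distribution with 6 degrees of freedom, so homogeneity of the three samples would be rejected at that level for this example.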
Example of a contingency table for testing homogeneity of a three-percentile profile with an added bin for zeros.
| Sample | Bin 1 | Bin 2 | Bin 3 | Bin 4 | Bin 5 | Total |
|---|---|---|---|---|---|---|
| 1 | 15 | 51 | 59 | 48 | 47 | 220 |
| 2 | 28 | 7 | 50 | 49 | 62 | 196 |
| 3 | 42 | 13 | 47 | 59 | 47 | 208 |
| Total | 85 | 71 | 156 | 156 | 156 | 624 |
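One way such a table might be built from raw data is sketched below. It rests on several assumptions that should be checked against the paper: cut points are percentiles of the pooled sample (zeros included), bins are right-closed, and the zero bin is split off from the lowest percentile bin, which is how the column totals above appear to be arranged (85 + 71 = 156, matching the other bins). The helper name `zero_inflated_table`, the percentile definition, and the toy data are all hypothetical.

```python
def zero_inflated_table(samples, percentiles):
    """Count observations per bin: one zero bin, then bins cut at pooled percentiles."""
    pooled = sorted(x for sample in samples for x in sample)
    n = len(pooled)
    # Assumed percentile definition: the order statistic of rank ceil(p * n / 100),
    # computed with integer arithmetic (p must be an integer percentage).
    cuts = [pooled[max(1, (p * n + 99) // 100) - 1] for p in percentiles]
    table = []
    for sample in samples:
        row = [0] * (len(cuts) + 2)  # zero bin + (number of cuts + 1) percentile bins
        for x in sample:
            if x == 0:
                row[0] += 1
            else:
                # Right-closed bins: count the cut points strictly below x.
                row[1 + sum(x > c for c in cuts)] += 1
        table.append(row)
    return table


# Toy data, invented for illustration only.
group_a = [0, 1, 2, 3, 4, 5]
group_b = [0, 6, 7, 8, 9, 10]
print(zero_inflated_table([group_a, group_b], percentiles=[25, 50, 75]))
# prints: [[1, 1, 3, 1, 0], [1, 0, 0, 2, 3]]
```

In the toy output the first two columns together hold 3 of the 12 pooled observations, one quarter of the total, mirroring the pattern in the table above.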
Power simulations for testing Q = (1, 50, 75, 90) with zero-inflated gamma distributions.
| Sample Size | Gamma Distribution Parameters | | | | | |
|---|---|---|---|---|---|---|
| 50 | 0.1 | 0.1 | 0.1301 | 0.2536 | 0.4214 | |
| | | 0.2 | 0.1605 | 0.2575 | 0.3918 | 0.5512 |
| | | 0.3 | 0.4914 | 0.5865 | 0.6786 | 0.7845 |
| 100 | 0.1 | 0.1 | 0.2349 | 0.4961 | 0.7627 | |
| | | 0.2 | 0.3089 | 0.5310 | 0.7169 | 0.8843 |
| | | 0.3 | 0.8419 | 0.9103 | 0.9582 | 0.9832 |
| 200 | 0.1 | 0.1 | 0.4684 | 0.8387 | 0.9785 | |
| | | 0.2 | 0.5975 | 0.8551 | 0.9672 | 0.9969 |
| | | 0.3 | 0.9938 | 0.9982 | 0.9998 | 1.0000 |
| 500 | 0.1 | 0.1 | 0.8954 | 0.9987 | 1.0000 | |
| | | 0.2 | 0.9619 | 0.9994 | 1.0000 | 1.0000 |
| | | 0.3 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| 50 | 0.2 | 0.2 | 0.1212 | 0.2326 | 0.3845 | |
| | | 0.3 | 0.1211 | 0.2033 | 0.3123 | 0.4592 |
| | | 0.4 | 0.3719 | 0.4598 | 0.5620 | 0.6719 |
| 100 | 0.2 | 0.2 | 0.2140 | 0.4599 | 0.7229 | |
| | | 0.3 | 0.2178 | 0.4140 | 0.6173 | 0.8144 |
| | | 0.4 | 0.7015 | 0.8032 | 0.8920 | 0.9539 |
| 200 | 0.2 | 0.2 | 0.4294 | 0.7991 | 0.9691 | |
| | | 0.3 | 0.4218 | 0.7373 | 0.9243 | 0.9893 |
| | | 0.4 | 0.9585 | 0.9878 | 0.9967 | 0.9997 |
| 500 | 0.2 | 0.2 | 0.8592 | 0.9971 | 1.0000 | |
| | | 0.3 | 0.8459 | 0.9932 | 1.0000 | 1.0000 |
| | | 0.4 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
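Empirical power figures of this kind could, in principle, be approximated by Monte Carlo simulation along the following lines. This is a minimal sketch and not the authors' simulation design. It assumes two groups of equal size, a zero bin plus bins cut at the pooled 50th, 75th, and 90th percentiles (reading the leading entry of Q as the zero bin is itself an assumption), a Pearson statistic referred to a chi-square distribution with 4 degrees of freedom, and arbitrary gamma shape and scale values chosen purely for illustration. All function names are hypothetical.

```python
import random

# Upper 5% point of the chi-square distribution with 4 degrees of freedom
# (2 groups x 5 bins gives (2 - 1) * (5 - 1) = 4).
CRITICAL_VALUE = 9.488


def draw_zero_inflated_gamma(n, p_zero, shape, scale, rng):
    """Draw n values that are 0 with probability p_zero, otherwise gamma(shape, scale)."""
    return [0.0 if rng.random() < p_zero else rng.gammavariate(shape, scale)
            for _ in range(n)]


def binned_table(samples, percentiles):
    """Zero bin plus right-closed bins cut at pooled-sample percentiles (assumed layout)."""
    pooled = sorted(x for sample in samples for x in sample)
    n = len(pooled)
    cuts = [pooled[max(1, (p * n + 99) // 100) - 1] for p in percentiles]
    table = []
    for sample in samples:
        row = [0] * (len(cuts) + 2)
        for x in sample:
            index = 0 if x == 0 else 1 + sum(x > c for c in cuts)
            row[index] += 1
        table.append(row)
    return table


def pearson_statistic(table):
    """Pearson chi-square statistic; bins that are empty in every group add nothing."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if col_totals[j] == 0:
                continue
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def estimate_power(n, p_zero_1, p_zero_2, gamma_1, gamma_2, reps=1000, seed=2015):
    """Fraction of simulated data sets in which homogeneity is rejected at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        group_1 = draw_zero_inflated_gamma(n, p_zero_1, gamma_1[0], gamma_1[1], rng)
        group_2 = draw_zero_inflated_gamma(n, p_zero_2, gamma_2[0], gamma_2[1], rng)
        table = binned_table([group_1, group_2], percentiles=(50, 75, 90))
        if pearson_statistic(table) > CRITICAL_VALUE:
            rejections += 1
    return rejections / reps


# Hypothetical scenario: zero proportions 0.1 vs 0.3, identical gamma(2, 1) components.
print(estimate_power(100, 0.1, 0.3, (2.0, 1.0), (2.0, 1.0)))
```

Because the gamma parameters and group structure here are placeholders, the printed estimate is not expected to reproduce any particular cell of the table above.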
Power simulations for testing Q = (1, 50, 75, 90) with zero-inflated Poisson distributions.
| Sample Size | Poisson Distribution Parameters | | | | | |
|---|---|---|---|---|---|---|
| 50 | 0.1 | 0.1 | 0.0876 | 0.2476 | 0.5205 | |
| | | 0.2 | 0.1561 | 0.2080 | 0.3850 | 0.6307 |
| | | 0.3 | 0.4850 | 0.5472 | 0.6791 | 0.8342 |
| 100 | 0.1 | 0.1 | 0.1489 | 0.5009 | 0.8671 | |
| | | 0.2 | 0.2944 | 0.4472 | 0.7256 | 0.9458 |
| | | 0.3 | 0.8241 | 0.8839 | 0.9578 | 0.9937 |
| 200 | 0.1 | 0.1 | 0.2588 | 0.8443 | 0.9960 | |
| | | 0.2 | 0.5757 | 0.7803 | 0.9692 | 0.9995 |
| | | 0.3 | 0.9911 | 0.9974 | 0.9996 | 1.0000 |
| 500 | 0.1 | 0.1 | 0.6354 | 0.9986 | 1.0000 | |
| | | 0.2 | 0.9549 | 0.9954 | 1.0000 | 1.0000 |
| | | 0.3 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| 50 | 0.2 | 0.2 | 0.0867 | 0.2345 | 0.4736 | |
| | | 0.3 | 0.1187 | 0.1755 | 0.3174 | 0.5459 |
| | | 0.4 | 0.3757 | 0.4284 | 0.5616 | 0.7326 |
| 100 | 0.2 | 0.2 | 0.1426 | 0.4624 | 0.8281 | |
| | | 0.3 | 0.2111 | 0.3362 | 0.6256 | 0.8935 |
| | | 0.4 | 0.6941 | 0.7737 | 0.8940 | 0.9763 |
| 200 | 0.2 | 0.2 | 0.2526 | 0.7978 | 0.9910 | |
| | | 0.3 | 0.4098 | 0.6332 | 0.9326 | 0.9972 |
| | | 0.4 | 0.9596 | 0.9771 | 0.9978 | 1.0000 |
| 500 | 0.2 | 0.2 | 0.6168 | 0.9978 | 1.0000 | |
| | | 0.3 | 0.8422 | 0.9719 | 1.0000 | 1.0000 |
| | | 0.4 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
Contingency table for testing homogeneity of Q = (1, 50, 60, 70, 80, 90) of urinary triclosan for females.
| Females | Bin 1 | Bin 2 (1.6, 7.2] | Bin 3 (7.2, 13.6] | Bin 4 (13.6, 23.9] | Bin 5 (23.9, 84.3] | Bin 6 (84.3, 258] | Bin 7 |
|---|---|---|---|---|---|---|---|
| Black Females | 58 (63.3) | 58 (46.2) | 23 (21.4) | 17 (21.8) | 21 (21.8) | 15 (21.8) | 26 (21.8) |
| White Females | 90 (84.7) | 50 (61.8) | 27 (28.6) | 34 (29.2) | 30 (29.2) | 36 (29.2) | 25 (29.2) |
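The values in parentheses appear to be expected counts under homogeneity, that is, row total × column total / grand total. The short check below recomputes them from the observed counts in the table above. The function name `expected_counts` is hypothetical, and the interpretation of the parenthesized values is an inference from the arithmetic rather than something stated in this excerpt.

```python
def expected_counts(rows):
    """Expected counts under homogeneity, rounded to one decimal place."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    return [[round(rt * ct / grand_total, 1) for ct in col_totals]
            for rt in row_totals]


# Observed counts copied from the urinary triclosan table for females.
observed = [
    [58, 58, 23, 17, 21, 15, 26],  # Black Females
    [90, 50, 27, 34, 30, 36, 25],  # White Females
]
for row in expected_counts(observed):
    print(row)
# prints:
# [63.3, 46.2, 21.4, 21.8, 21.8, 21.8, 21.8]
# [84.7, 61.8, 28.6, 29.2, 29.2, 29.2, 29.2]
```

The recomputed values agree with the parenthesized entries in both rows, which supports reading each cell as "observed (expected)".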
Contingency table for testing homogeneity of Q = (1, 50, 60, 70, 80, 90) of urinary triclosan for males.
| Males | Bin 1 | Bin 2 (1.6, 5.9] | Bin 3 (5.9, 9.5] | Bin 4 (9.5, 19.9] | Bin 5 (19.9, 53.4] | Bin 6 (53.4, 173] | Bin 7 |
|---|---|---|---|---|---|---|---|
| Black Males | 68 (68.5) | 52 (53.6) | 24 (22.5) | 24 (24.3) | 33 (23.9) | 25 (23.9) | 15 (24.3) |
| White Males | 84 (83.5) | 67 (65.4) | 26 (27.5) | 30 (29.7) | 20 (29.1) | 28 (29.1) | 39 (29.7) |
Contingency table for testing homogeneity of Q = (1, 50, 60, 70, 80, 90) of serum cotinine for females.
| Females | Bin 1 | Bin 2 (0.011, 0.04] | Bin 3 (0.04, 0.09] | Bin 4 (0.09, 0.57] | Bin 5 (0.57, 68.9] | Bin 6 (68.9, 234] | Bin 7 |
|---|---|---|---|---|---|---|---|
| Black Females | 134 (192) | 158 (152) | 86 (68) | 93 (69) | 93 (69) | 58 (69) | 64 (69) |
| White Females | 291 (233) | 179 (185) | 65 (83) | 59 (83) | 59 (83) | 94 (83) | 88 (83) |
Contingency table for testing homogeneity of Q = (1, 50, 60, 70, 80, 90) of serum cotinine for males.
| Males | Bin 1 | Bin 2 (0.011, 0.16] | Bin 3 (0.16, 1.93] | Bin 4 (1.93, 84.5] | Bin 5 (84.5, 202] | Bin 6 (202, 313] | Bin 7 |
|---|---|---|---|---|---|---|---|
| Black Males | 78 (123) | 201 (188) | 80 (62) | 81 (62) | 77 (62) | 47 (63) | 59 (62) |
| White Males | 217 (172) | 249 (262) | 69 (87) | 68 (87) | 72 (87) | 104 (88) | 88 (86) |
Figure 1. Cumulative distribution functions of log(serum cotinine) for black and white males, including dots for the respective 50th, 60th, 70th, 80th, and 90th sample percentiles. Vertical lines indicate the locations of the combined-sample percentiles.