Victoria Gamerman, Matthew Guerra, Justine Shults.
Abstract
This manuscript implements a maximum likelihood (ML) based approach for equally spaced longitudinal count data with over-dispersion, in which the variance of the outcome variable is larger than expected under the assumed Poisson distribution. We apply the proposed method to seizure data and to a subset of the German Socio-Economic Panel data. To demonstrate the importance of correctly modeling the over-dispersion, we compare it with the semi-parametric generalized estimating equations (GEE) approach, which incorrectly ignores any over-dispersion in the data. Our simulations demonstrate that accounting for over-dispersion results in improved small-sample efficiency and appropriate coverage probabilities. We also provide code in R so that readers can implement our approach in their own analyses.
Keywords: Count data; First-order antedependence; Generalized estimating equations; Markov property; Maximum likelihood estimation; Over-dispersion
Year: 2016 PMID: 27933230 PMCID: PMC5101264 DOI: 10.1186/s40064-016-3564-8
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
Estimated parameters from the ML, GEE, and Poisson models in the analysis of the doctor visits data
| Parameter | Estimate | SE | Wald | Pr(>|W|) |
|---|---|---|---|---|
| ML approach | ||||
| (Intercept) | −0.461 | 0.2811 | 2.69 | 0.1008 |
| Reform | −0.113 | 0.0241 | 21.99 | <0.0001 |
| Age | 0.005 | 0.0014 | 12.22 | 0.0005 |
| Education | −0.008 | 0.0064 | 1.54 | 0.2153 |
| Marital status | 0.026 | 0.0294 | 0.75 | 0.3855 |
| Health status | 1.100 | 0.0313 | 1238.28 | <0.0001 |
| Log income | 0.150 | 0.0376 | 15.83 | <0.0001 |
| Correlation parameters | ||||
| Alpha | 0.313 | 0.0208 | ||
| GEE approach | ||||
| (Intercept) | −0.381 | 0.5766 | 0.44 | 0.5083 |
| Reform | −0.123 | 0.0529 | 5.40 | 0.0202 |
| Age | 0.005 | 0.0033 | 2.44 | 0.1182 |
| Education | −0.009 | 0.0118 | 0.61 | 0.4349 |
| Marital status | 0.038 | 0.0698 | 0.30 | 0.5822 |
| Health status | 1.105 | 0.0873 | 160.23 | <0.0001 |
| Log income | 0.139 | 0.0798 | 3.05 | 0.0809 |
| Correlation parameters | ||||
| Alpha | 0.213 | 0.0238 | ||
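The Wald and Pr(>|W|) columns above appear to follow the usual one-degree-of-freedom Wald chi-square test, W = (estimate / SE)². The sketch below recomputes one row under that assumption; the function name `wald_test` is illustrative and is not taken from the authors' R code.

```python
import math


def wald_test(estimate, se):
    """Return (W, p) for a 1-df Wald chi-square test of H0: parameter = 0."""
    z = estimate / se
    w = z * z
    # For 1 df, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p


# GEE row for Reform, with estimate and SE copied from the table above.
w, p = wald_test(-0.123, 0.0529)
print(f"W = {w:.2f}, p = {p:.4f}")
```

Because the tabulated estimate and SE are rounded, the recomputed values agree with the table only up to rounding.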
Mean and variance for the placebo and treatment groups
| Variable | Placebo (n = 28) | Drug (n = 31) | Total (n = 59) |
|---|---|---|---|
| Y1 | 9.36 (102.8) | 8.58 (332.7) | 8.95 (220.2) |
| Y2 | 8.29 (66.6) | 8.42 (140.7) | 8.36 (103.8) |
| Y3 | 8.79 (215.2) | 8.13 (192.9) | 8.44 (200.2) |
| Y4 | 7.96 (58.2) | 6.71 (126.8) | 7.31 (93.1) |
| Baseline | 30.79 (681.2) | 31.61 (782.9) | 31.22 (722.5) |
| Age | 29.00 (36.0) | 27.74 (43.6) | 28.34 (39.7) |
Values in the table represent the mean (variance)
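Under a Poisson model the variance equals the mean, so the ratio variance / mean (often called the dispersion index) should be close to 1. A minimal sketch of that check, using the Total column of the table above, is shown here; the variable names are illustrative.

```python
# (mean, variance) pairs for the seizure counts, copied from the Total column.
total_summary = {
    "Y1": (8.95, 220.2),
    "Y2": (8.36, 103.8),
    "Y3": (8.44, 200.2),
    "Y4": (7.31, 93.1),
}

# Dispersion index: variance divided by mean; about 1 for Poisson data.
dispersion_index = {
    period: variance / mean for period, (mean, variance) in total_summary.items()
}

for period, index in dispersion_index.items():
    print(f"{period}: variance/mean = {index:.1f}")
```

Every ratio is well above 1, which is the over-dispersion that motivates the ML approach.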
Estimated parameters from the GEE and ML approaches for analysis of the epilepsy data when Period is included in the models
| Parameter | Estimate | SE | Wald | Pr(>|W|) |
|---|---|---|---|---|
| ML approach ( | ||||
| (Intercept) | 0.6569 | 0.1958 | 11.26 | 0.0008 |
| Treatment | −0.1668 | 0.0667 | 6.26 | 0.0124 |
| Baseline | 0.0232 | 0.0007 | 1111.24 | <0.0001 |
| Age | 0.0238 | 0.0056 | 17.94 | <0.0001 |
| Period | −0.0634 | 0.0215 | 8.72 | 0.0032 |
| Correlation parameters | ||||
| Alpha | 0.416 | 0.0334 | ||
| GEE approach | ||||
| (Intercept) | 0.5855 | 0.3491 | 2.81 | 0.0936 |
| Treatment | −0.1642 | 0.1589 | 1.07 | 0.3014 |
| Baseline | 0.0232 | 0.0012 | 350.97 | <0.0001 |
| Age | 0.0263 | 0.0118 | 4.95 | 0.0261 |
| Period | −0.0644 | 0.0340 | 3.59 | 0.0580 |
| Correlation parameters | ||||
| Alpha | 0.551 | 0.0656 | ||
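To read the Treatment coefficient on the scale of seizure rates, one can exponentiate it. The sketch below assumes a log link (the usual choice for Poisson-type regression, but not stated in the tables) and a normal-approximation 95% interval; `rate_ratio_ci` is an illustrative helper, not part of the authors' code.

```python
import math


def rate_ratio_ci(estimate, se, z=1.96):
    """Rate ratio exp(estimate) with a Wald-type interval, assuming a log link."""
    ratio = math.exp(estimate)
    lower = math.exp(estimate - z * se)
    upper = math.exp(estimate + z * se)
    return ratio, lower, upper


# Treatment rows copied from the table above (Period included in the model).
ml = rate_ratio_ci(-0.1668, 0.0667)
gee = rate_ratio_ci(-0.1642, 0.1589)

for label, (ratio, lower, upper) in (("ML", ml), ("GEE", gee)):
    print(f"{label}: rate ratio {ratio:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

The two point estimates are nearly identical, but the ML interval excludes 1 while the wider GEE interval does not, which matches the tabulated p-values (0.0124 versus 0.3014).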
Estimated parameters from the GEE and ML approaches for analysis of the epilepsy data when Period is not included in the models
| Parameter | Estimate | SE | Wald | Pr(>|W|) |
|---|---|---|---|---|
| ML approach ( | ||||
| (Intercept) | 0.5072 | 0.1894 | 7.17 | 0.0074 |
| Treatment | −0.1673 | 0.0667 | 6.30 | 0.0121 |
| Baseline | 0.0232 | 0.0007 | 1113.57 | <0.0001 |
| Age | 0.0238 | 0.0056 | 17.99 | <0.0001 |
| Correlation parameters | ||||
| Alpha | 0.423 | 0.0342 | ||
| GEE approach | ||||
| (Intercept) | 0.4467 | 0.3621 | 1.52 | 0.2174 |
| Treatment | −0.1659 | 0.1593 | 1.09 | 0.2977 |
| Baseline | 0.0232 | 0.0012 | 353.32 | <0.0001 |
| Age | 0.0258 | 0.0117 | 4.86 | 0.0275 |
| Correlation parameters | ||||
| Alpha | 0.544 | 0.0639 | ||
Small sample efficiencies for evaluating the AR(1) correlation structure for varying values of α and sample size per group
| n | α | R* | | | | | |
|---|---|---|---|---|---|---|---|
| Mean squared error using ML | |||||||
| 60 | 0.2 | 1000 | 0.056 | 0.355 | 0.297 | 0.291 | 0.609 |
| | 0.4 | 1000 | 0.088 | 0.503 | 0.427 | 0.445 | 0.505 |
| | 0.6 | 1000 | 0.127 | 0.803 | 0.642 | 0.619 | 0.308 |
| | 0.7 | 998 | 0.132 | 0.908 | 0.716 | 0.656 | 0.171 |
| 120 | 0.2 | 1000 | 0.029 | 0.176 | 0.138 | 0.137 | 0.305 |
| | 0.4 | 1000 | 0.040 | 0.254 | 0.203 | 0.194 | 0.236 |
| | 0.6 | 1000 | 0.054 | 0.381 | 0.291 | 0.294 | 0.124 |
| | 0.7 | 1000 | 0.067 | 0.489 | 0.349 | 0.325 | 0.067 |
| 300 | 0.2 | 1000 | 0.010 | 0.071 | 0.057 | 0.054 | 0.111 |
| | 0.4 | 1000 | 0.016 | 0.101 | 0.084 | 0.078 | 0.080 |
| | 0.6 | 1000 | 0.025 | 0.153 | 0.121 | 0.118 | 0.047 |
| | 0.7 | 1000 | 0.029 | 0.174 | 0.144 | 0.140 | 0.023 |
| Mean squared error using GEE | |||||||
| 60 | 0.2 | 1000 | 0.057 | 0.355 | 0.300 | 0.290 | 0.668 |
| | 0.4 | 1000 | 0.089 | 0.516 | 0.427 | 0.450 | 0.701 |
| | 0.6 | 1000 | 0.137 | 0.852 | 0.703 | 0.653 | 0.571 |
| | 0.7 | 1000 | 0.160 | 1.133 | 0.883 | 0.795 | 0.424 |
| 120 | 0.2 | 1000 | 0.029 | 0.176 | 0.139 | 0.138 | 0.340 |
| | 0.4 | 1000 | 0.040 | 0.260 | 0.204 | 0.198 | 0.334 |
| | 0.6 | 1000 | 0.062 | 0.415 | 0.327 | 0.325 | 0.240 |
| | 0.7 | 1000 | 0.083 | 0.595 | 0.435 | 0.402 | 0.178 |
| 300 | 0.2 | 1000 | 0.010 | 0.072 | 0.058 | 0.054 | 0.128 |
| | 0.4 | 1000 | 0.017 | 0.103 | 0.085 | 0.079 | 0.124 |
| | 0.6 | 1000 | 0.027 | 0.162 | 0.132 | 0.129 | 0.093 |
| | 0.7 | 1000 | 0.036 | 0.211 | 0.182 | 0.176 | 0.066 |
The true correlation structure is AR(1)
There are equal sample sizes of per group and ; [1] True value by a factor of ; [2] True value by a factor of
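One common way to summarize tables like this is the ratio of mean squared errors, MSE(GEE) / MSE(ML), where values above 1 favor ML. The sketch below applies that ratio to the rightmost column for a sample size of 60 per group; treating this ratio as the efficiency measure is an assumption made here for illustration.

```python
# Rightmost-column MSE values for 60 subjects per group, copied from the table.
alphas = [0.2, 0.4, 0.6, 0.7]
mse_ml = [0.609, 0.505, 0.308, 0.171]
mse_gee = [0.668, 0.701, 0.571, 0.424]

# Relative efficiency of ML versus GEE: larger than 1 means ML has lower MSE.
relative_efficiency = [gee / ml for gee, ml in zip(mse_gee, mse_ml)]

for alpha, ratio in zip(alphas, relative_efficiency):
    print(f"alpha = {alpha}: MSE(GEE) / MSE(ML) = {ratio:.2f}")
```

For this column the ratio grows steadily with the correlation, so the advantage of ML is largest when the correlation is strongest.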
Percent bias for evaluating the AR(1) correlation structure for varying values of α and sample size per group
| n | α | R* | | | | | |
|---|---|---|---|---|---|---|---|
| Percent bias using ML | |||||||
| 60 | 0.2 | 1000 | 2.57 | 0.53 | −0.61 | −0.53 | 9.41 |
| | 0.4 | 1000 | 6.33 | −0.42 | −1.15 | −2.31 | 5.15 |
| | 0.6 | 1000 | 1.95 | 0.05 | −0.95 | 0.27 | 2.65 |
| | 0.7 | 998 | −2.21 | 2.71 | 0.72 | 1.21 | 0.69 |
| 120 | 0.2 | 1000 | −0.04 | 0.14 | 0.07 | 0.20 | 5.30 |
| | 0.4 | 1000 | 2.25 | −0.52 | −0.57 | −0.65 | 2.74 |
| | 0.6 | 1000 | 0.43 | −0.79 | 0.08 | −0.13 | 1.39 |
| | 0.7 | 1000 | 2.00 | 0.15 | −0.01 | −1.04 | 0.17 |
| 300 | 0.2 | 1000 | 0.68 | −0.18 | −0.57 | 0.22 | 2.83 |
| | 0.4 | 1000 | 0.85 | −0.29 | −0.16 | −0.25 | 1.31 |
| | 0.6 | 1000 | 1.91 | −0.38 | −0.30 | −0.75 | 0.53 |
| | 0.7 | 1000 | 1.47 | −0.29 | −0.14 | −0.63 | −0.03 |
| Percent bias using GEE | |||||||
| 60 | 0.2 | 1000 | 2.48 | 0.54 | −0.60 | −0.49 | 10.94 |
| | 0.4 | 1000 | 6.26 | −0.34 | −1.10 | −2.29 | 6.06 |
| | 0.6 | 1000 | 1.88 | 0.64 | −0.90 | 0.45 | 4.86 |
| | 0.7 | 1000 | 0.60 | 1.87 | −0.28 | 0.84 | 4.60 |
| 120 | 0.2 | 1000 | −0.22 | 0.07 | 0.13 | 0.24 | 6.19 |
| | 0.4 | 1000 | 1.95 | −0.51 | −0.40 | −0.64 | 2.89 |
| | 0.6 | 1000 | 0.16 | −1.05 | 0.34 | −0.21 | 2.18 |
| | 0.7 | 1000 | 1.74 | −0.22 | 0.14 | −0.83 | 2.55 |
| 300 | 0.2 | 1000 | 0.65 | −0.23 | −0.57 | 0.23 | 2.87 |
| | 0.4 | 1000 | 0.72 | −0.32 | −0.15 | −0.18 | 0.98 |
| | 0.6 | 1000 | 2.03 | −0.33 | −0.23 | −0.88 | 0.86 |
| | 0.7 | 1000 | 2.83 | −0.20 | −0.58 | −0.91 | 1.64 |
The true correlation structure is AR(1)
There are equal sample sizes of per group and =
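For readers unfamiliar with the AR(1) structure named in the note above, the correlation between measurements at times j and k is α raised to the power |j − k|. The following sketch builds that matrix; the choice of four time points and α = 0.6 is purely illustrative and is not taken from the simulation design.

```python
def ar1_correlation(alpha, n_times):
    """Return the n_times x n_times AR(1) matrix with entries alpha ** |j - k|."""
    return [
        [alpha ** abs(j - k) for k in range(n_times)]
        for j in range(n_times)
    ]


# Hypothetical setting: four equally spaced measurements, alpha = 0.6.
matrix = ar1_correlation(0.6, 4)
for row in matrix:
    print(" ".join(f"{value:.3f}" for value in row))
```

The correlation decays geometrically with the time separation, so a single parameter α describes the whole matrix.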
Coverage probabilities for the ML and GEE approaches with the AR(1) correlation structure for varying values of α and sample size per group
| n | α | Method | R | Coverage Probability (%) | ||||
|---|---|---|---|---|---|---|---|---|
| 60 | 0.2 | ML | 1000 | 94.7 | 95.2 | 95.5 | 95.5 | 93.8 |
| | | GEE | 1000 | 94.4 | 95.0 | 94.8 | 95.1 | 91.1 |
| | 0.4 | ML | 1000 | 93.8 | 94.6 | 95.9 | 93.0 | 94.6 |
| | | GEE | 1000 | 93.2 | 94.3 | 95.5 | 92.7 | 86.1 |
| | 0.6 | ML | 1000 | 93.8 | 93.9 | 94.3 | 94.0 | 93.4 |
| | | GEE | 1000 | 94.1 | 93.6 | 95.1 | 93.1 | 83.2 |
| | 0.7 | ML | 998 | 95.4 | 95.3 | 95.4 | 95.5 | 92.3 |
| | | GEE | 1000 | 95.0 | 94.9 | 94.0 | 95.7 | 84.6 |
| 120 | 0.2 | ML | 1000 | 94.7 | 95.2 | 95.2 | 94.8 | 92.9 |
| | | GEE | 1000 | 94.2 | 95.1 | 94.9 | 94.5 | 91.3 |
| | 0.4 | ML | 1000 | 95.1 | 96.1 | 95.6 | 94.7 | 95.1 |
| | | GEE | 1000 | 95.2 | 96.0 | 95.5 | 94.5 | 85.4 |
| | 0.6 | ML | 1000 | 95.9 | 94.5 | 95.3 | 94.9 | 95.5 |
| | | GEE | 1000 | 95.5 | 95.5 | 95.5 | 94.9 | 84.5 |
| | 0.7 | ML | 1000 | 95.3 | 94.2 | 94.7 | 96.2 | 92.9 |
| | | GEE | 1000 | 95.3 | 94.2 | 95.0 | 95.9 | 87.2 |
| 300 | 0.2 | ML | 1000 | 95.2 | 95.0 | 94.7 | 94.7 | 94.5 |
| | | GEE | 1000 | 95.6 | 95.3 | 94.8 | 94.6 | 91.5 |
| | 0.4 | ML | 1000 | 93.5 | 95.4 | 94.2 | 93.9 | 96.5 |
| | | GEE | 1000 | 93.7 | 96.0 | 94.9 | 94.3 | 86.2 |
| | 0.6 | ML | 1000 | 93.2 | 95.4 | 94.9 | 94.0 | 95.2 |
| | | GEE | 1000 | 93.8 | 95.6 | 94.6 | 94.9 | 85.9 |
| | 0.7 | ML | 1000 | 94.5 | 95.1 | 94.1 | 94.4 | 92.4 |
| | | GEE | 1000 | 94.8 | 95.9 | 94.6 | 94.8 | 88.0 |
The true correlation structure is AR(1)
There are equal sample sizes of per group and =
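When judging whether a simulated coverage differs from the nominal level, it helps to account for Monte Carlo error. The sketch below assumes a nominal level of 95% and 1000 replicates, and uses the binomial standard error to form a rough tolerance band; the function name `coverage_band` and the 1.96 multiplier are illustrative choices.

```python
import math


def coverage_band(nominal=0.95, replicates=1000, z=1.96):
    """Approximate range (in percent) for simulated coverage at the nominal level."""
    se = math.sqrt(nominal * (1.0 - nominal) / replicates)
    return 100.0 * (nominal - z * se), 100.0 * (nominal + z * se)


low, high = coverage_band()
print(f"Expected range under nominal coverage: {low:.2f}% to {high:.2f}%")

# Rightmost-column coverages for 60 subjects per group, copied from the table.
ml_last = [93.8, 94.6, 93.4, 92.3]
gee_last = [91.1, 86.1, 83.2, 84.6]
print("ML below band: ", [v for v in ml_last if v < low])
print("GEE below band:", [v for v in gee_last if v < low])
```

Under these assumptions every GEE value in that column falls below the band, while the ML values lie inside it or much closer to it.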