Christopher J Sroka, Haikady N Nagaraja.
Abstract
BACKGROUND: The odds ratio (OR) is an important metric for comparing two or more groups in many biomedical applications, where the data either measure the presence or absence of an event or represent the frequency of its occurrence. In the latter case, researchers often dichotomize the count data into binary form and apply the well-known logistic regression technique to estimate the OR. Dichotomizing the data, however, discards information about the underlying counts, which can reduce the precision of inferences on the OR.
Keywords: Binary data; Confidence intervals; Count data; Fisher information; Maximum likelihood
Year: 2018 PMID: 30342488 PMCID: PMC6195979 DOI: 10.1186/s12874-018-0568-9
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
The r.v. Z follows the Bernoulli distribution with mean μ = p and variance v(μ) = p(1 − p) = μ(1 − μ). Let θ denote the odds of a positive count; that is, θ = p/(1 − p).
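As a worked illustration (not part of the source text), the Fisher information about θ carried by the binary indicator Z can be derived from the standard Bernoulli result. The sketch assumes θ = p/(1 − p) and borrows the FI(Z;θ) notation from the figure caption; the reparametrization step is the usual change-of-variable rule for Fisher information.

```latex
\begin{align*}
\theta &= \frac{p}{1-p}, \qquad p = \frac{\theta}{1+\theta}, \qquad
\frac{dp}{d\theta} = \frac{1}{(1+\theta)^2},\\[4pt]
\mathrm{FI}(Z;p) &= \frac{1}{p(1-p)} = \frac{(1+\theta)^2}{\theta},\\[4pt]
\mathrm{FI}(Z;\theta) &= \mathrm{FI}(Z;p)\left(\frac{dp}{d\theta}\right)^{2}
= \frac{1}{\theta\,(1+\theta)^{2}}.
\end{align*}
```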
OR estimates and 95% confidence intervals from geometric and logistic regression models of physician visits for the German Socio-Economic Panel data
| Covariate | Geometric estimate | Geometric 95% CI | Logistic estimate | Logistic 95% CI | Relative width^e |
|---|---|---|---|---|---|
| Post-reform^a | 0.87 | (0.79, 0.96) | 0.82 | (0.68, 0.99) | 57% |
| Bad health^b | 3.13 | (2.71, 3.63) | 3.28 | (2.24, 4.82) | 36% |
| Education (10.5–12 years)^c | 1.09 | (0.95, 1.24) | 1.19 | (0.94, 1.51) | 50% |
| Education (HS graduate +)^c | 0.97 | (0.84, 1.11) | 1.32 | (1.03, 1.70) | 40% |
| Age (40–49 years)^d | 1.05 | (0.93, 1.19) | 0.92 | (0.73, 1.16) | 62% |
| Age (50–60 years)^d | 1.21 | (1.05, 1.39) | 0.99 | (0.76, 1.29) | 64% |
| Log household income | 1.13 | (0.99, 1.30) | 1.26 | (0.98, 1.62) | 48% |

Reference categories: ^a Pre-reform; ^b Good health; ^c 7–10 years of education; ^d Age 20–39 years. ^e Geometric compared to logistic model.
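The relative-width column can be roughly reproduced from the rounded interval bounds in the table. The sketch below assumes that "relative width" means the width of the geometric interval divided by the width of the logistic interval on the OR scale; the excerpt does not state the definition, and `relative_width` and `rows` are illustrative names. Because the inputs are rounded to two decimals, the recomputed values only approximate the published percentages.

```python
# Assumption: relative width = (geometric CI width) / (logistic CI width),
# both on the odds-ratio scale. Bounds are the rounded values from the table.

def relative_width(count_ci, logistic_ci):
    """Ratio of the count-model CI width to the logistic CI width."""
    return (count_ci[1] - count_ci[0]) / (logistic_ci[1] - logistic_ci[0])

# covariate: (geometric CI, logistic CI, published relative width)
rows = {
    "Post-reform":               ((0.79, 0.96), (0.68, 0.99), 0.57),
    "Bad health":                ((2.71, 3.63), (2.24, 4.82), 0.36),
    "Education (10.5-12 years)": ((0.95, 1.24), (0.94, 1.51), 0.50),
    "Education (HS graduate +)": ((0.84, 1.11), (1.03, 1.70), 0.40),
    "Age (40-49 years)":         ((0.93, 1.19), (0.73, 1.16), 0.62),
    "Age (50-60 years)":         ((1.05, 1.39), (0.76, 1.29), 0.64),
    "Log household income":      ((0.99, 1.30), (0.98, 1.62), 0.48),
}

for name, (geo_ci, logit_ci, published) in rows.items():
    recomputed = relative_width(geo_ci, logit_ci)
    print(f"{name:28s} recomputed {recomputed:6.1%}   published {published:4.0%}")
```

A ratio below 100% indicates that the count-model interval is narrower than the logistic one for that covariate.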
OR estimates and 95% confidence intervals from Poisson and logistic regression models of physician visits for the Australian Health Survey data
| Covariate | Poisson estimate | Poisson 95% CI | Logistic estimate | Logistic 95% CI | Relative width^d |
|---|---|---|---|---|---|
| Female | 1.28 | (1.11, 1.47) | 1.30 | (1.11, 1.53) | 85.85% |
| Age^a | 1.67 | (1.10, 2.52) | 1.71 | (1.06, 2.74) | 84.71% |
| Income^b | 0.82 | (0.66, 1.01) | 0.95 | (0.75, 1.20) | 76.49% |
| Private insurance^c | 1.17 | (0.98, 1.39) | 1.30 | (1.07, 1.59) | 78.51% |
| Free government insurance (low income)^c | 0.55 | (0.36, 0.85) | 0.50 | (0.30, 0.84) | 89.64% |
| Free government insurance (old age, disability, veteran)^c | 1.20 | (0.95, 1.53) | 1.53 | (1.16, 2.01) | 68.28% |
| Number of illnesses in past two weeks | 1.30 | (1.24, 1.36) | 1.32 | (1.25, 1.39) | 85.36% |
| Number of days of reduced activity in past two weeks | 1.24 | (1.21, 1.26) | 1.17 | (1.14, 1.20) | 78.17% |
| General health questionnaire score | 1.05 | (1.02, 1.08) | 1.06 | (1.03, 1.10) | 83.68% |
| Has chronic condition that limits activity | 1.17 | (0.98, 1.41) | 1.19 | (0.96, 1.48) | 84.20% |

^a Age in years divided by 100. ^b Annual income in tens of thousands of dollars. ^c Reference category: government Medibank insurance. ^d Poisson compared to logistic model.
Fig. 1 Comparison of information gains across count models. The figure shows FI(Y;θ)/FI(Z;θ) as a function of p = Pr(Y > 0) for various count models. As the probability of a positive count increases, the relative FI increases. The gains are largest for models with the most dispersion in the counts.
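The pattern described in the caption can be illustrated for two common count models. The sketch below is not taken from the source: it assumes Y is either Poisson with mean λ (so p = 1 − exp(−λ)) or geometric on {0, 1, 2, ...} with Pr(Y = 0) = 1 − p, and uses closed forms obtained by reparametrizing each model's Fisher information in terms of the odds θ = p/(1 − p). Under those assumptions the ratio FI(Y;θ)/FI(Z;θ) works out to p / ((1 − p)(−log(1 − p))) for the Poisson model and 1/(1 − p) for the geometric model. The function names are illustrative.

```python
import math

def relative_fi_poisson(p):
    """FI(Y;theta)/FI(Z;theta) for Y ~ Poisson(lam), with p = 1 - exp(-lam).

    Assumed derivation: FI(Y;lam) = 1/lam and lam = log(1 + theta), so the
    ratio simplifies to theta / lam.
    """
    lam = -math.log1p(-p)        # lam = -log(1 - p)
    theta = p / (1.0 - p)        # odds of a positive count
    return theta / lam

def relative_fi_geometric(p):
    """FI(Y;theta)/FI(Z;theta) for geometric Y with Pr(Y = 0) = 1 - p.

    Assumed derivation: FI(Y;pi) = 1/(pi^2 (1 - pi)) with pi = 1 - p, which
    gives a ratio of 1/pi = 1/(1 - p).
    """
    return 1.0 / (1.0 - p)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f}  Poisson: {relative_fi_poisson(p):6.3f}  "
          f"geometric: {relative_fi_geometric(p):6.3f}")
```

Both ratios approach 1 as p tends to 0 and grow with p, and the geometric ratio (the more dispersed of the two models) is the larger one at every p, which agrees with the behaviour the caption describes.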