E Amene, B Horn, R Pirie, R Lake, D Döpfer.
Abstract
BACKGROUND: Notification data for cases of disease are often compromised by incomplete or partial information on individual cases. To enhance the value of information from enteric disease notifications in New Zealand, this study explored the use of Bayesian and Multiple Imputation (MI) models to fill gaps in risk factor data. As a test case, overseas travel was examined as a risk factor for campylobacteriosis.
Keywords: Bayesian specification; Campylobacteriosis; Missing value; Multiple imputation
Year: 2016 PMID: 27600394 PMCID: PMC5011939 DOI: 10.1186/s12879-016-1784-8
Source DB: PubMed Journal: BMC Infect Dis ISSN: 1471-2334 Impact factor: 3.090
Total number of campylobacteriosis notifications in New Zealand residents, categorized by information on overseas travel (2000–2010)
| Travel status / Campylobacteriosis status | Confirmed | Probable | Under investigation | Unknown | Total |
|---|---|---|---|---|---|
| No | 41617 | 60 | 52 | 416 | 42145 |
| Unknown | 74481 | 110 | 222 | 1653 | 76466 |
| Yes | 3100 | 7 | 7 | 39 | 3153 |
| Total | 119198 | 177 | 281 | 2108 | 121764 |
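As a quick arithmetic check on the table above, the share of notifications with unknown travel status can be recomputed from the row totals. The following is a minimal Python sketch; the variable names are illustrative and do not come from the source.

```python
# Row totals copied from the table above (travel status -> total notifications).
totals = {"No": 42145, "Unknown": 76466, "Yes": 3153}

all_cases = sum(totals.values())

# Share of all notifications that lack travel information.
share_unknown = totals["Unknown"] / all_cases

# Among notifications with known travel status, the share reporting overseas travel.
known_cases = totals["No"] + totals["Yes"]
share_travel_known = totals["Yes"] / known_cases

print(f"all notifications:           {all_cases}")
print(f"unknown travel status:       {share_unknown:.1%}")
print(f"travel among known statuses: {share_travel_known:.1%}")
```

The first share is in line with the roughly 62 % of cases without travel information noted in the variable description, and the second with the national figure of about 7 % quoted in the Fig. 5 notes.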
Description of variables in the New Zealand campylobacteriosis notification and short term international travelers’ datasets (2000–2010)
| Variables | Details |
|---|---|
| Deprivation index | Categorical, 1–10 scale (1 = least deprived, 10 = most deprived) |
| Urban | Numeric, Proportion of DHB population under urban influence |
| DHB | Categorical, Residence District Health Board |
| Travel rate | Numeric, Residence DHB’s rate of short term international travel |
| Report date | Year of campylobacteriosis notification, 2000-2010 |
| Age | Four categories; <5, 5–19, 20–65 and 65+ Years |
| Sex | Two categories; Male and Female |
| Season | Four categories; Spring (Sep-Nov), Summer (Dec-Feb), Autumn (Mar-May) & Winter (Jun-Aug) |
| Overseas travel | Three categories; Yes, No, Unknown (62 % of the cases did not have travel information.) |
| Intervention | A binary indicator variable to identify before and after the 2006 poultry intervention period. |
Notes: Deprivation index, Urban, DHB and Travel Rate are DHB level variables, whereas Report Date, Age, Season, Overseas Travel and Intervention are measured at an individual case level
Fig. 4 Comparison of Bayesian and Multiple Imputation models regarding the mean and 95 % Credibility (Confidence) Intervals of regression coefficients for the 10 % (Fig. 4a), 50 % (Fig. 4b), 65 % (Fig. 4c) and 80 % (Fig. 4d) missing data categories, as compared to the complete data on overseas travel status of campylobacteriosis cases (n = 44,285). Notes: (1) *Deprivation index (scale 1–10, 1 = least deprived and 10 = most deprived District Health Board); **proportion of DHB population under urban influence; ***short term international travel per 100 residents of a DHB; ****a binary indicator variable to identify cases that were reported before or after the 2006 poultry intervention period. (2) Complete cases: regression coefficients estimated from campylobacteriosis notification data with complete information on overseas travel. (3) The error bars indicate the 95 % confidence intervals (in Multiple Imputation models) and 95 % Credibility Intervals (in Bayesian models) of the regression coefficients
Fig. 1 Distribution of campylobacteriosis notifications categorized by the status of overseas travel (upper panel) and the annual proportion of short term international travels (lower panel), in DHBs of New Zealand (2000–2010). Notes: Upper panel: campylobacteriosis notifications in 1000s is the sum of all cases notified between 2000 and 2010 in a given District Health Board; lower panel: Total travels/total population: the average number of outbound travels per year divided by the average population size per year between 2000 and 2010 for a given District Health Board
Fig. 2 Annual short term international travel and campylobacteriosis notifications of New Zealand residents (2000–2010). *Total notified cases: total number of campylobacteriosis cases notified between 2000 and 2010. **Observed travel associated cases: campylobacteriosis cases that had confirmed overseas travel during the incubation period of the disease. ***Total travels: total number of short term international travels between 2000 and 2010. Short term international travel is defined as international departures of New Zealand residents for an intended period of less than 12 months (Statistics New Zealand [www.stats.govt.nz])
Fig. 3 The proportion of campylobacteriosis notifications in New Zealand with known status of overseas travel information (2000–2010)
Summary of logistic regression analysis for variables predicting the missing indicator (1 = missing overseas travel information, 0 = otherwise) to test the validity of the Missing At Random assumption (n = 116,721)
| Coefficients | Estimate | Std. Error | Pr(>\|z\|) |
|---|---|---|---|
| (Intercept) | −8.757 | 0.089 | <0.001 |
| Urbanᵃ | 2.992 | 0.103 | <0.001 |
| DepIndexᵇ | 0.525 | 0.006 | <0.001 |
| Travel Rateᶜ | 0.081 | 0.001 | <0.001 |
| Age (5–19) | 0.154 | 0.027 | <0.001 |
| Age (20–59) | 0.033 | 0.023 | 0.145 |
| Age (60+) | −0.142 | 0.027 | <0.001 |
| Summer | 0.014 | 0.018 | 0.443 |
| Autumn | −0.002 | 0.021 | 0.94 |
| Winter | 0.035 | 0.021 | 0.085 |
| Male | 0.153 | 0.014 | <0.001 |
| Interventionᵈ | 0.345 | 0.016 | <0.001 |
Keys: ᵃ Proportion of DHB population under urban influence; ᵇ Deprivation index (scale 0–10, 0 being least deprived and 10 being most deprived DHB); ᶜ Short term international travel per 100 residents of a DHB; ᵈ A binary indicator variable to identify pre and post 2006 intervention. Age (<5), Spring, and Female sex are reference categories
Comparison of Brier Score and Area Under the Curve (AUC) between Bayesian and Multiple Imputation models for the prediction of overseas travel status of campylobacteriosis cases
| Accuracy measure | Complete dataᵃ: Frequentist | Complete dataᵃ: Bayesian | Missing dataᵇ: Multiple Imputation, 10 % | Missing dataᵇ: Multiple Imputation, 50 % | Missing dataᵇ: Multiple Imputation, 65 % | Missing dataᵇ: Multiple Imputation, 80 % | Missing dataᵇ: Bayesian, 10 % | Missing dataᵇ: Bayesian, 50 % | Missing dataᵇ: Bayesian, 65 % | Missing dataᵇ: Bayesian, 80 % |
|---|---|---|---|---|---|---|---|---|---|---|
| Brier Score | 0.062 | 0.062 | 0.067 | 0.24 | 0.18 | 0.19 | 0.062 | 0.063 | 0.062 | 0.063 |
| AUCᶜ | 0.67 | 0.67 | 0.64 | 0.49 | 0.42 | 0.49 | 0.67 | 0.67 | 0.65 | 0.64 |
ᵃ n = 44,285
ᵇ Four categories of artificially introduced missing data (10 %, 50 %, 65 % and 80 % missing overseas travel status)
ᶜ Area Under the Receiver Operating Characteristic Curve
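To make the two accuracy measures in the table above concrete, here is a minimal pure-Python sketch of how a Brier score and an AUC can be computed for binary predictions. The function names and the toy data are hypothetical and are not taken from the study; the source does not state which software was used for these metrics.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auc(y_true, p_pred):
    """AUC as the probability that a random positive is ranked above a random negative.

    Ties count as one half. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but slow for large datasets.
    """
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = overseas travel, 0 = no overseas travel.
y = [0, 0, 0, 0, 1, 0, 1, 0]
p = [0.05, 0.10, 0.02, 0.30, 0.40, 0.08, 0.25, 0.04]

toy_brier = brier_score(y, p)
toy_auc = auc(y, p)
print(f"Brier score: {toy_brier:.4f}, AUC: {toy_auc:.4f}")
```

Lower Brier scores indicate better calibrated predictions, while an AUC of 0.5 corresponds to no discrimination between travel and non-travel cases. For context, a constant prediction equal to the outcome prevalence π has a Brier score of π(1 − π), which is about 0.065 when π is 0.07.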
Summary of logistic regression coefficients for the original dataset containing missing observations (n = 116,721) and the Complete Cases dataset (n = 44,285) using Bayesian models
| Coefficients | Original datasetᵃ: Mean | Original datasetᵃ: 95 % CI lower | Original datasetᵃ: 95 % CI upper | Complete Casesᵇ: Mean | Complete Casesᵇ: 95 % CI lower | Complete Casesᵇ: 95 % CI upper |
|---|---|---|---|---|---|---|
| Intercept | −6.503 | −6.965 | −6.041 | −6.522 | −6.978 | −6.070 |
| Urbanᶜ | 0.804 | 0.231 | 1.377 | 0.834 | 0.297 | 1.414 |
| DepIndexᵈ | 0.091 | 0.063 | 0.119 | 0.091 | 0.063 | 0.120 |
| Travel Rateᵉ | 0.045 | 0.040 | 0.051 | 0.045 | 0.039 | 0.050 |
| Age (5–19) | 0.473 | 0.262 | 0.683 | 0.476 | 0.270 | 0.680 |
| Age (20–59) | 1.273 | 1.095 | 1.452 | 1.278 | 1.105 | 1.449 |
| Age (60+) | 0.885 | 0.688 | 1.082 | 0.889 | 0.697 | 1.080 |
| Summer | −0.393 | −0.491 | −0.294 | −0.393 | −0.491 | −0.297 |
| Autumn | −0.254 | −0.364 | −0.143 | −0.255 | −0.367 | −0.145 |
| Winter | 0.128 | 0.027 | 0.230 | 0.128 | 0.026 | 0.229 |
| Male | 0.015 | −0.060 | 0.090 | 0.015 | −0.059 | 0.089 |
| Interventionᶠ | 0.288 | 0.200 | 0.377 | 0.287 | 0.199 | 0.377 |
ᵃ All campylobacteriosis notifications available for analysis (n = 116,721); ᵇ campylobacteriosis notifications containing information on overseas travel status (n = 44,285); ᶜ proportion of DHB population under urban influence; ᵈ deprivation index (scale 0–10, 0 = least deprived and 10 = most deprived DHB); ᵉ short term international travel per 100 residents of a DHB; ᶠ a binary indicator variable to identify pre and post 2006 intervention. Age (<5), Spring, and Female sex are reference categories
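The coefficients in the table above are on the log-odds scale, so a short worked example may help in reading them. The sketch below plugs the posterior means from the "Original dataset" column into the inverse logit for one hypothetical case. The covariate values, the assumption that the intervention indicator is coded 1 for the post-intervention period, and all names are illustrative only. Using posterior means as plug-in values is a simplification and not the authors' prediction procedure.

```python
import math

# Posterior means copied from the "Original dataset" column of the table above.
coef = {
    "intercept": -6.503,
    "urban": 0.804,
    "dep_index": 0.091,
    "travel_rate": 0.045,
    "age_20_59": 1.273,
    "summer": -0.393,
    "male": 0.015,
    "intervention": 0.288,
}

# Hypothetical case; every value here is an assumption for illustration.
case = {
    "intercept": 1.0,
    "urban": 0.8,         # proportion of DHB population under urban influence
    "dep_index": 5.0,     # DHB deprivation index
    "travel_rate": 40.0,  # short term international travel per 100 residents
    "age_20_59": 1.0,     # age group 20-59 (reference: <5)
    "summer": 1.0,        # notified in summer (reference: spring)
    "male": 1.0,          # male (reference: female)
    "intervention": 1.0,  # assumed coding: 1 = after the 2006 intervention
}


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


linear_predictor = sum(coef[name] * case[name] for name in coef)
p_travel = inv_logit(linear_predictor)

# Odds ratio for age 20-59 relative to the reference group (<5 years).
odds_ratio_age_20_59 = math.exp(coef["age_20_59"])

print(f"linear predictor: {linear_predictor:.4f}")
print(f"predicted probability of overseas travel: {p_travel:.3f}")
print(f"odds ratio, age 20-59 vs <5: {odds_ratio_age_20_59:.2f}")
```

Exponentiating a coefficient gives the odds ratio for a one-unit change in that variable with the others held fixed, which is why the age and season rows can be read directly as contrasts against their reference categories.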
Fig. 5 The total number of campylobacteriosis notifications (upper panel) and the proportion of travel related cases predicted by the Bayesian model (lower panel) for each DHB of New Zealand (2008–2010). Notes: (1) Lower panel: proportion of travel related cases predicted by the Bayesian model. The error bars are 95 % Credibility Intervals of the proportion of overseas travel. (2) The dashed horizontal line is the proportion of travel related campylobacteriosis cases for which travel history is available nationally (7 %)