Dan Jackson, Dan Mason, Ian R White, Stephen Sutton.
Abstract
BACKGROUND: Missing outcome data are very common in smoking cessation trials. It is often assumed that all such missing data are from participants who have been unsuccessful in giving up smoking ("missing=smoking"). Here we use data from a recent Internet-based smoking cessation trial to investigate which of a set of a priori chosen baseline variables are predictive of missingness, and the evidence for and against the "missing=smoking" assumption.
Year: 2012 PMID: 23067272 PMCID: PMC3507670 DOI: 10.1186/1471-2288-12-157
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
The pattern of missingness for the primary trial outcome Y
| Outcome | Treatment | Control | Total |
|---|---|---|---|
| Not abstained (Y=0) | 271 | 289 | 560 |
| Abstained (Y=1) | 80 | 82 | 162 |
| Missing | 526 | 510 | 1036 |
| Total | 877 | 881 | 1758 |
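As a minimal sketch of what the "missing=smoking" assumption does to the counts in the missingness table above, the Python below compares the abstinence proportion among participants with an observed outcome against the proportion obtained when every missing outcome is counted as smoking. The arms are keyed by their column totals (the covariate table reports 877 treated participants); the function and key names are illustrative and do not come from the paper.

```python
# Counts copied from the missingness table, one entry per trial arm.
# Keys are illustrative: arms are identified by their column totals (877 and 881).
counts = {
    "arm_n877": {"not_abstained": 271, "abstained": 80, "missing": 526},
    "arm_n881": {"not_abstained": 289, "abstained": 82, "missing": 510},
}


def abstinence_rates(c):
    """Return the missing fraction and two abstinence proportions for one arm."""
    observed = c["not_abstained"] + c["abstained"]
    total = observed + c["missing"]
    return {
        # Share of participants whose outcome Y is missing.
        "missing_fraction": c["missing"] / total,
        # Abstinence among observed outcomes only (complete-case analysis).
        "complete_case": c["abstained"] / observed,
        # Abstinence when every missing outcome is treated as smoking.
        "missing_equals_smoking": c["abstained"] / total,
    }


for arm, c in counts.items():
    rates = abstinence_rates(c)
    print(arm, {name: round(value, 3) for name, value in rates.items()})
```

The gap between the two proportions in each arm shows how strongly the headline abstinence rate depends on how the missing outcomes are handled.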
Baseline covariates
| Covariate | Type | Smoking related | Description | Summary |
|---|---|---|---|---|
| Treatment | Binary | No | Indicator for treatment group | 877/1758 treated |
| Age | Continuous | No | Age in years | 38 (11) |
| Sex | Binary | No | Indicator for a female participant | 1126/1758 female |
| Qualifications | Categorical | No | Educational qualifications: 1=None; 2=GCSE; 3=A-level; 4=Undergraduate Degree; 5=Postgraduate Degree | 3.06 (1.16) |
| Deprivation | Categorical | No | Deprivation score, range 0-5, higher indicates more deprived | 1.21 (1.09) |
| Conscientiousness | Continuous | No | Conscientiousness score, range 1-5, higher indicates more conscientiousness (takes values between 1-5, in steps of 0.25, and so is not truly continuous) | 3.31 (0.84) |
| Determination | Categorical | Yes | Determination to quit: 1=not at all; 5=extremely | 4.30 (0.75) |
| Support | Categorical | Yes | Does the participant feel supported by family and friends: 1=not at all; 5=extremely | 3.31 (1.23) |
| Dependence | Categorical | Yes | Cigarette dependence score, range 1-8, higher indicates greater dependence | 5.48 (1.57) |
| Previous | Binary | Yes | Indicator for not having managed to quit previously | 907/1758 have |
Summary statistics are shown: means and standard deviations (in parentheses) are given for the continuous and categorical variables, and counts are given for the binary variables.
The outcome Y by the number of contact attempts and trial arm
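| Contact attempts | Participants | Outcome observed | Abstained (Y=1) among observed |
|---|---|---|---|
| One phone call | 217 | 214 | 50/214=23.4% |
| One phone call and an email | 925 | 69 | 30/69=43.5% |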
| Two phone calls | 134 | 132 | 23/132=17.4% |
| Two phone calls and an email | 6 | 1 | 0/1=0% |
| Three phone calls | 93 | 89 | 16/89=18.0% |
| Three phone calls and an email | 3 | 0 | 0/0 |
| Four phone calls | 59 | 57 | 10/57=17.5% |
| Four phone calls and an email | 4 | 0 | 0/0 |
| Five phone calls | 48 | 45 | 9/45=20.0% |
| Five phone calls and an email | 8 | 0 | 0/0 |
| Six phone calls | 19 | 19 | 1/19=5.3% |
| Six phone calls and an email | 2 | 1 | 1/1=100% |
| Seven phone calls | 38 | 38 | 5/38=13.2% |
| Seven phone calls and an email | 2 | 0 | 0/0 |
| Eight phone calls | 17 | 16 | 3/16=18.8% |
| Eight phone calls and an email | 5 | 1 | 1/1=100% |
| Nine phone calls | 11 | 10 | 2/10=20.0% |
| Nine phone calls and an email | 2 | 0 | 0/0 |
| Ten phone calls | 19 | 19 | 6/19=31.6% |
| Ten phone calls and an email | 146 | 11 | 5/11=45.5% |
The fraction and percentage of participants who successfully quit smoking (Y=1) are tabulated by the number of contact attempts (telephone calls and email). Participants received up to ten telephone calls and up to one email attempt.
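The tabulation above can be re-derived from the raw counts. The sketch below assumes that the first numeric column is the number of participants at each contact level and the second is the number whose outcome was observed; this reading is an interpretation, supported by the fact that the columns sum to 1758 participants, 722 observed outcomes and 162 abstainers, which agrees with the missingness table. All names in the code are illustrative.

```python
# Rows copied from the contact-attempts table:
# (phone calls, email sent, participants, outcomes observed, abstained among observed)
rows = [
    (1, False, 217, 214, 50), (1, True, 925, 69, 30),
    (2, False, 134, 132, 23), (2, True, 6, 1, 0),
    (3, False, 93, 89, 16),   (3, True, 3, 0, 0),
    (4, False, 59, 57, 10),   (4, True, 4, 0, 0),
    (5, False, 48, 45, 9),    (5, True, 8, 0, 0),
    (6, False, 19, 19, 1),    (6, True, 2, 1, 1),
    (7, False, 38, 38, 5),    (7, True, 2, 0, 0),
    (8, False, 17, 16, 3),    (8, True, 5, 1, 1),
    (9, False, 11, 10, 2),    (9, True, 2, 0, 0),
    (10, False, 19, 19, 6),   (10, True, 146, 11, 5),
]


def quit_fraction(abstained, observed):
    """Proportion with Y=1 among observed outcomes; None when nothing was observed (0/0)."""
    return abstained / observed if observed else None


def totals(selected):
    """Sum participants, observed outcomes and abstainers over the selected rows."""
    return tuple(sum(row[i] for row in selected) for i in (2, 3, 4))


all_n, all_obs, all_quit = totals(rows)
phone_only = totals([row for row in rows if not row[1]])
with_email = totals([row for row in rows if row[1]])

print("all rows:", all_n, all_obs, all_quit)
print("phone only:", phone_only, quit_fraction(phone_only[2], phone_only[1]))
print("with email:", with_email, quit_fraction(with_email[2], with_email[1]))
```

Splitting the rows by whether an email was sent is one simple way to see how the response rate differs between the two contact routes.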
The results from the sensitivity analysis
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| -0.073(0.163) | -0.072(0.149) | -0.081(0.127) | -0.088(0.106) | -0.091(0.099) | -0.094(0.102) | -0.095(0.104) | -0.097(0.106) | -0.097(0.107) | |
| 0.005(0.172) | 0.036(0.157) | 0.070(0.133) | 0.102(0.111) | 0.131(0.104) | 0.152(0.107) | 0.165(0.110) | 0.171(0.111) | 0.174(0.112) | |
| 0.072(0.077) | 0.073(0.070) | 0.072(0.061) | 0.071(0.051) | 0.072(0.047) | 0.073(0.049) | 0.071(0.050) | 0.069(0.050) | 0.067(0.050) | |
| -0.066(0.051) | -0.036(0.053) | -0.024(0.054) | -0.020(0.054) | -0.018(0.054) |||||
| -0.105(0.060) | -0.084(0.062) | -0.074(0.063) | -0.069(0.064) | -0.067(0.064) | |||||
| 0.153(0.087) | 0.089(0.073) | 0.034(0.067) | 0.001(0.069) | -0.015(0.071) | -0.022(0.071) | -0.024(0.072) | |||
| 0.131(0.067) | 0.069(0.041) | 0.057(0.042) | 0.051(0.043) | 0.049(0.044) | 0.048(0.044) | ||||
| 0.059(0.059) | 0.055(0.052) | 0.039(0.043) | 0.016(0.036) | -0.001(0.034) | -0.011(0.035) | -0.015(0.035) | -0.017(0.036) | -0.018(0.036) | |
| -0.020(0.169) | -0.041(0.153) | -0.084(0.131) | -0.138(0.109) | -0.183(0.103) |
The coefficients β1,1 to β1,10 describe the effect of each of the ten baseline covariates in Table 2. The tabulated P = P(Y = 1|R = 0,x) are obtained from equation (4) with logit(P(Y = 1|R = 1,x)) = logit(0.22) and the corresponding value of β2. Estimates that are statistically significant at the 5% level are shown in bold; standard errors are in parentheses.
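Equation (4) is not reproduced in this record, so the following is only a sketch under a stated assumption: that the missing data model is logistic, logit P(R = 1|Y,x) = β0 + β1x + β2Y, with R = 1 meaning the outcome is observed. Under that assumption Bayes' rule gives logit P(Y = 1|R = 0,x) = logit P(Y = 1|R = 1,x) - β2, so each value of β2 corresponds to one value of P. The β2 values in the loop are illustrative and are not the grid used in the table; the function names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def p_quit_given_missing(p_quit_given_observed, beta2):
    """P(Y=1 | R=0) implied by P(Y=1 | R=1) and beta2.

    ASSUMPTION: logit P(R=1 | Y, x) is linear in Y with coefficient beta2,
    where R=1 means the outcome was observed.
    """
    return expit(logit(p_quit_given_observed) - beta2)


def beta2_for(p_quit_given_observed, p_quit_given_missing_target):
    """beta2 that yields a chosen P(Y=1 | R=0) under the same assumption."""
    return logit(p_quit_given_observed) - logit(p_quit_given_missing_target)


# Illustrative beta2 values only (not taken from the paper).
for beta2 in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(beta2, round(p_quit_given_missing(0.22, beta2), 4))
```

In this sketch β2 = 0 reproduces P = 0.22, so missing and observed participants quit at the same rate, while increasingly large β2 pushes P towards 0, which is the "missing=smoking" limit.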
Estimates from the full selection model
| Estimate | Standard error |
|---|---|
| -0.095 | 0.103 |
| 0.026 | 0.006 |
| 0.160 | 0.118 |
| 0.072 | 0.049 |
| -0.028 | 0.069 |
| -0.077 | 0.073 |
| -0.010 | 0.084 |
| 0.053 | 0.046 |
| -0.014 | 0.037 |
| -0.219 | 0.114 |
| 1.561 | 3.535 |
Despite our reservations, and the criticisms in the literature of attempting this, the full MNAR selection model is identifiable and resulted in the following parameter estimates. See the section "Fitting the full model" for a discussion of the difficulties in fitting the MNAR model.
Estimates from the model for the repeated attempts
| Estimate | Standard error | Estimate | Standard error |
|---|---|---|---|
| -0.099 | 0.081 | -0.103 | 0.081 |
| 0.021 | 0.004 | 0.021 | 0.004 |
| 0.103 | 0.086 | 0.110 | 0.087 |
| 0.034 | 0.040 | 0.033 | 0.040 |
| -0.026 | 0.043 | -0.018 | 0.044 |
| -0.009 | 0.050 | -0.001 | 0.050 |
| -0.002 | 0.055 | -0.008 | 0.055 |
| 0.096 | 0.034 | 0.093 | 0.034 |
| 0.020 | 0.027 | 0.018 | 0.028 |
| -0.072 | 0.085 | -0.076 | 0.085 |
| - | - | 0.215 | 0.222 |
The model for the repeated attempts incorporates more data, and hence makes more assumptions, but provides much more satisfactory estimation of β2, and hence the role of the outcome Y in the missing data model, than the selection model.
Further estimates of β2 from repeated attempts modelling
| Covariates | Participants | Estimate of β2 (standard error) |
|---|---|---|
| Smoking related | All | 0.176 (0.228) |
| Non-Smoking related | All | 0.258 (0.217) |
| None | All | 0.245 (0.220) |
| All | Younger (36 and under) | -0.013 (0.606) |
| All | Older (37 and over) | 0.333 (0.258) |
| All (except sex) | Men | -0.302 (0.738) |
| All (except sex) | Women | 0.359 (0.240) |
| All | Those who receive 5 calls or less | 0.296 (0.165) |
| All plus the email attempt depends on the number of failed telephone calls | All | 0.217 (0.221) |
Estimates of β2 are shown, with standard errors in parentheses, when omitting particular combinations of covariates and participants.
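To help read these estimates against their standard errors, the sketch below computes a Wald z statistic, a two-sided p-value and an approximate 95% confidence interval for each row. It assumes a normal approximation for the estimator of β2, which is a simplification for illustration and not necessarily the inference method used by the authors; the row labels are copied from the table and the function name is hypothetical.

```python
from statistics import NormalDist

# (covariates, participants, estimate of beta2, standard error) copied from the table.
fits = [
    ("Smoking related", "All", 0.176, 0.228),
    ("Non-Smoking related", "All", 0.258, 0.217),
    ("None", "All", 0.245, 0.220),
    ("All", "Younger (36 and under)", -0.013, 0.606),
    ("All", "Older (37 and over)", 0.333, 0.258),
    ("All (except sex)", "Men", -0.302, 0.738),
    ("All (except sex)", "Women", 0.359, 0.240),
    ("All", "Those who receive 5 calls or less", 0.296, 0.165),
    ("All plus the email attempt depends on the number of failed telephone calls",
     "All", 0.217, 0.221),
]


def wald_summary(estimate, se, level=0.95):
    """Wald z, two-sided p-value and confidence interval (normal approximation)."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(0.5 + level / 2.0)
    z = estimate / se
    p_value = 2.0 * (1.0 - normal.cdf(abs(z)))
    return z, p_value, (estimate - z_crit * se, estimate + z_crit * se)


for covariates, participants, estimate, se in fits:
    z, p_value, (low, high) = wald_summary(estimate, se)
    print(f"{participants:35s} z={z:6.2f} p={p_value:.3f} CI=({low:.3f}, {high:.3f})")
```

With these inputs every approximate 95% interval includes zero, so under this approximation none of the tabulated estimates of β2 differs significantly from zero at the 5% level.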