Aidan G O'Keeffe, Daniel M Farewell, Brian D M Tom, Vernon T Farewell.
Abstract
In longitudinal randomised trials and observational studies within a medical context, a composite outcome, which is a function of several individual patient-specific outcomes, may be felt to best represent the outcome of interest. As in other contexts, missing data on patient outcome, due to patient drop-out or for other reasons, may pose a problem. Multiple imputation is a widely used method for handling missing data, but its use for composite outcomes has been seldom discussed. Whilst standard multiple imputation methodology can be used directly for the composite outcome, the distribution of a composite outcome may be of a complicated form and perhaps not amenable to statistical modelling. We compare direct multiple imputation of a composite outcome with separate imputation of the components of a composite outcome. We consider two imputation approaches. One approach involves modelling each component of a composite outcome using standard likelihood-based models. The other approach is to use linear increments methods. A linear increments approach can provide an appealing alternative as assumptions concerning both the missingness structure within the data and the imputation models are different from the standard likelihood-based approach. We compare both approaches using simulation studies and data from a randomised trial on early rheumatoid arthritis patients. Results suggest that both approaches are comparable and that for each, separate imputation offers some improvement on the direct imputation of a composite outcome.
Keywords: Composite outcome; Linear increments; Longitudinal data; Missing data; Multiple imputation
Year: 2016 PMID: 27729945 PMCID: PMC5035329 DOI: 10.1007/s12561-016-9146-z
Source DB: PubMed Journal: Stat Biosci ISSN: 1867-1764
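The contrast the abstract draws between direct imputation of a composite outcome and separate imputation of its components can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the simulated data, the two-component composite defined by `composite`, the completely-at-random missingness, and the helper names are all hypothetical. The sketch also skips drawing the model parameters from their posterior, so it is a simplified (improper) form of multiple imputation, and it imputes the two components independently of each other.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=500):
    """Hypothetical data: two baseline scores and two follow-up scores."""
    base = rng.normal(size=(n, 2))
    follow = 0.7 * base + rng.normal(scale=0.7, size=(n, 2))
    missing = rng.random(n) < 0.2  # roughly 20% missing at follow-up
    return base, follow, missing

def composite(follow):
    """Illustrative binary composite: 1 if both components are below zero."""
    return (follow < 0).all(axis=1).astype(int)

def impute_component(x, y, obs):
    """Stochastic linear-regression imputation of a single component."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    resid_sd = np.std(y[obs] - X[obs] @ beta, ddof=X.shape[1])
    y_imp = y.copy()
    y_imp[~obs] = X[~obs] @ beta + rng.normal(scale=resid_sd, size=(~obs).sum())
    return y_imp

def fit_logistic(X, z, iters=25):
    """Logistic regression fitted by plain Newton-Raphson iterations."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        beta += np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (z - p))
    return beta

def separate_estimate(base, follow, missing, m=10):
    """Impute each component, then recompute the composite (m imputations)."""
    obs = ~missing
    est = []
    for _ in range(m):
        imp = np.column_stack(
            [impute_component(base, follow[:, j], obs) for j in range(2)]
        )
        est.append(composite(imp).mean())
    return float(np.mean(est))

def direct_estimate(base, follow, missing, m=10):
    """Impute the binary composite itself from a logistic model (m imputations)."""
    obs = ~missing
    z = composite(follow).astype(float)
    X = np.column_stack([np.ones(len(base)), base])
    beta = fit_logistic(X[obs], z[obs])
    p_miss = 1 / (1 + np.exp(-X[~obs] @ beta))
    est = []
    for _ in range(m):
        z_imp = z.copy()
        z_imp[~obs] = rng.random(p_miss.size) < p_miss  # Bernoulli draws
        est.append(z_imp.mean())
    return float(np.mean(est))

base, follow, missing = simulate()
true_p = float(composite(follow).mean())
sep_p = separate_estimate(base, follow, missing)
dir_p = direct_estimate(base, follow, missing)
print(f"true={true_p:.3f} separate={sep_p:.3f} direct={dir_p:.3f}")
```

In this toy setting both routes should land close to the proportion computed from the full data; the sketch only shows the mechanics of the two strategies and says nothing about their relative performance in the paper's simulations.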
Table showing the composition of the missing portions of each dataset with respect to treatment group
No. of patients exhibiting missing outcomes in datasets A–E:

| Treatment group | A | B | C | D | E |
|---|---|---|---|---|---|
| None | 16 | 17 | 17 | 17 | 16 |
| CSP | 16 | 15 | 15 | 15 | 15 |
| PDN | 17 | 17 | 17 | 17 | 17 |
| Both | 18 | 18 | 18 | 18 | 18 |
| Total | 67 | 67 | 67 | 67 | 66 |
Each dataset (A–E) features approximately 20% of outcomes missing at 24 months (and, in the second scenario, at 12, 18 and 24 months)
Table showing the average linear predictor estimates, together with associated standard errors, and estimates of for the different treatment groups using both the additive and interaction estimation models
| Treatment group | MLE | LI | DIRECT | TRUE Data |
|---|---|---|---|---|
| Additive Model: Average Linear Predictor Estimate (Standard error) | | | | |
| None | | | | |
| CSP | | | | |
| PDN | | | | |
| Both | | | | |
| Additive Model: Average Estimate of | | | | |
| None | 0.361 | 0.356 | 0.368 | 0.374 |
| CSP | 0.406 | 0.405 | 0.424 | 0.420 |
| PDN | 0.384 | 0.374 | 0.382 | 0.388 |
| Both | 0.430 | 0.425 | 0.439 | 0.434 |
| Interaction Model: Average Linear Predictor Estimate (Standard error) | | | | |
| None | | | | |
| CSP | | | | |
| PDN | | | | |
| Both | | | | |
| Interaction Model: Average Estimate of | | | | |
| None | 0.408 | 0.402 | 0.416 | 0.422 |
| CSP | 0.355 | 0.355 | 0.371 | 0.368 |
| PDN | 0.338 | 0.329 | 0.335 | 0.341 |
| Both | 0.473 | 0.467 | 0.483 | 0.478 |
Results are shown where multiple imputation was performed for all outcomes using maximum likelihood estimation (MLE), for all outcomes using linear increments (LI), direct imputation of using ML (DIRECT) and estimates produced using data prior to the application of a missingness structure (TRUE Data). Missing data occurred at 24 months only
Table summarising the differences between the imputed values and the true for each imputation method, for those cases where outcomes were missing at 24 months across the ten multiple imputation runs
| Difference | MLE | LI | DIRECT |
|---|---|---|---|
| -1 | 547 | 614 | 581 |
| 0 | 2387 | 2344 | 2189 |
| +1 | 406 | 382 | 570 |
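A table of this kind can be produced by tabulating imputed-minus-true differences and then summarised as the share of imputed values that match the truth. The sketch below assumes the outcome is binary (so differences can only be -1, 0 or +1) and that the unlabelled first row of counts corresponds to a difference of -1; the helper `tabulate_differences` and the toy vectors are hypothetical, while the per-method counts are copied from the table above.

```python
from collections import Counter

def tabulate_differences(imputed, true):
    """Count imputed-minus-true differences for a binary outcome."""
    counts = Counter(i - t for i, t in zip(imputed, true))
    return {d: counts.get(d, 0) for d in (-1, 0, 1)}

# Toy example with made-up values, only to show the mechanics.
toy_true = [1, 0, 1, 1, 0, 0]
toy_imputed = [1, 1, 0, 1, 0, 1]
toy_table = tabulate_differences(toy_imputed, toy_true)

# Counts from the table (missingness at 24 months only),
# ordered as (assumed -1 row, 0 row, +1 row).
table_counts = {
    "MLE": (547, 2387, 406),
    "LI": (614, 2344, 382),
    "DIRECT": (581, 2189, 570),
}

# Proportion of imputed values equal to the true value, per method.
agreement = {method: c[1] / sum(c) for method, c in table_counts.items()}

for method, share in agreement.items():
    print(f"{method}: {share:.3f}")
```

On these counts, exact agreement is about 0.715 for MLE, 0.702 for LI and 0.655 for DIRECT, which is one way of reading the table's pattern of separate imputation matching the truth somewhat more often than direct imputation.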
Fig. 1 Histograms of the variables at 24 months used to calculate . ‘TRUE’ denotes the true values, ‘MLE’ denotes values imputed by ML-based models and ‘LI’ denotes values imputed using linear increments models. Missingness was simulated at 24 months only prior to multiple imputation
Table showing the average linear predictor estimates, together with associated standard errors, and estimates of for the different treatment groups using both the additive and interaction estimation models
| Treatment group | MLE | LI | DIRECT | TRUE Data |
|---|---|---|---|---|
| Additive Model: average linear predictor estimate (standard error) | | | | |
| None | | | | |
| CSP | | | | |
| PDN | | | | |
| Both | | | | |
| Additive Model: average estimate of | | | | |
| None | 0.375 | 0.358 | 0.372 | 0.374 |
| CSP | 0.399 | 0.401 | 0.416 | 0.420 |
| PDN | 0.403 | 0.373 | 0.384 | 0.388 |
| Both | 0.427 | 0.417 | 0.429 | 0.434 |
| Interaction Model: average linear predictor estimate (standard error) | | | | |
| None | | | | |
| CSP | | | | |
| PDN | | | | |
| Both | | | | |
| Interaction Model: average estimate of | | | | |
| None | 0.419 | 0.399 | 0.422 | 0.422 |
| CSP | 0.352 | 0.356 | 0.361 | 0.368 |
| PDN | 0.360 | 0.333 | 0.335 | 0.341 |
| Both | 0.467 | 0.455 | 0.475 | 0.478 |
Results are shown where multiple imputation was performed for all outcomes using maximum likelihood estimation (MLE), for all outcomes using linear increments (LI), direct imputation of using ML (DIRECT) and estimates produced using data prior to the application of a missingness structure (TRUE Data). Missing data occurred at 12, 18 and 24 months
Table summarising the differences between the imputed values and the true for each imputation method, for those cases where outcomes were missing at 12, 18 and 24 months across the ten multiple imputation runs
| Difference | MLE | LI | DIRECT |
|---|---|---|---|
| -1 | 651 | 787 | 788 |
| 0 | 2120 | 2053 | 1826 |
| +1 | 569 | 506 | 726 |
Fig. 2 Histograms of the variables at 24 months used to calculate . ‘TRUE’ denotes the true values, ‘MLE’ denotes values imputed by ML-based models and ‘LI’ denotes values imputed using linear increments models. Missingness was simulated at 12, 18 and 24 months, prior to multiple imputation
Table showing the average linear predictor estimates, together with associated standard errors, and estimates of for the different treatment groups using both the additive and interaction estimation models
| Group | MLE | LI | CHAINED | DIRECT | COMPLETE |
|---|---|---|---|---|---|
| Additive Model: linear predictor estimate (standard error) | | | | | |
| None | | | | | |
| CSP | | | | | |
| PDN | | | | | |
| Both | | | | | |
| Additive Model: estimate of | | | | | |
| None | 0.353 | 0.349 | 0.356 | 0.365 | 0.368 |
| CSP | 0.382 | 0.381 | 0.389 | 0.406 | 0.413 |
| PDN | 0.375 | 0.382 | 0.367 | 0.382 | 0.376 |
| Both | 0.405 | 0.400 | 0.416 | 0.424 | 0.421 |
| Interaction Model: linear predictor estimate (standard error) | | | | | |
| None | | | | | |
| CSP | | | | | |
| PDN | | | | | |
| Both | | | | | |
| Interaction Model: estimate of | | | | | |
| None | 0.391 | 0.379 | 0.395 | 0.393 | 0.402 |
| CSP | 0.344 | 0.351 | 0.351 | 0.379 | 0.378 |
| PDN | 0.335 | 0.336 | 0.343 | 0.354 | 0.344 |
| Both | 0.444 | 0.430 | 0.454 | 0.452 | 0.454 |
Multiple imputation has been used to predict actual missing values from the CARDERA trial. Results are shown where multiple imputation was performed for all outcomes using maximum likelihood estimation (MLE), for all outcomes using linear increments (LI), for all outcomes using a chained equations approach (CHAINED) and via direct imputation of using maximum likelihood (DIRECT). As a comparison, results using complete cases only (COMPLETE) are also shown
Fig. 3 Plot of sample mean estimates of at each time point (6, 12, 18 and 24 months). Results are shown where multiple imputation has been performed using the ML method (MLE), linear increments method (LI), chained equations method (CHAINED) and direct imputation method (DIRECT). Results for the complete cases data (Complete) are also shown. Separate plots are shown for each of the four trial arms