
Influence of pattern of missing data on performance of imputation methods: an example using national data on drug injection in prisons.

Saiedeh Haji-Maghsoudi, Ali-Akbar Haghdoost, Azam Rastegari, Mohammad Reza Baneshi.

Abstract

BACKGROUND: Policy makers need models that can identify groups at high risk of HIV infection. National data sets frequently contain incomplete records and dirty data, and missing data complicate model development. Several studies have suggested that imputation methods perform acceptably when the missing rate is moderate. One issue that has received less attention, and is addressed here, is the role of the pattern of missing data.
METHODS: We used data on 2720 prisoners. Results from fitting a regression model to the complete data served as the gold standard. Missing data were then generated so that 10%, 20% and 50% of data were lost. In scenario 1, we generated missing values, at these rates, in a single variable that was significant in the gold-standard model (age). In scenario 2, a small proportion of each independent variable was dropped. Four imputation methods were compared, under different events per variable (EPV) values, in terms of selection of important variables and parameter estimation.
RESULTS: In scenario 2, bias in the estimates was low and all methods for handling missing data performed similarly. All methods, at all missing rates, detected the significance of age. In scenario 1, bias in the estimates increased, particularly at the 50% missing rate. Here, at EPVs of 10 and 5, the imputation methods failed to capture the effect of age.
CONCLUSION: In scenario 2, all imputation methods, at all missing rates, detected age as significant. This was not the case in scenario 1. Our results show that the performance of imputation methods depends on the pattern of missing data.
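
The two missing-data patterns described in the methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic data, and the reading of scenario 2 (each incomplete record loses exactly one value, in a randomly chosen variable) are assumptions made for illustration. The EPV helper likewise assumes the usual definition, the number of outcome events divided by the number of candidate variables.

```python
import numpy as np


def mask_single_variable(X, column, rate, rng):
    """Scenario 1 (sketch): blank out `rate` of the rows in one column only."""
    X_missing = X.astype(float).copy()
    n_rows = X_missing.shape[0]
    n_drop = int(round(rate * n_rows))
    rows = rng.choice(n_rows, size=n_drop, replace=False)
    X_missing[rows, column] = np.nan
    return X_missing


def mask_spread_across_variables(X, rate, rng):
    """Scenario 2 (sketch, assumed reading): `rate` of the rows each lose
    one value, with the affected column drawn uniformly at random."""
    X_missing = X.astype(float).copy()
    n_rows, n_cols = X_missing.shape
    n_drop = int(round(rate * n_rows))
    rows = rng.choice(n_rows, size=n_drop, replace=False)
    cols = rng.integers(0, n_cols, size=n_drop)
    X_missing[rows, cols] = np.nan
    return X_missing


def events_per_variable(y, n_candidate_variables):
    """EPV (assumed definition): events (y == 1) per candidate variable."""
    return float(np.sum(np.asarray(y) == 1)) / n_candidate_variables


# Hypothetical synthetic covariates; column 0 stands in for "age".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

for rate in (0.10, 0.20, 0.50):
    s1 = mask_single_variable(X, column=0, rate=rate, rng=rng)
    s2 = mask_spread_across_variables(X, rate=rate, rng=rng)
    # Fraction of missing values per column under each pattern.
    print(rate, np.isnan(s1).mean(axis=0), np.isnan(s2).mean(axis=0))
```

Under these assumptions both scenarios leave the same share of records incomplete, but scenario 1 concentrates the loss in one variable while scenario 2 thins each variable only slightly, which is the contrast the abstract attributes the difference in performance to.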


Keywords:  Drug Injection; Expectation Maximum Algorithm; MICE; Missing Data; National Data

Year:  2013        PMID: 24596839      PMCID: PMC3937937          DOI: 10.15171/ijhpm.2013.11

Source DB:  PubMed          Journal:  Int J Health Policy Manag        ISSN: 2322-5939


