
Multiple Imputation via Generative Adversarial Network for High-dimensional Blockwise Missing Value Problems.

Zongyu Dai, Zhiqi Bu, Qi Long.

Abstract

Missing data are present in most real-world problems and need careful handling to preserve prediction accuracy and statistical consistency in downstream analysis. As the gold standard for handling missing data, multiple imputation (MI) methods have been proposed to account for imputation uncertainty and provide proper statistical inference. In this work, we propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (specifically, GAN-based) multiple imputation method that can work under the missing at random (MAR) mechanism with theoretical support. MI-GAN leverages recent progress in conditional generative adversarial networks and shows strong performance, matching existing state-of-the-art imputation methods on high-dimensional datasets in terms of imputation error. In particular, MI-GAN significantly outperforms other imputation methods in terms of statistical inference and computational speed.
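
For readers unfamiliar with how multiple imputation accounts for imputation uncertainty, the sketch below shows the standard pooling step (Rubin's rules) that combines results from several imputed datasets. It is not taken from the MI-GAN paper and does not implement MI-GAN itself; the function name `pool_rubin` and the example numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative helper).

    estimates: point estimates of one parameter, one per imputed dataset.
    variances: the corresponding squared standard errors.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")

    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (m - 1)

    # The between-imputation term carries the uncertainty due to missing data.
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up numbers standing in for the same coefficient fitted on three imputed datasets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

The total variance exceeds the average within-imputation variance whenever the imputed datasets disagree, which is why a single imputation tends to understate uncertainty.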


Keywords:  GAN; missing at random; missing data imputation; multiple imputation; neural network

Year:  2021        PMID: 35169788      PMCID: PMC8841955          DOI: 10.1109/icmla52953.2021.00131

Source DB:  PubMed          Journal:  Proc Int Conf Mach Learn Appl


