Missing value imputation on missing completely at random data using multilayer perceptrons.

Esther-Lydia Silva-Ramírez, Rafael Pino-Mejías, Manuel López-Coello, María-Dolores Cubiles-de-la-Vega.

Abstract

Data mining relies on data files that often contain errors in the form of missing values. This paper presents a methodological framework for developing an automated data imputation model based on artificial neural networks. Fifteen real and simulated data sets, ranging in size from 47 to 1389 records, are subjected to a perturbation experiment in which missing values are generated at random, with the probability of a missing value set to 0.05. Several architectures and learning algorithms for the multilayer perceptron are tested and compared with three classic imputation procedures: mean/mode imputation, regression and hot-deck. The results, across different performance measures, suggest not only that this approach improves the quality of a database with missing values, but also that the best results are clearly obtained with the multilayer perceptron model in data sets with categorical variables. Three learning rules (Levenberg-Marquardt, BFGS Quasi-Newton and Conjugate Gradient Fletcher-Reeves Update) and a small number of hidden nodes are recommended.
Copyright © 2010 Elsevier Ltd. All rights reserved.
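
The perturbation experiment and the mean/mode baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `perturb_mcar` and `impute_mean_mode`, the list-of-rows data layout, and the use of `None` as the missing-value marker are all assumptions made for illustration, and the multilayer perceptron imputer itself is not shown.

```python
import random
from collections import Counter
from statistics import mean


def perturb_mcar(rows, p=0.05, seed=0):
    """Return a copy of rows in which each cell is independently
    replaced by None with probability p (missing completely at random).
    Hypothetical helper; p=0.05 matches the probability in the abstract."""
    rng = random.Random(seed)
    return [[None if rng.random() < p else v for v in row] for row in rows]


def impute_mean_mode(rows):
    """Fill None cells with the column mean (numeric columns) or the
    column mode (categorical columns), using observed values only.
    Hypothetical helper illustrating the classic mean/mode baseline."""
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            fills.append(None)  # nothing observed: leave the column missing
        elif all(isinstance(v, (int, float)) for v in observed):
            fills.append(mean(observed))
        else:
            fills.append(Counter(observed).most_common(1)[0][0])
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]
```

For example, a toy data set of 47 records (the smallest size mentioned in the abstract) could be perturbed with `perturb_mcar(data, p=0.05)` and then passed to `impute_mean_mode`; comparing the imputed cells against the original values gives the kind of error measure on which a learned imputer would be evaluated.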

Year:  2010        PMID: 20875726     DOI: 10.1016/j.neunet.2010.09.008

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  3 in total

1.  An efficient ensemble method for missing value imputation in microarray gene expression data.

Authors:  Xinshan Zhu; Jiayu Wang; Biao Sun; Chao Ren; Ting Yang; Jie Ding
Journal:  BMC Bioinformatics       Date:  2021-04-13       Impact factor: 3.169

2.  Explainable machine learning for knee osteoarthritis diagnosis based on a novel fuzzy feature selection methodology.

Authors:  Christos Kokkotis; Charis Ntakolia; Serafeim Moustakidis; Giannis Giakas; Dimitrios Tsaopoulos
Journal:  Phys Eng Sci Med       Date:  2022-01-31

3.  Kernel Sparse Representation with Hybrid Regularization for On-Road Traffic Sensor Data Imputation.

Authors:  Xiaobo Chen; Cheng Chen; Yingfeng Cai; Hai Wang; Qiaolin Ye
Journal:  Sensors (Basel)       Date:  2018-08-31       Impact factor: 3.576
