
DNA microarray data imputation and significance analysis of differential expression.

Rebecka Jörnsten, Hui-Yu Wang, William J Welsh, Ming Ouyang.

Abstract

MOTIVATION: Significance analysis of differential expression in DNA microarray data is an important task. Much of the current research is focused on developing improved tests and software tools. The task is difficult not only owing to the high dimensionality of the data (number of genes), but also because of the often non-negligible presence of missing values. There is thus a great need to reliably impute these missing values prior to the statistical analyses. Many imputation methods have been developed for DNA microarray data, but their impact on statistical analyses has not been well studied. In this work we examine how missing values and their imputation affect significance analysis of differential expression.
RESULTS: We develop a new imputation method (LinCmb) that is superior to the widely used methods in terms of normalized root mean squared error. Its estimates are convex combinations of the estimates of existing methods. We find that LinCmb adapts to the structure of the data: if the data are heterogeneous or if there are few missing values, LinCmb puts more weight on local imputation methods; if the data are homogeneous or if there are many missing values, LinCmb puts more weight on global imputation methods. Thus, LinCmb is a useful tool for understanding the merits of different imputation methods. We also demonstrate that missing values affect significance analysis. The simulations employ two datasets, different amounts of missing values, different imputation methods, the standard t-test, the regularized t-test and ANOVA. We conclude that good imputation alleviates the impact of missing values and should be an integral part of microarray data analysis. The most competitive methods are LinCmb, GMC and BPCA. Popular imputation schemes such as SVD, row mean, and KNN all exhibit high variance and poor performance. The regularized t-test is less affected by missing values than the standard t-test.
AVAILABILITY: Matlab code is available on request from the authors.
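
The RESULTS paragraph describes LinCmb estimates as convex combinations of the estimates of existing methods, judged by normalized root mean squared error. The following Python sketch illustrates that idea only; it is not the authors' Matlab code. It assumes just two component methods (one "local", one "global"), assumes the weight is fitted by least squares on entries whose true values are known, and assumes NRMSE is normalized by the standard deviation of the true values. The function names and the toy numbers are hypothetical.

```python
import math


def nrmse(estimates, truth):
    """RMSE divided by the standard deviation of the true values.

    The normalization is an assumption; the abstract does not define it.
    """
    n = len(truth)
    mse = sum((e - t) ** 2 for e, t in zip(estimates, truth)) / n
    mean_t = sum(truth) / n
    var_t = sum((t - mean_t) ** 2 for t in truth) / n
    return math.sqrt(mse / var_t)


def fit_convex_weight(local_est, global_est, truth):
    """Least-squares weight w in [0, 1] for w * local + (1 - w) * global.

    For two methods the unconstrained optimum has a closed form, and
    clipping it to [0, 1] gives the constrained optimum because the
    objective is a convex quadratic in w.
    """
    num = sum((a - b) * (t - b) for a, b, t in zip(local_est, global_est, truth))
    den = sum((a - b) ** 2 for a, b in zip(local_est, global_est))
    if den == 0.0:
        return 0.5  # both methods agree everywhere, so any weight fits equally
    return min(1.0, max(0.0, num / den))


def combine(local_est, global_est, w):
    """Convex combination of two sets of imputed values."""
    return [w * a + (1.0 - w) * b for a, b in zip(local_est, global_est)]


# Toy, made-up values for entries that were hidden on purpose.
truth = [1.0, 2.0, 3.0, 4.0]
local_est = [1.2, 2.2, 3.2, 4.2]   # hypothetical local method, biased high
global_est = [0.8, 1.8, 2.8, 3.8]  # hypothetical global method, biased low

w = fit_convex_weight(local_est, global_est, truth)
combined = combine(local_est, global_est, w)

print("weight on local method:", w)
print("NRMSE local:   ", nrmse(local_est, truth))
print("NRMSE global:  ", nrmse(global_est, truth))
print("NRMSE combined:", nrmse(combined, truth))
```

In this toy case the two methods err in opposite directions, so the fitted weight sits near the middle and the combination has a lower NRMSE than either component. With more than two component methods the same fit becomes a least-squares problem over weights constrained to be non-negative and to sum to one, which needs a constrained solver rather than a closed form.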

Year:  2005        PMID: 16118262     DOI: 10.1093/bioinformatics/bti638

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  27 in total

1.  Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.

Authors:  Carmen D Tekwe; Raymond J Carroll; Alan R Dabney
Journal:  Bioinformatics       Date:  2012-05-24       Impact factor: 6.937

2.  Biological impact of missing-value imputation on downstream analyses of gene expression profiles.

Authors:  Sunghee Oh; Dongwan D Kang; Guy N Brock; George C Tseng
Journal:  Bioinformatics       Date:  2010-11-02       Impact factor: 6.937

3.  How to improve postgenomic knowledge discovery using imputation.

Authors:  Muhammad Shoaib B Sehgal; Iqbal Gondal; Laurence S Dooley; Ross Coppel
Journal:  EURASIP J Bioinform Syst Biol       Date:  2009-02-08

4.  Impact of missing value imputation on classification for DNA microarray gene expression data--a model-based study.

Authors:  Youting Sun; Ulisses Braga-Neto; Edward R Dougherty
Journal:  EURASIP J Bioinform Syst Biol       Date:  2010-03-02

5.  Data Imputation in Epistatic MAPs by Network-Guided Matrix Completion.

Authors:  Marinka Žitnik; Blaž Zupan
Journal:  J Comput Biol       Date:  2015-02-06       Impact factor: 1.479

6.  Incorporating Nonlinear Relationships in Microarray Missing Value Imputation.

Authors:  Tianwei Yu; Hesen Peng; Wei Sun
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2011 May-Jun       Impact factor: 3.710

7.  A study on the predictability of acute lymphoblastic leukaemia response to treatment using a hybrid oncosimulator.

Authors:  Eleftherios Ouzounoglou; Eleni Kolokotroni; Martin Stanulla; Georgios S Stamatakos
Journal:  Interface Focus       Date:  2017-12-15       Impact factor: 3.906

8.  Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes.

Authors:  Guy N Brock; John R Shaffer; Richard E Blakesley; Meredith J Lotz; George C Tseng
Journal:  BMC Bioinformatics       Date:  2008-01-10       Impact factor: 3.169

9.  A Review of Imputation Strategies for Isobaric Labeling-Based Shotgun Proteomics.

Authors:  Lisa M Bramer; Jan Irvahn; Paul D Piehowski; Karin D Rodland; Bobbie-Jo M Webb-Robertson
Journal:  J Proteome Res       Date:  2020-09-25       Impact factor: 4.466

10.  Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments.

Authors:  Magalie Celton; Alain Malpertuy; Gaëlle Lelandais; Alexandre G de Brevern
Journal:  BMC Genomics       Date:  2010-01-07       Impact factor: 3.969
