
Biological impact of missing-value imputation on downstream analyses of gene expression profiles.

Sunghee Oh, Dongwan D Kang, Guy N Brock, George C Tseng.

Abstract

MOTIVATION: Microarray experiments frequently produce multiple missing values (MVs) due to flaws such as dust, scratches, insufficient resolution or hybridization errors on the chips. Unfortunately, many downstream algorithms require a complete data matrix. The motivation of this work is to determine the impact of MV imputation on downstream analysis, and whether ranking of imputation methods by imputation accuracy correlates well with the biological impact of the imputation.
METHODS: Using eight datasets for differential expression (DE) and classification analysis and eight datasets for gene clustering, we demonstrate the biological impact of missing-value imputation on statistical downstream analyses, including three commonly employed DE methods, four classifiers and three gene-clustering methods. We ranked the imputation methods by each of three root-mean-squared error (RMSE) measures and by each downstream analysis method, then correlated the two sets of rankings. This correlation was used to investigate which RMSE measure was most consistent with the biological impact measures, and which downstream analysis methods were most sensitive to the choice of imputation procedure.
RESULTS: DE was the most sensitive to the choice of imputation procedure, classification was the least sensitive, and clustering was intermediate between the two. The logged RMSE (LRMSE) measure had the highest correlation with the imputation rankings based on the DE results, indicating that LRMSE is the most representative surrogate among the three RMSE-based measures. Bayesian principal component analysis (BPCA) and least squares adaptive (LSA) imputation appeared to be the best-performing methods in the empirical downstream evaluation.
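
The rank-correlation idea described under METHODS can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not define the three RMSE measures (including LRMSE), so plain RMSE over the artificially removed entries stands in for them, and the downstream "impact" scores are made-up numbers. All function and method names (`rmse_on_masked`, `rank_methods`, `spearman`, `method_a`, and so on) are hypothetical.

```python
import numpy as np

def rmse_on_masked(truth, imputed, mask):
    """Plain RMSE over the entries flagged as missing (assumed error measure)."""
    diff = truth[mask] - imputed[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def rank_methods(scores):
    """Rank methods by score; rank 1 is the smallest score. Assumes no ties."""
    ordered = sorted(scores, key=scores.get)
    return {name: position + 1 for position, name in enumerate(ordered)}

def spearman(rank_a, rank_b):
    """Spearman correlation between two tie-free rankings of the same methods."""
    n = len(rank_a)
    d2 = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Toy complete matrix (genes x samples) and a mask of entries treated as missing.
truth = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
mask = np.array([[False, True, False],
                 [True, False, False]])

# Hypothetical outputs of three imputation methods on the masked matrix.
imputed = {
    "method_a": np.array([[1.0, 2.5, 3.0], [4.5, 5.0, 6.0]]),
    "method_b": np.array([[1.0, 4.0, 3.0], [2.0, 5.0, 6.0]]),
    "method_c": np.array([[1.0, 3.0, 3.0], [5.0, 5.0, 6.0]]),
}
rmse = {name: rmse_on_masked(truth, m, mask) for name, m in imputed.items()}

# Made-up downstream impact scores (lower = less disruption), for illustration only.
impact = {"method_a": 0.10, "method_b": 0.30, "method_c": 0.40}

rmse_rank = rank_methods(rmse)
impact_rank = rank_methods(impact)
rho = spearman(rmse_rank, impact_rank)
print(rmse_rank, impact_rank, rho)
```

A correlation near 1 would mean the accuracy measure orders the imputation methods much as the downstream analysis does; in the study this comparison is repeated for each RMSE measure and each downstream method.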

Year:  2010        PMID: 21045072      PMCID: PMC3008641          DOI: 10.1093/bioinformatics/btq613

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


