Biological impact of missing-value imputation on downstream analyses of gene expression profiles.

Sunghee Oh, Dongwan D Kang, Guy N Brock, George C Tseng.

Abstract

MOTIVATION: Microarray experiments frequently produce multiple missing values (MVs) due to flaws such as dust, scratches, insufficient resolution or hybridization errors on the chips. Unfortunately, many downstream algorithms require a complete data matrix. The motivation of this work is to determine the impact of MV imputation on downstream analysis, and whether ranking of imputation methods by imputation accuracy correlates well with the biological impact of the imputation.
METHODS: Using eight datasets for differential expression (DE) and classification analysis and eight datasets for gene clustering, we demonstrate the biological impact of missing-value imputation on statistical downstream analyses, including three commonly employed DE methods, four classifiers and three gene-clustering methods. We computed the correlation between the rankings of imputation methods based on three root mean squared error (RMSE) measures and the rankings based on the downstream analysis methods. This correlation was used to investigate which RMSE measure was most consistent with the biological impact measures, and which downstream analysis methods were the most sensitive to the choice of imputation procedure.
RESULTS: DE was the most sensitive to the choice of imputation procedure, while classification was the least sensitive and clustering was intermediate between the two. The logged RMSE (LRMSE) measure had the highest correlation with the imputation rankings based on the DE results, indicating that the LRMSE is the best representative surrogate among the three RMSE-based measures. Bayesian principal component analysis and least squares adaptive appeared to be the best performing methods in the empirical downstream evaluation.
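
The rank-comparison idea described in METHODS can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the three RMSE variants (including LRMSE), so only a plain RMSE over artificially masked entries is shown. The method names (`method_A` etc.), the imputed values and the `downstream_loss` scores are hypothetical toy inputs, and Spearman correlation is assumed here as the rank correlation.

```python
import math

def rmse(true_vals, imputed_vals):
    """Root mean squared error over the artificially masked entries."""
    n = len(true_vals)
    return math.sqrt(sum((t - i) ** 2 for t, i in zip(true_vals, imputed_vals)) / n)

def ranks(scores):
    """Rank scores ascending (1 = smallest); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of 1-based positions pos+1 .. end+1
        for k in order[pos:end + 1]:
            result[k] = avg
        pos = end + 1
    return result

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical toy data: true values of masked entries and three imputations.
true_vals = [1.0, 2.0, 3.0, 4.0]
imputed = {
    "method_A": [1.1, 2.1, 2.9, 4.0],
    "method_B": [1.5, 2.5, 2.5, 4.5],
    "method_C": [0.0, 3.0, 4.0, 3.0],
}
rmse_scores = [rmse(true_vals, v) for v in imputed.values()]

# Hypothetical downstream "biological impact" loss per method (lower is better),
# e.g. disagreement with the DE gene list obtained from the complete data.
downstream_loss = [0.05, 0.20, 0.10]

rho = spearman(rmse_scores, downstream_loss)
print(f"Spearman correlation between RMSE and downstream rankings: {rho:.2f}")
```

A correlation near 1 would suggest that the accuracy measure is a good surrogate for the downstream impact; repeating the calculation for each RMSE variant and each downstream analysis gives the kind of comparison the abstract summarizes.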

Year: 2010 PMID: 21045072 PMCID: PMC3008641 DOI: 10.1093/bioinformatics/btq613

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

References: 39 in total

1. Widespread aneuploidy revealed by DNA microarray expression profiling.

Authors: T R Hughes; C J Roberts; H Dai; A R Jones; M R Meyer; D Slade; J Burchard; S Dow; T R Ward; M J Kidd; S H Friend; M J Marton
Journal: Nat Genet Date: 2000-07 Impact factor: 38.330

2. DNA microarray data imputation and significance analysis of differential expression.

Authors: Rebecka Jörnsten; Hui-Yu Wang; William J Welsh; Ming Ouyang
Journal: Bioinformatics Date: 2005-08-23 Impact factor: 6.937

3. Effects of replacing the unreliable cDNA microarray measurements on the disease classification based on gene expression profiles and functional modules.

Authors: Dong Wang; Yingli Lv; Zheng Guo; Xia Li; Yanhui Li; Jing Zhu; Da Yang; Jianzhen Xu; Chenguang Wang; Shaoqi Rao; Baofeng Yang
Journal: Bioinformatics Date: 2006-06-29 Impact factor: 6.937

4. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.

Authors: U Alon; N Barkai; D A Notterman; K Gish; S Ybarra; D Mack; A J Levine
Journal: Proc Natl Acad Sci U S A Date: 1999-06-08 Impact factor: 11.205

5. Cluster analysis and display of genome-wide expression patterns.

Authors: M B Eisen; P T Spellman; P O Brown; D Botstein
Journal: Proc Natl Acad Sci U S A Date: 1998-12-08 Impact factor: 11.205

6. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.

Authors: P T Spellman; G Sherlock; M Q Zhang; V R Iyer; K Anders; M B Eisen; P O Brown; D Botstein; B Futcher
Journal: Mol Biol Cell Date: 1998-12 Impact factor: 4.138

7. Gene expression profiling identifies clinically relevant subtypes of prostate cancer.

Authors: Jacques Lapointe; Chunde Li; John P Higgins; Matt van de Rijn; Eric Bair; Kelli Montgomery; Michelle Ferrari; Lars Egevad; Walter Rayford; Ulf Bergerheim; Peter Ekman; Angelo M DeMarzo; Robert Tibshirani; David Botstein; Patrick O Brown; James D Brooks; Jonathan R Pollack
Journal: Proc Natl Acad Sci U S A Date: 2004-01-07 Impact factor: 11.205

8. Gene expression correlates of clinical prostate cancer behavior.

Authors: Dinesh Singh; Phillip G Febbo; Kenneth Ross; Donald G Jackson; Judith Manola; Christine Ladd; Pablo Tamayo; Andrew A Renshaw; Anthony V D'Amico; Jerome P Richie; Eric S Lander; Massimo Loda; Philip W Kantoff; Todd R Golub; William R Sellers
Journal: Cancer Cell Date: 2002-03 Impact factor: 31.743

9. Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes.

Authors: Guy N Brock; John R Shaffer; Richard E Blakesley; Meredith J Lotz; George C Tseng
Journal: BMC Bioinformatics Date: 2008-01-10 Impact factor: 3.169

10. Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

Authors: Alexandre G de Brevern; Serge Hazout; Alain Malpertuy
Journal: BMC Bioinformatics Date: 2004-08-23 Impact factor: 3.169

Cited by: 23 in total

1. Selected Reaction Monitoring Mass Spectrometry for Absolute Protein Quantification.

Authors: Nathan P Manes; Jessica M Mann; Aleksandra Nita-Lazar
Journal: J Vis Exp Date: 2015-08-17 Impact factor: 1.355

2. Comparative analysis of codon usage bias in Crenarchaea and Euryarchaea genome reveals differential preference of synonymous codons to encode highly expressed ribosomal and RNA polymerase proteins.

Authors: Vishwa Jyoti Baruah; Siddhartha Sankar Satapathy; Bhesh Raj Powdel; Rocktotpal Konwarh; Alak Kumar Buragohain; Suvendra Kumar Ray
Journal: J Genet Date: 2016-09 Impact factor: 1.166

3. Imputing missing RNA-sequencing data from DNA methylation by using a transfer learning-based neural network.

Authors: Xiang Zhou; Hua Chai; Huiying Zhao; Ching-Hsing Luo; Yuedong Yang
Journal: Gigascience Date: 2020-07-01 Impact factor: 6.524

4. A flexible, interpretable, and accurate approach for imputing the expression of unmeasured genes.

Authors: Christopher A Mancuso; Jacob L Canfield; Deepak Singla; Arjun Krishnan
Journal: Nucleic Acids Res Date: 2020-12-02 Impact factor: 16.971

5. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics. (Review)

Authors: Bobbie-Jo M Webb-Robertson; Holli K Wiberg; Melissa M Matzke; Joseph N Brown; Jing Wang; Jason E McDermott; Richard D Smith; Karin D Rodland; Thomas O Metz; Joel G Pounds; Katrina M Waters
Journal: J Proteome Res Date: 2015-04-22 Impact factor: 4.466